arturmatos/gemma-1b-new_dataset-policy-checkpoint-200-grpo-post-dpo • Text Generation • 1.0B • Updated 30 days ago • 6
arturmatos/gemma-1b-new_dataset-policy-checkpoint-200-dpo-if-tulu-only • Text Generation • 1.0B • Updated 30 days ago • 7
arturmatos/gemma-1b-new_dataset-policy-checkpoint-234-dpo-if-tulu-only • Text Generation • 1.0B • Updated 30 days ago • 10
arturmatos/gemma-1b-new_dataset-policy-checkpoint-60-dpo-if-tulu-only • Text Generation • 1.0B • Updated 30 days ago • 7
arturmatos/gemma-1b-new_dataset-policy-checkpoint-260-dpo-if • Text Generation • 1.0B • Updated about 1 month ago • 8
arturmatos/gemma-1b-new_dataset-policy-checkpoint-320-dpo-if • Text Generation • 1.0B • Updated about 1 month ago • 7
arturmatos/gemma-1b-new_dataset-policy-checkpoint-420-dpo-if • Text Generation • 1.0B • Updated about 1 month ago • 22
arturmatos/gemma-1b-new_dataset-policy-checkpoint-280-dpo-if • Text Generation • 1.0B • Updated about 1 month ago • 8
arturmatos/gemma-1b-new_dataset-policy-checkpoint-120-dpo-if • Text Generation • 1.0B • Updated about 1 month ago • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-250-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 3
arturmatos/gemma-1b-new_dataset-policy-checkpoint-450-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-400-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-100-grpo-dpo • Text Generation • 1.0B • Updated about 1 month ago • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-700-grpo-chatfix-h1002 • Text Generation • 1.0B • Updated about 1 month ago • 7
arturmatos/gemma-1b-new_dataset-policy-checkpoint-500-grpo-chatfix-h1002 • Text Generation • 1.0B • Updated about 1 month ago • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-150-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-650-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 5
arturmatos/gemma-1b-new_dataset-policy-checkpoint-350-grpo-chatfix • Text Generation • 1.0B • Updated about 1 month ago • 4
arturmatos/gemma-1b-new_dataset-policy-checkpoint-363-dpo • Text Generation • 1.0B • Updated about 1 month ago • 9
arturmatos/gemma-1b-new_dataset-policy-checkpoint-1200-my-rm • Text Generation • 1.0B • Updated Nov 6 • 5