Active filters: dpo, trl
mradermacher/gpt-4o-distil-Llama-3.1-8B-Instruct-PaperWitch-heresy-GGUF
8B • Updated
• 1.31k
• 4
HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407
Text Generation
• 12B • Updated
• 463
• 26
MuXodious/gpt-4o-distil-Llama-3.1-8B-Instruct-PaperWitch-heresy
Text Generation
• 8B • Updated
• 47
• 2
anilsanli/Turkish-Llama-DPO-Programming-Adapter
Text Generation
• Updated
• 18
• 1
mradermacher/gpt-4o-distil-Llama-3.1-8B-Instruct-PaperWitch-heresy-i1-GGUF
8B • Updated
• 3.82k
• 1
MuXodious/gpt-4o-distil-Llama-3.3-70B-Instruct-PaperWitch-heresy
Text Generation
• 71B • Updated
• 55
• 1
lewtun/zephyr-7b-dpo-full
Text Generation
• 7B • Updated
• 2
alignment-handbook/zephyr-7b-dpo-full
Text Generation
• 7B • Updated
• 37
• 3
alignment-handbook/zephyr-7b-dpo-qlora
Updated
• 9
• 9
amirali1985/gpt-neo-125m_hh_reward
Text Generation
• 0.1B • Updated
• 7
lewtun/zephyr-7b-dpo-qlora
Updated
sambar/zephyr-7b-ipo-lora
Text Generation
• Updated
• 1
nikkoyabut/merged_model_dpo
Updated
sambar/zephyr-7b-ipo-lora-5ep
Text Generation
• Updated
• 1
alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2-dpo
Text Generation
• 1B • Updated
• 4
• 2
Yaxin1992/mixtral-dpo-1000
adhi29/openhermes-mistral-dpo-gptq
Updated
Text Generation
• 1.03M • Updated
• 1
ybelkada/test-tags-model-2
Text Generation
• 1.03M • Updated
• 4
justinj92/dpoplatypus-phi2
Text Generation
• 3B • Updated
lewtun/zephyr-7b-dpo-qlora-8e0975a
Updated
akashkumarbtc/openhermes-mistral-dpo-gptq
Updated
darshan8950/openhermes-mistral-dpo-gptq
Updated
ondevicellm/zephyr-7b-dpo-full
Text Generation
• 7B • Updated
• 2