A finetune of Qwen3.5-Antirep-27B to reduce infinite looping and repetition in multi-turn conversations.
Trained using int8 LoRA with DPO on 1x A100.
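
For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the training code used for this model. The function names, the `beta=0.1` default, and the example log-probabilities are all assumptions. The inputs are summed sequence log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under the frozen reference model.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the card does not state the one used
) -> float:
    """DPO loss for a single preference pair."""
    # Implicit rewards: how far the policy has moved from the reference
    # on each response.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy favors the chosen response more
    # strongly than the reference does.
    margin = beta * (chosen_reward - rejected_reward)
    return -log_sigmoid(margin)


# Hypothetical pair: the chosen reply is a normal answer, the rejected one loops.
loss_before = dpo_loss(-10.0, -10.0, -10.0, -10.0)  # policy equals reference
loss_after = dpo_loss(-8.0, -12.0, -10.0, -10.0)    # policy now prefers chosen
print(loss_before, loss_after)
```

When the policy matches the reference, the margin is zero and the loss equals log 2. Raising the likelihood of the non-repetitive response relative to the repetitive one lowers it.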