JayHyeon/Qwen_1.5B-math-cDPO_5e-7_0.1lsmooth-1.0vpo_constant-1ep Text Generation • 2B • Updated Aug 5 • 4
JayHyeon/Qwen_1.5B-math-cDPO_5e-7_0.3lsmooth-1.0vpo_constant-1ep Text Generation • 2B • Updated Aug 5 • 4
JayHyeon/Qwen_1.5B-math-rDPO_5e-7_0.1lsmooth-1.0vpo_constant-1ep Text Generation • 2B • Updated Aug 5 • 9
JayHyeon/Qwen_1.5B-math-rDPO_5e-7_0.3lsmooth-1.0vpo_constant-1ep Text Generation • 2B • Updated Aug 5 • 3