ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Paper • 2512.05111 • Published • 37
Think Visually, Reason Textually: Vision-Language Synergy in ARC