Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens • Paper • 2511.19418 • Published 16 days ago
UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity • Paper • 2511.13714 • Published 23 days ago
Reconstruction Alignment Improves Unified Multimodal Models • Paper • 2509.07295 • Published Sep 8