SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications Paper • 2303.15446 • Published Mar 27, 2023
GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model Paper • 2407.13772 • Published Jul 18, 2024
VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos Paper • 2506.05349 • Published Jun 5, 2025