MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published Nov 24, 2025 • 51
Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles Paper • 2309.10228 • Published Sep 19, 2023
On-Board Vision-Language Models for Personalized Autonomous Vehicle Motion Control: System Design and Real-World Validation Paper • 2411.11913 • Published Nov 17, 2024
Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts Paper • 2509.04500 • Published Sep 2, 2025 • 4
See and Think: Embodied Agent in Virtual Environment Paper • 2311.15209 • Published Nov 26, 2023 • 3
Adaptive Graph Pruning for Multi-Agent Communication Paper • 2506.02951 • Published Jun 3, 2025 • 2
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs Paper • 2506.21656 • Published Jun 26, 2025 • 16
SocialGesture: Delving into Multi-person Gesture Understanding Paper • 2504.02244 • Published Apr 3, 2025
Towards Adversarially Robust Dataset Distillation by Curvature Regularization Paper • 2403.10045 • Published Mar 15, 2024 • 1
Practical Region-level Attack against Segment Anything Models Paper • 2404.08255 • Published Apr 12, 2024
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents Paper • 2505.23559 • Published May 29, 2025 • 11
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents Paper • 2401.00812 • Published Jan 1, 2024 • 11
What is the Visual Cognition Gap between Humans and Multimodal LLMs? Paper • 2406.10424 • Published Jun 14, 2024