arxiv:2605.09063
Seungyeop Yi
devpotatopotato
AI & ML interests
None yet
Recent Activity
authored a paper 3 days ago
CostNav: A Navigation Benchmark for Real-World Economic-Cost Evaluation of Physical AI Agents
authored a paper 3 days ago
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs