Dongchan Shin

ShinDC

AI & ML interests

NLP

Organizations

McGill NLP Group

upvoted 4 papers 8 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11 • 28

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 50

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Paper • 2411.07763 • Published Nov 12, 2024 • 2

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86

upvoted a paper 9 months ago

SafeArena: Evaluating the Safety of Autonomous Web Agents

Paper • 2503.04957 • Published Mar 6 • 21

upvoted a paper about 2 years ago

OpenAgents: An Open Platform for Language Agents in the Wild

Paper • 2310.10634 • Published Oct 16, 2023 • 9