Assessing Demographic Bias in Named Entity Recognition Paper • 2008.03415 • Published Aug 8, 2020
ChessArena: A Chess Testbed for Evaluating Strategic Reasoning Capabilities of Large Language Models Paper • 2509.24239 • Published Sep 29, 2025