Doppler-Enhanced Deep Learning: Improving Thyroid Nodule Segmentation with YOLOv5 Instance Segmentation Paper • 2512.00639 • Published 10 days ago
Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs Paper • 2511.17220 • Published 19 days ago • 16
TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval Paper • 2511.16528 • Published 19 days ago • 16
Mask-to-Height: A YOLOv11-Based Architecture for Joint Building Instance Segmentation and Height Classification from Satellite Imagery Paper • 2510.27224 • Published Oct 31 • 2
Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications Paper • 2509.17671 • Published Sep 22 • 9
EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI Paper • 2509.11648 • Published Sep 15 • 1
D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning Paper • 2509.06771 • Published Sep 8 • 5
Guided Decoding and Its Critical Role in Retrieval-Augmented Generation Paper • 2509.06631 • Published Sep 8 • 10
Query Attribute Modeling: Improving search relevance with Semantic Search and Meta Data Filtering Paper • 2508.04683 • Published Aug 6
DSBC : Data Science task Benchmarking with Context engineering Paper • 2507.23336 • Published Jul 31 • 2
NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models Paper • 2506.07731 • Published Jun 9 • 2
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30 • 66
Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees Paper • 2506.14606 • Published Jun 17 • 11
A Technical Study into Small Reasoning Language Models Paper • 2506.13404 • Published Jun 16 • 8
CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark Paper • 2505.16968 • Published May 22 • 40
SAEs $\textit{Can}$ Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs Paper • 2504.08192 • Published Apr 11 • 3
Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs Paper • 2505.20254 • Published May 26 • 5
Uncovering Cultural Representation Disparities in Vision-Language Models Paper • 2505.14729 • Published May 20 • 1