MixEval

community

https://mixeval.github.io/

NiJinjie

Psycoy

AI & ML interests

LLM & LMM evaluation

Recent Activity

jinjieni authored a paper about 1 month ago

Diffusion Language Models are Super Data Learners

jinjieni authored a paper about 1 month ago

Training Optimal Large Diffusion Language Models

jinjieni authored a paper about 1 month ago

Logical Reasoning over Natural Language as Knowledge Representation: A Survey

jinjieni

authored 7 papers about 1 month ago

luodian

authored 4 papers 2 months ago

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Paper • 2411.15296 • Published Nov 22, 2024 • 21

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published Jan 23 • 24

LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training

Paper • 2509.23661 • Published Sep 28 • 46

Visual Jigsaw Post-Training Improves MLLMs

Paper • 2509.25190 • Published Sep 29 • 36

yuexiang96

authored 9 papers 5 months ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 40

Evaluating Vision-Language Models as Evaluators in Path Planning

Paper • 2411.18711 • Published Nov 27, 2024

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published Mar 13 • 24

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators

Paper • 2503.19877 • Published Mar 25 • 1

VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge

Paper • 2504.10342 • Published Apr 14 • 10

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time

Paper • 2504.12329 • Published Apr 12

Overtrained Language Models Are Harder to Fine-Tune

Paper • 2503.19206 • Published Mar 24 • 2

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

Paper • 2505.10185 • Published May 15 • 26

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation

Paper • 2506.03930 • Published Jun 4 • 26

Team members 7
