-
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Paper • 2511.13254 • Published • 134 -
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
Paper • 2511.14582 • Published • 17 -
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Paper • 2511.15248 • Published • 6
Collections
Discover the best community collections!
Collections including paper arxiv:2511.13254
-
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Paper • 2507.04404 • Published • 21 -
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Paper • 2504.11651 • Published • 31 -
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
Paper • 2505.12781 • Published • 2 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 259
-
The Curse of Depth in Large Language Models
Paper • 2502.05795 • Published • 40 -
Transformers without Normalization
Paper • 2503.10622 • Published • 171 -
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83 -
Learning to Skip the Middle Layers of Transformers
Paper • 2506.21103 • Published • 18
-
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
Paper • 2412.04144 • Published • 6 -
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
Paper • 2410.08371 • Published • 3 -
MERGE^3: Efficient Evolutionary Merging on Consumer-grade GPUs
Paper • 2502.10436 • Published • 1 -
Mergenetic: a Simple Evolutionary Model Merging Library
Paper • 2505.11427 • Published • 14
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 31 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 61 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 16
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 7.7k • 1.22k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 140 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 62 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Paper • 2511.13254 • Published • 134 -
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
Paper • 2511.14582 • Published • 17 -
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Paper • 2511.15248 • Published • 6
-
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs
Paper • 2412.04144 • Published • 6 -
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
Paper • 2410.08371 • Published • 3 -
MERGE^3: Efficient Evolutionary Merging on Consumer-grade GPUs
Paper • 2502.10436 • Published • 1 -
Mergenetic: a Simple Evolutionary Model Merging Library
Paper • 2505.11427 • Published • 14
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 31 -
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 61 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 44 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 16
-
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Paper • 2507.04404 • Published • 21 -
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Paper • 2504.11651 • Published • 31 -
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone
Paper • 2505.12781 • Published • 2 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 259
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 7.7k • 1.22k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 140 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
The Curse of Depth in Large Language Models
Paper • 2502.05795 • Published • 40 -
Transformers without Normalization
Paper • 2503.10622 • Published • 171 -
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83 -
Learning to Skip the Middle Layers of Transformers
Paper • 2506.21103 • Published • 18
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 62 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53