Adaptive Multi-Agent Response Refinement in Conversational Systems Paper • 2511.08319 • Published 28 days ago • 40
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10 • 49
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
SEA: Sparse Linear Attention with Estimated Attention Mask Paper • 2310.01777 • Published Oct 3, 2023 • 1
Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation Paper • 2208.12401 • Published Aug 26, 2022 • 1
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction Paper • 2505.11254 • Published May 16 • 48
FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA Paper • 2505.12805 • Published May 19 • 22
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published Feb 13 • 148