Introducing HELMET: Holistically Evaluating Long-context Language Models (Apr 16)
Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques (Mar 24)
Universal Assisted Generation: Faster Decoding with Any Assistant Model (Oct 29, 2024)
Blazing Fast SetFit Inference with 🤗 Optimum Intel on Xeon (Apr 3, 2024)
CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG (Mar 15, 2024)
Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding (Jan 30, 2024)
SetFitABSA: Few-Shot Aspect Based Sentiment Analysis using SetFit (Dec 6, 2023)