RedHatAI
/

Llama-3.3-70B-Instruct-speculator.eagle3


ekurtic committed on Sep 18

Commit

8d36c47

·

verified ·

1 Parent(s): d9f8d09

Create README.md

Files changed (1)

README.md +46 -0

README.md ADDED

	@@ -0,0 +1,46 @@

+---
+language:
+- en
+- de
+- fr
+- it
+- pt
+- hi
+- es
+- th
+license: llama3.3
+pipeline_tag: text-generation
+tags:
+- facebook
+- meta
+- pytorch
+- llama
+- llama-3
+- neuralmagic
+- redhat
+- speculators
+- eagle3
+---
+# Llama-3.3-70B-Instruct-speculator.eagle3
+## Model Overview
+- **Verifier:** meta-llama/Llama-3.3-70B-Instruct
+- **Speculative Decoding Algorithm:** EAGLE-3
+- **Model Architecture:** Eagle3Speculator
+- **Release Date:** 09/15/2025
+- **Version:** 1.0
+- **Model Developers:** Red Hat
+
+This is a speculator model designed for use with [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct), based on the [EAGLE-3](https://arxiv.org/abs/2503.01840) speculative decoding algorithm.
+It was trained using the [speculators](https://github.com/vllm-project/speculators) library on a combination of the [Aeala/ShareGPT_Vicuna_unfiltered](https://huggingface.co/datasets/Aeala/ShareGPT_Vicuna_unfiltered) dataset and the `train_sft` split of the [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.
+## Evaluations
+Subset of GSM8K (math reasoning):
+* acceptance_rate = [0.801, 0.637, 0.464]
+* conditional_acceptance_rate = [0.801, 0.795, 0.729]
+
+Subset of MT-Bench:
+* acceptance_rate = [0.733, 0.537, 0.384]
+* conditional_acceptance_rate = [0.733, 0.733, 0.715]
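
The two lists reported for each benchmark appear to be related: dividing each `acceptance_rate` entry by the one before it reproduces `conditional_acceptance_rate` to within rounding. The Python sketch below checks this under an assumption the README does not state, namely that entry `k` of `acceptance_rate` is the fraction of decoding steps in which the first `k + 1` draft tokens were all accepted. The function names are illustrative and are not part of the speculators library. Under the same assumption, the sum of the list plus one (for the token the verifier always emits) gives a rough estimate of tokens generated per verifier step.

```python
from typing import Dict, List, Tuple


def conditional_from_cumulative(acceptance_rate: List[float]) -> List[float]:
    """Per-position acceptance rate, given that all earlier positions were accepted.

    Assumes acceptance_rate[k] is the cumulative rate for draft position k
    (an interpretation of the README's numbers, not a documented definition).
    """
    conditional = []
    previous = 1.0  # position 0 has no earlier draft token to condition on
    for rate in acceptance_rate:
        conditional.append(rate / previous if previous > 0 else 0.0)
        previous = rate
    return conditional


def expected_tokens_per_step(acceptance_rate: List[float]) -> float:
    """Expected accepted draft tokens plus the one token the verifier always emits."""
    return 1.0 + sum(acceptance_rate)


# Values copied from the Evaluations section: (acceptance_rate, conditional_acceptance_rate)
REPORTED: Dict[str, Tuple[List[float], List[float]]] = {
    "GSM8K subset": ([0.801, 0.637, 0.464], [0.801, 0.795, 0.729]),
    "MT-Bench subset": ([0.733, 0.537, 0.384], [0.733, 0.733, 0.715]),
}

if __name__ == "__main__":
    for name, (cumulative, reported_conditional) in REPORTED.items():
        derived = conditional_from_cumulative(cumulative)
        print(name)
        print("  derived conditional :", [round(x, 3) for x in derived])
        print("  reported conditional:", reported_conditional)
        print("  est. tokens per step:", round(expected_tokens_per_step(cumulative), 3))
```

The derived values can differ from the reported ones in the last digit because the published rates are already rounded to three decimals.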