Whisper Large V3 Turbo - MLX Q8 Quantized

8-bit quantized version of OpenAI's Whisper Large V3 Turbo, optimized for Apple Silicon with MLX-compatible weights.

Model Description

This is a quantized conversion of openai/whisper-large-v3-turbo with the following characteristics:

Property        Value
--------------  ----------------------------------
Parameters      ~809M
Quantization    INT8 (Q8)
Decoder layers  4 (vs. 32 in whisper-large-v3)
Size            ~900 MB
Speed           ~10x faster than whisper-large-v3
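As a sanity check, the ~900 MB size is consistent with the parameter count at 8 bits per weight plus group-wise scale/bias overhead. The group size and fp16 scale precision below are assumptions based on MLX defaults, not confirmed conversion settings:

```python
# Back-of-the-envelope size estimate for Q8 weights with group-wise scales.
# Assumes MLX's default group size of 64 and fp16 scale + bias per group;
# the actual conversion settings may differ slightly.

params = 809e6               # ~809M parameters
weight_bytes = params * 1    # 8 bits per weight

group_size = 64
groups = params / group_size
overhead_bytes = groups * 2 * 2   # fp16 scale + fp16 bias per group

total_mb = (weight_bytes + overhead_bytes) / 1e6
print(f"~{total_mb:.0f} MB")      # ~860 MB of weights, ~900 MB on disk with metadata
```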

Intended Use

  • Real-time speech-to-text transcription
  • Voice dictation applications
  • Audio transcription pipelines
  • Multilingual speech recognition

Hardware Requirements

  • Apple Silicon Mac (M1/M2/M3/M4)
  • Metal GPU acceleration
  • Minimum 8GB RAM recommended

Files

whisper-large-v3-turbo-mlx-q8/
├── config.json           # Model configuration
├── weights.safetensors   # Q8 quantized weights (~900MB)
├── tokenizer.json        # Whisper tokenizer vocabulary
└── mel_filters.npz       # Mel filterbank coefficients

Usage with CodeScribe

# Download model
huggingface-cli download LibraxisAI/whisper-large-v3-turbo-mlx-q8 \
    --local-dir ~/.codescribe/models/whisper-large-v3-turbo-mlx-q8

# Use with CodeScribe CLI
codescribe transcribe audio.wav --model whisper-large-v3-turbo-mlx-q8

Usage with Python (mlx-whisper)

import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="LibraxisAI/whisper-large-v3-turbo-mlx-q8"
)
print(result["text"])

Supported Languages

Inherits full multilingual support from Whisper Large V3:

  • English, Polish, German, French, Spanish, Italian, Portuguese
  • Dutch, Russian, Chinese, Japanese, Korean, Arabic, Hindi
  • And 85+ additional languages
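When the source language is known, decoding can be pinned to it rather than relying on auto-detection. The snippet below maps the languages listed above to the ISO 639-1 codes Whisper uses internally; passing one of these as a `language` decode option (e.g. `mlx_whisper.transcribe("audio.wav", language="pl", ...)`) follows the upstream Whisper API and is a sketch that may need adjusting for your mlx-whisper version:

```python
# ISO 639-1 codes Whisper uses for the languages listed above.
WHISPER_LANG_CODES = {
    "english": "en", "polish": "pl", "german": "de", "french": "fr",
    "spanish": "es", "italian": "it", "portuguese": "pt", "dutch": "nl",
    "russian": "ru", "chinese": "zh", "japanese": "ja", "korean": "ko",
    "arabic": "ar", "hindi": "hi",
}

def lang_code(name: str) -> str:
    """Look up a Whisper language code by English language name."""
    return WHISPER_LANG_CODES[name.strip().lower()]

print(lang_code("Polish"))  # pl
```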

Quantization Details

The model weights are stored in INT8 format with group-wise scale and bias factors, MLX's native quantization layout. The quantization was performed using MLX's built-in quantization tools, with negligible accuracy loss while roughly halving the on-disk size relative to the FP16 release (about 4x smaller than FP32).
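The scheme can be illustrated in miniature. This is a plain-Python sketch of group-wise affine INT8 quantization, the general technique described above; it is not MLX's exact kernel, and the tiny group here is purely illustrative:

```python
# Group-wise affine INT8 quantization: each group of weights gets its own
# scale and bias so that values map into the unsigned 8-bit range [0, 255].

def quantize_group(ws):
    lo, hi = min(ws), max(ws)
    scale = (hi - lo) / 255 or 1.0   # avoid div-by-zero for constant groups
    q = [round((w - lo) / scale) for w in ws]
    return q, scale, lo              # 8-bit ints, plus fp scale/bias per group

def dequantize_group(q, scale, bias):
    return [v * scale + bias for v in q]

weights = [0.12, -0.07, 0.31, 0.05, -0.22, 0.18, 0.02, -0.11]
q, scale, bias = quantize_group(weights)
restored = dequantize_group(q, scale, bias)

# Reconstruction error is bounded by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Real group-wise quantizers apply this per contiguous group of 32-128 weights within each tensor, which is why the overhead of the scale/bias metadata stays small.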

License

This model inherits the MIT license from the original OpenAI Whisper model.

Citation

@misc{whisper-large-v3-turbo-mlx-q8,
  author = {LibraxisAI},
  title = {Whisper Large V3 Turbo - MLX Q8 Quantized},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LibraxisAI/whisper-large-v3-turbo-mlx-q8}}
}

Acknowledgments

Thanks to OpenAI for the original whisper-large-v3-turbo model and to the MLX team for the quantization and inference tooling.

Created by LibraxisAI | Part of the CodeScribe project
