Whisper Large V3 Turbo - MLX Q8 Quantized

8-bit quantized version of OpenAI's Whisper Large V3 Turbo, optimized for Apple Silicon with MLX-compatible weights.

Model Description

This is a quantized conversion of openai/whisper-large-v3-turbo with the following characteristics:

Property        Value
--------------  ----------------------------------
Parameters      ~809M
Quantization    INT8 (Q8)
Decoder layers  4 (vs. 32 in whisper-large-v3)
Size            ~900 MB
Speed           ~10x faster than whisper-large-v3
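As a sanity check, the ~900 MB size is consistent with the parameter count at 8 bits per weight plus group-wise scale/bias overhead. The group size and fp16 scale precision below are assumptions based on MLX defaults, not confirmed conversion settings:

```python
# Back-of-the-envelope size estimate for Q8 weights with group-wise scales.
# Assumes MLX's default group size of 64 and fp16 scale + bias per group;
# the actual conversion settings may differ slightly.

params = 809e6               # ~809M parameters
weight_bytes = params * 1    # 8 bits per weight

group_size = 64
groups = params / group_size
overhead_bytes = groups * 2 * 2   # fp16 scale + fp16 bias per group

total_mb = (weight_bytes + overhead_bytes) / 1e6
print(f"~{total_mb:.0f} MB")      # ~860 MB of weights, ~900 MB on disk with metadata
```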

Intended Use

  • Real-time speech-to-text transcription
  • Voice dictation applications
  • Audio transcription pipelines
  • Multilingual speech recognition

Hardware Requirements

  • Apple Silicon Mac (M1/M2/M3/M4)
  • Metal GPU acceleration
  • Minimum 8GB RAM recommended

Files

whisper-large-v3-turbo-mlx-q8/
├── config.json           # Model configuration
├── weights.safetensors   # Q8 quantized weights (~900MB)
├── tokenizer.json        # Whisper tokenizer vocabulary
└── mel_filters.npz       # Mel filterbank coefficients

Usage with CodeScribe

# Download model
huggingface-cli download LibraxisAI/whisper-large-v3-turbo-mlx-q8 \
    --local-dir ~/.codescribe/models/whisper-large-v3-turbo-mlx-q8

# Use with CodeScribe CLI
codescribe transcribe audio.wav --model whisper-large-v3-turbo-mlx-q8

Usage with Python (mlx-whisper)

import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="LibraxisAI/whisper-large-v3-turbo-mlx-q8"
)
print(result["text"])

Supported Languages

Inherits full multilingual support from Whisper Large V3:

  • English, Polish, German, French, Spanish, Italian, Portuguese
  • Dutch, Russian, Chinese, Japanese, Korean, Arabic, Hindi
  • And 85+ additional languages
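When the source language is known, decoding can be pinned to it rather than relying on auto-detection. The snippet below maps the languages listed above to the ISO 639-1 codes Whisper uses internally; passing one of these as a `language` decode option (e.g. `mlx_whisper.transcribe("audio.wav", language="pl", ...)`) follows the upstream Whisper API and is a sketch that may need adjusting for your mlx-whisper version:

```python
# ISO 639-1 codes Whisper uses for the languages listed above.
WHISPER_LANG_CODES = {
    "english": "en", "polish": "pl", "german": "de", "french": "fr",
    "spanish": "es", "italian": "it", "portuguese": "pt", "dutch": "nl",
    "russian": "ru", "chinese": "zh", "japanese": "ja", "korean": "ko",
    "arabic": "ar", "hindi": "hi",
}

def lang_code(name: str) -> str:
    """Look up a Whisper language code by English language name."""
    return WHISPER_LANG_CODES[name.strip().lower()]

print(lang_code("Polish"))  # pl
```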

Quantization Details

The model weights are stored in INT8 format with group-wise scale and bias factors, MLX's native quantization layout. The quantization was performed using MLX's built-in quantization tools, with negligible accuracy loss while roughly halving the on-disk size relative to the FP16 release (about 4x smaller than FP32).
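The scheme can be illustrated in miniature. This is a plain-Python sketch of group-wise affine INT8 quantization, the general technique described above; it is not MLX's exact kernel, and the tiny group here is purely illustrative:

```python
# Group-wise affine INT8 quantization: each group of weights gets its own
# scale and bias so that values map into the unsigned 8-bit range [0, 255].

def quantize_group(ws):
    lo, hi = min(ws), max(ws)
    scale = (hi - lo) / 255 or 1.0   # avoid div-by-zero for constant groups
    q = [round((w - lo) / scale) for w in ws]
    return q, scale, lo              # 8-bit ints, plus fp scale/bias per group

def dequantize_group(q, scale, bias):
    return [v * scale + bias for v in q]

weights = [0.12, -0.07, 0.31, 0.05, -0.22, 0.18, 0.02, -0.11]
q, scale, bias = quantize_group(weights)
restored = dequantize_group(q, scale, bias)

# Reconstruction error is bounded by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Real group-wise quantizers apply this per contiguous group of 32-128 weights within each tensor, which is why the overhead of the scale/bias metadata stays small.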

License

This model inherits the MIT license from the original OpenAI Whisper model.

Citation

@misc{whisper-large-v3-turbo-mlx-q8,
  author = {LibraxisAI},
  title = {Whisper Large V3 Turbo - MLX Q8 Quantized},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/LibraxisAI/whisper-large-v3-turbo-mlx-q8}}
}

Acknowledgments

Thanks to OpenAI for the original whisper-large-v3-turbo model and to the MLX team for the quantization and inference tooling.

Created by LibraxisAI | Part of the CodeScribe project
