MMS Adapter Fine-tuned for Ruuli

This model is a fine-tuned version of facebook/mms-1b-all on the Mozilla Common Voice Spontaneous Speech dataset for Ruuli (ruc).

Training

  • Base model: facebook/mms-1b-all
  • Fine-tuning method: Adapter layers
  • Dataset: Mozilla Common Voice Spontaneous Speech

Usage

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import torch

processor = Wav2Vec2Processor.from_pretrained("vitthalbhandari/mms-1b-all-aft-mid-ruc")
model = Wav2Vec2ForCTC.from_pretrained("vitthalbhandari/mms-1b-all-aft-mid-ruc")

# Load adapter
model.load_adapter("ruc")

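# Transcribe audio
# audio_array is not defined above: load it beforehand as a 1-D float
# waveform (mono) sampled at 16 kHz, matching sampling_rate below.
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Greedy CTC decoding: pick the most likely token for each frame
predicted_ids = torch.argmax(logits, dim=-1)
# batch_decode returns a list with one string per input
transcription = processor.batch_decode(predicted_ids)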
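
The last two lines of the snippet above perform greedy CTC decoding: the per-frame argmax yields one token id per audio frame, and decoding then merges consecutive repeats and drops the blank token. The following is a minimal sketch of that collapsing step in plain Python, not the library's implementation. The toy vocabulary, the blank id of 0, and the helper name ctc_greedy_collapse are assumptions made for illustration; the real ids come from this model's tokenizer.

```python
from itertools import groupby


def ctc_greedy_collapse(ids, blank_id=0):
    """Merge consecutive repeated ids, then drop the blank id."""
    return [token for token, _ in groupby(ids) if token != blank_id]


# Hypothetical toy vocabulary: id 0 acts as the CTC blank and "|" marks
# a word boundary (assumed here, not read from the actual tokenizer).
vocab = {0: "<pad>", 1: "a", 2: "b", 3: "|"}

# Frame-level argmax output, one id per frame.
frame_ids = [0, 1, 1, 0, 1, 2, 2, 0, 3, 3, 1]

collapsed = ctc_greedy_collapse(frame_ids)
text = "".join(vocab[i] for i in collapsed).replace("|", " ")
print(collapsed)  # [1, 1, 2, 3, 1]
print(text)       # aab a
```

Note that the blank between the two "a" frames is what lets a doubled letter survive the merge step; without it, the two would collapse into one.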