vanshnawander/whisper-small-telugu

This is a fine-tuned version of openai/whisper-small for Telugu automatic speech recognition (ASR).

Model Description

  • Base Model: openai/whisper-small
  • Language: Telugu (te)
  • Task: Automatic Speech Recognition (transcribe)
  • Training Data: ai4bharat/Kathbath
  • Fine-tuning Framework: Hugging Face Transformers with a custom NVIDIA DALI data-loading pipeline

Training Details

The model was fine-tuned on the Kathbath Telugu dataset with the following configuration:

  • Epochs: 3
  • Batch Size: 16 (effective ~96 with gradient accumulation)
  • Learning Rate: 1e-5
  • Mixed Precision: FP16
  • Gradient Checkpointing: Enabled
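
As a rough illustration of how the per-step batch size of 16 relates to the effective batch size of about 96, the sketch below collects the listed hyperparameters in a small dataclass. The class name FineTuneConfig is hypothetical, and the values gradient_accumulation_steps=6 and num_devices=1 are assumptions chosen only because 16 × 6 = 96; the card does not state the actual accumulation steps or device count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters listed above."""

    num_epochs: int = 3
    per_step_batch_size: int = 16
    learning_rate: float = 1e-5
    fp16: bool = True
    gradient_checkpointing: bool = True
    # Assumptions, not stated in the card: picked so that 16 * 6 * 1 = 96.
    gradient_accumulation_steps: int = 6
    num_devices: int = 1

    @property
    def effective_batch_size(self) -> int:
        # Samples that contribute to each optimizer update.
        return (
            self.per_step_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


config = FineTuneConfig()
print(config.effective_batch_size)  # 96
```

Other combinations (for example 2 devices with 3 accumulation steps) give the same effective batch size, so the split shown here should not be read as the author's actual setup.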

Evaluation Results

Evaluated on the Shrutilipi benchmark, a large-scale ASR dataset for Indian languages.

Model                          WER (%)   CER (%)   Improvement
Base (openai/whisper-small)    N/A       N/A       -
This Model                     N/A       N/A       N/A
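
For readers who want to fill in these metrics themselves, here is a minimal pure-Python sketch of how WER and CER are commonly computed from an edit distance. It is not the evaluation script used for this model. The function names are illustrative, the relative form of "Improvement" is an assumption (the card does not say whether the column is absolute or relative), and no text normalization is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over Unicode code points.

    Telugu syllables often span several code points, so this differs from a
    grapheme-cluster-based CER.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def relative_improvement(base_error: float, tuned_error: float) -> float:
    """Assumed definition: fraction of the base error that was removed."""
    return (base_error - tuned_error) / base_error


# Toy sentences only; these are not outputs of this model.
reference = "నేను బడికి వెళ్తున్నాను"
hypothesis = "నేను బడి వెళ్తున్నాను"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

In practice an established package such as jiwer, together with consistent text normalization for both models, is likely a safer choice than hand-rolled metrics.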

Usage

Basic Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa

# Load model and processor
processor = WhisperProcessor.from_pretrained("vanshnawander/whisper-small-telugu")
model = WhisperForConditionalGeneration.from_pretrained("vanshnawander/whisper-small-telugu")

# Load audio
audio, sr = librosa.load("audio.wav", sr=16000)

# Transcribe
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
generated_ids = model.generate(input_features, language="te", task="transcribe")
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(transcription)

Using Pipeline

from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="vanshnawander/whisper-small-telugu",
    chunk_length_s=30,
)

result = pipe("audio.wav", generate_kwargs={"language": "te", "task": "transcribe"})
print(result["text"])
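
The chunk_length_s=30 argument makes the pipeline split long recordings into 30-second windows before decoding. The sketch below is a simplified illustration of overlapping windows, not the library's actual implementation. The helper name chunk_windows is hypothetical, and the 5-second overlap on each side is an assumption (one sixth of the chunk length).

```python
def chunk_windows(duration_s, chunk_length_s=30.0, stride_s=5.0):
    """Return (start, end) times in seconds of overlapping windows.

    Consecutive windows advance by chunk_length_s - 2 * stride_s, so
    neighbouring windows share audio that can be used to stitch transcripts.
    """
    step = chunk_length_s - 2 * stride_s
    if step <= 0:
        raise ValueError("stride_s must be less than half of chunk_length_s")

    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, duration_s)
        windows.append((start, end))
        if start + chunk_length_s >= duration_s:
            break  # this window reaches the end of the audio
        start += step
    return windows


# A 70-second recording is covered by three overlapping windows.
for start, end in chunk_windows(70.0):
    print(f"{start:5.1f}s -> {end:5.1f}s")
```

Recordings shorter than 30 seconds fit in a single window, so chunking only matters for longer files.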

Limitations

  • Optimized for Telugu speech; may not perform well on other languages
  • Best performance on clear audio with minimal background noise
  • May struggle with very fast speech or heavy code-mixing

Citation

If you use this model, please cite:

@misc{vanshnawander_whisper_small_telugu,
  author = {AI4Bharat},
  title = {vanshnawander/whisper-small-telugu},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/vanshnawander/whisper-small-telugu}
}
