vanshnawander/whisper-small-telugu
This is a fine-tuned version of openai/whisper-small for Telugu automatic speech recognition (ASR).
Model Description
- Base Model: openai/whisper-small
- Language: Telugu (te)
- Task: Automatic Speech Recognition (transcribe)
- Training Data: ai4bharat/Kathbath
- Fine-tuning Framework: Transformers + Custom DALI Pipeline
Training Details
The model was fine-tuned on the Kathbath Telugu dataset with the following configuration:
- Epochs: 3
- Batch Size: 16 (effective ~96 with gradient accumulation)
- Learning Rate: 1e-5
- Mixed Precision: FP16
- Gradient Checkpointing: Enabled
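The relationship between the per-step batch size and the effective batch size can be sketched as below. This is only an illustration: the model card does not state the number of gradient accumulation steps or devices, so `gradient_accumulation_steps=6` and `num_devices=1` are assumptions chosen because 16 × 6 = 96. The `FineTuneConfig` class is hypothetical and is not part of the author's custom DALI pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters from the model card plus assumed accumulation settings."""

    epochs: int = 3
    per_device_batch_size: int = 16
    learning_rate: float = 1e-5
    fp16: bool = True
    gradient_checkpointing: bool = True
    # ASSUMPTION: not stated in the model card. Six accumulation steps on a
    # single device is one combination that yields an effective batch of 96.
    gradient_accumulation_steps: int = 6
    num_devices: int = 1

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


config = FineTuneConfig()
print(config.effective_batch_size)
```

If training were run through the Transformers `Seq2SeqTrainingArguments` class instead, these fields would correspond roughly to `num_train_epochs`, `per_device_train_batch_size`, `learning_rate`, `fp16`, `gradient_checkpointing`, and `gradient_accumulation_steps`.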
Evaluation Results
Evaluated on the Shrutilipi benchmark, a large-scale ASR dataset for Indian languages. WER and CER values have not been reported yet.
| Model | WER (%) | CER (%) | Improvement (%) |
|---|---|---|---|
| Base (openai/whisper-small) | N/A | N/A | - |
| This Model | N/A | N/A | N/A |
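For readers who want to fill in the table themselves, the metrics can be sketched in plain Python as below. This is a minimal illustration, not the evaluation script used for this model. Several choices here are assumptions: CER ignores spaces and counts Unicode code points (not Telugu grapheme clusters), no text normalization is applied, and the "Improvement" column is interpreted as relative WER reduction, which the model card does not define.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over code points, ignoring spaces (assumed convention)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


def relative_improvement(base_error: float, new_error: float) -> float:
    """Relative error reduction of the fine-tuned model over the base model."""
    return (base_error - new_error) / base_error
```

Multiply the returned fractions by 100 to express them as percentages. Libraries such as `jiwer` provide the same metrics with configurable normalization.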
Usage
Basic Usage
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
# Load model and processor
processor = WhisperProcessor.from_pretrained("vanshnawander/whisper-small-telugu")
model = WhisperForConditionalGeneration.from_pretrained("vanshnawander/whisper-small-telugu")
# Load audio
audio, sr = librosa.load("audio.wav", sr=16000)
# Transcribe
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
generated_ids = model.generate(input_features, language="te", task="transcribe")
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(transcription)
Using Pipeline
from transformers import pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model="vanshnawander/whisper-small-telugu",
    chunk_length_s=30,
)
result = pipe("audio.wav", generate_kwargs={"language": "te", "task": "transcribe"})
print(result["text"])
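The `chunk_length_s=30` setting makes the pipeline split long recordings into overlapping 30-second windows before transcription. The sketch below shows roughly how such windows could be laid out. It is a simplified illustration, not the library's actual implementation: `chunk_bounds` is a hypothetical helper, and the 5-second overlap on each side assumes the documented default of `stride_length_s = chunk_length_s / 6`.

```python
def chunk_bounds(total_s: float, chunk_s: float = 30.0, stride_s: float = 5.0):
    """Return (start, end) times in seconds for overlapping chunks.

    Consecutive chunks advance by chunk_s - 2 * stride_s, so neighbours
    overlap and the text at the chunk edges can be reconciled afterwards.
    """
    step = chunk_s - 2 * stride_s
    if step <= 0:
        raise ValueError("stride_s must be less than half of chunk_s")

    bounds = []
    start = 0.0
    while True:
        end = min(start + chunk_s, total_s)
        bounds.append((start, end))
        if end >= total_s:
            break  # the last chunk reaches the end of the audio
        start += step
    return bounds


# A 70-second recording is covered by three overlapping windows.
print(chunk_bounds(70.0))
```

Audio shorter than 30 seconds yields a single chunk, in which case the chunking setting has no practical effect.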
Limitations
- Optimized for Telugu speech; may not perform well on other languages
- Best performance on clear audio with minimal background noise
- May struggle with very fast speech or heavy code-mixing
Citation
If you use this model, please cite:
@misc{vanshnawander_whisper_small_telugu,
author = {AI4Bharat},
title = {vanshnawander/whisper-small-telugu},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/vanshnawander/whisper-small-telugu}
}
Acknowledgments
- OpenAI Whisper for the base model
- AI4Bharat for the Kathbath and Shrutilipi datasets
- Hugging Face for the transformers library