KandirResearch/Whisper-Small-Darija

Automatic Speech Recognition

Moroccan Arabic

Lyte committed on Mar 13

Commit 35f4069 · verified · 1 Parent(s): 7f48b06

Create README.md

README.md +72 -0

	@@ -0,0 +1,72 @@

+---
+language:
+- ary
+license: apache-2.0
+datasets:
+- Lyte/DarijaTTS-clean
+metrics:
+- wer
+base_model:
+- openai/whisper-small
+pipeline_tag: automatic-speech-recognition
+library_name: transformers
+tags:
+- asr
+- darija
+---
+# Whisper Small Darija - Lyte (Yassine Ennour)
+This model is OpenAI's `whisper-small` fine-tuned on the DarijaTTS-clean dataset. The goal of this project is to improve automatic speech recognition (ASR) for Moroccan Darija (ary).
+## Model Details
+- **Model Name**: [`Lyte/Whisper-Small-Darija`](https://huggingface.co/Lyte/Whisper-Small-Darija)
+- **Base Model**: [`openai/whisper-small`](https://huggingface.co/openai/whisper-small)
+- **Fine-Tuned On**: [`Lyte/DarijaTTS-clean`](https://huggingface.co/datasets/Lyte/DarijaTTS-clean)
+- **Language**: Moroccan Darija (ary)
+- **Task**: Automatic Speech Recognition (ASR)
+- **Dataset Size**: 19.9k training samples, 1k test samples
+## Training Progress
+Training was interrupted after step **600** and will resume from that step. Progress so far:
+| Step | Training Loss | Validation Loss | WER (%) |
+|------|--------------|----------------|---------|
+| 200  | 1.0142       | 1.0804         | 129.35 |
+| 400  | 0.8288       | 0.9905         | 72.44  |
+| 600  | 0.7618       | 0.9656         | 70.41  |
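The WER at step 200 is above 100, which can look like an error. It is possible because word error rate divides the number of substitutions, deletions, and insertions by the number of reference words, so a hypothesis with many extra words can exceed 100%. The sketch below illustrates that definition with a plain word-level edit distance. It is not the evaluation script used for the table (the card does not say how WER was computed), and the function name `word_error_rate` is made up for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Two inserted words against a one-word reference give a WER above 100.
print(word_error_rate("salam", "salam salam salam"))
```

In this example the reference has one word and the hypothesis adds two, so the error count is twice the reference length.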
+## Usage
+You can use this model with Hugging Face's `transformers` library:
+```python
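+import librosa  # assumption: any loader that returns a 16 kHz mono float array works
+import torch
+from transformers import WhisperProcessor, WhisperForConditionalGeneration
+model_id = "Lyte/whisper-small-darija"
+processor = WhisperProcessor.from_pretrained(model_id)
+model = WhisperForConditionalGeneration.from_pretrained(model_id)
+# Load the audio as a 16 kHz mono waveform; the processor expects samples, not a file path
+audio, sampling_rate = librosa.load("path_to_audio.wav", sr=16000)
+input_features = processor(audio, sampling_rate=sampling_rate, return_tensors="pt").input_features
+# Generate token IDs without tracking gradients, then decode them to text
+with torch.no_grad():
+    generated_ids = model.generate(input_features)
+predicted_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
+print(predicted_text)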
+```
+## Next Steps
+- **Resume training** from step 600 to further improve WER.
+- **Optimize hyperparameters** to reduce validation loss.
+- **Expand dataset** for better generalization.
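Resuming from step 600 requires locating the saved checkpoint first. The sketch below assumes the run was saved with the Hugging Face `Trainer` convention of `checkpoint-<step>` subdirectories inside an output directory; the card does not state which training loop was used, and `latest_checkpoint` is a hypothetical helper written for this illustration.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming convention: one directory per save, e.g. "checkpoint-600".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None

    best_step = -1
    best_path: Optional[Path] = None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        # Skip files and anything that does not follow the naming pattern.
        if match is None or not child.is_dir():
            continue
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, child
    return best_path
```

If the run does use `Trainer`, the returned path can be passed as `trainer.train(resume_from_checkpoint=str(path))` so that training continues from the saved step rather than restarting from the base model.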
+## Acknowledgments
+Special thanks to OpenAI for Whisper and Hugging Face for their amazing platform. This model is built as part of my ongoing research in ASR for Darija.
+---
+For updates and more details, stay tuned to this repository!