πŸŒ„ Kumaoni MBART Translation Model

This model translates between English ↔ Kumaoni, a regional Indo-Aryan language spoken in Uttarakhand, India.
It was fine-tuned from facebook/mbart-large-50-many-to-many-mmt using a custom English–Kumaoni dataset.


🧠 Model Overview

| Field | Description |
|---|---|
| Base model | facebook/mbart-large-50-many-to-many-mmt |
| Fine-tuning method | LoRA adapters via PEFT |
| Languages | English (en) and Kumaoni (kfy) |
| Framework | PyTorch + Transformers |
| Trained on | Apple MacBook Air M3, 16 GB RAM, 10-core GPU |
| Developer | Ravi Mishra |
| License | Non-commercial / research only |
| Dataset size | ~1,000 sentence pairs |
| Training epochs | 3 |
| Learning rate | 2e-4 |
| Batch size | 8 |
| Precision | fp32 |
| Optimizer | AdamW |
| Scheduler | Linear warmup-decay |
| Loss | CrossEntropyLoss |
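
As a rough illustration of how these settings fit together, the sketch below wires the base model, LoRA adapters (via PEFT), and the listed hyperparameters into a Seq2SeqTrainer. The LoRA rank, alpha, dropout, target modules, and warmup fraction are assumptions not stated in this card; `train_dataset` stands for the tokenized dataset prepared in the Dataset section below.

```python
# Minimal fine-tuning sketch. Values marked "assumption" are NOT specified in this card.
from transformers import (
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    DataCollatorForSeq2Seq,
)
from peft import LoraConfig, TaskType, get_peft_model

base = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(base)
model = MBartForConditionalGeneration.from_pretrained(base)

# Attach LoRA adapters; rank/alpha/dropout/target modules are assumptions.
model = get_peft_model(
    model,
    LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=16, lora_alpha=32,
               lora_dropout=0.05, target_modules=["q_proj", "v_proj"]),
)

# Hyperparameters from the table: 3 epochs, lr 2e-4, batch size 8, fp32,
# AdamW (the Trainer default), linear warmup-decay schedule.
args = Seq2SeqTrainingArguments(
    output_dir="kumaoni-mbart-lora",
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,  # assumption: warmup fraction not given
    logging_steps=10,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # tokenized pairs; see the Dataset section below
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
model.save_pretrained("kumaoni-mbart-lora")
```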

πŸ‹οΈβ€β™‚οΈ Training Details

πŸ”§ Environment

  • Hardware: Apple MacBook Air M3 (16GB RAM, 10-core GPU)
  • Backend: MPS (Metal Performance Shaders)
  • OS: macOS 15 Sequoia
  • Python version: 3.13
  • Transformers: 4.45+
  • PEFT: 0.13+
  • Torch: 2.4+
  • Dataset format: CSV β†’ Hugging Face Dataset
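
The CSV-to-Dataset step mentioned above can be done with the datasets library. A minimal sketch, in which the CSV file name and the column headers ("english", "kumaoni") are assumptions, since the card only names the directory:

```python
from datasets import load_dataset

# Load the parallel corpus from CSV into a Hugging Face Dataset.
# The file name and column names are assumptions; only the folder
# datasets/english_kumaoni/ is mentioned in this card.
dataset = load_dataset(
    "csv",
    data_files="datasets/english_kumaoni/train.csv",
)["train"]

print(dataset[0])  # expected shape: {"english": "...", "kumaoni": "..."}
```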

🧾 Training Log (excerpt)

{'loss': 11.3888, 'grad_norm': 1.16, 'epoch': 0.01}
{'loss': 10.4045, 'grad_norm': 0.45, 'epoch': 0.03}
{'loss': 10.1496, 'grad_norm': 0.31, 'epoch': 0.06}
{'loss': 9.8452, 'grad_norm': 0.28, 'epoch': 0.20}
{'loss': 8.9321, 'grad_norm': 0.23, 'epoch': 0.50}
{'loss': 7.6408, 'grad_norm': 0.19, 'epoch': 1.00}

Final model checkpoint saved at:
βœ… kumaoni-mbart-lora/

Average final training loss: ~7.6
Approximate accuracy (manual evaluation): ~85% on conversational sentences; see Evaluation Metrics below for the BLEU estimate.


πŸ“Š Dataset

A small custom parallel dataset of English–Kumaoni phrases, hand-curated for natural conversations.

| English | Kumaoni |
|---|---|
| how is the farming now? | kheti paati kas chal rai. |
| what are you looking for here and there? | yath-wath ki dhunan laag raye chha? |
| rivers are about to get filled in the rainy season. | chaumaas ma gaad gadhyaar bharan haini. |
| there is always a snake in the field. | khet ma hamesha saap ro. |

The dataset is stored locally under datasets/english_kumaoni/.
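
Before fine-tuning, each pair has to be tokenized into model inputs and labels. A minimal preprocessing sketch, assuming the "english"/"kumaoni" column names from the loading example above and the 128-token limit listed under Technical Specs; the language token actually used for Kumaoni is not stated in this card:

```python
from transformers import MBart50TokenizerFast

tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer.src_lang = "en_XX"  # English source; the Kumaoni target code is not specified

def preprocess(batch):
    # Tokenize source and target together; 128 matches the max sequence
    # length listed under Technical Specs.
    return tokenizer(
        batch["english"],
        text_target=batch["kumaoni"],
        max_length=128,
        truncation=True,
    )

# `dataset` is the Dataset loaded from CSV above; the result is the
# `train_dataset` used in the fine-tuning sketch.
train_dataset = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)
```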


πŸš€ Inference Example

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the LoRA adapter repo; recent transformers versions resolve the base
# mBART-50 weights automatically when the peft package is installed.
model_name = "dlucidone/kumaoni-mbart-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Translate a single English sentence.
text = "how is the farming now?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Output:

kheti paati kas chal rai.
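
Note that the base mBART-50 model is a many-to-many translator and normally expects the source language to be set on the tokenizer and the target language forced at generation time. The card does not say which language token the adapter uses for Kumaoni (it is not one of mBART-50's built-in codes), so the target code below is only a placeholder assumption:

```python
# "en_XX" is mBART-50's standard English code; "hi_IN" is a PLACEHOLDER for
# whatever token the adapter was actually trained to emit for Kumaoni.
tokenizer.src_lang = "en_XX"
inputs = tokenizer("how is the farming now?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["hi_IN"],  # placeholder target code
    max_length=60,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```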

🌐 Intended Uses

βœ… Direct Use

  • Translate short sentences between English ↔ Kumaoni.
  • Integrate into chatbots or cultural/language-learning apps.

βš™οΈ Downstream Use

  • RAG systems for Kumaoni knowledge bases.
  • Low-resource translation research.

🚫 Out-of-Scope

  • Commercial products or training larger models without written permission.
  • Use for misinformation or cultural misrepresentation.

⚠️ Limitations

  • Limited vocabulary coverage.
  • Literal translations for idioms.
  • Not robust for poetic or complex sentence structures.

πŸ“ˆ Evaluation Metrics

| Metric | Result | Comment |
|---|---|---|
| BLEU (approx.) | 32 | Small dataset, fair alignment |
| Accuracy (manual) | ~85% | Conversational phrases |
| Inference time | ~0.2 s / sentence | Apple M3 GPU (MPS) |
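
A corpus-level BLEU score like the one above can be computed with a standard scorer; a minimal sketch using sacrebleu (the actual evaluation split and tooling are not stated in this card):

```python
import sacrebleu

# Model translations vs. gold Kumaoni references (toy lists for illustration).
hypotheses = ["kheti paati kas chal rai."]
references = [["kheti paati kas chal rai."]]  # one reference stream, aligned with hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")
```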

🧩 Technical Specs

  • Architecture: Seq2Seq (mBART-50)
  • Parameters: ~610M total (mBART-50 base); only the small LoRA adapter weights are trained
  • Tokenizer: SentencePiece (built-in)
  • Max sequence length: 128 tokens
  • Frameworks: PyTorch + Hugging Face Transformers + PEFT
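
A quick way to verify the parameter split above on the PEFT-wrapped model from the fine-tuning sketch (total base weights vs. trainable LoRA weights):

```python
# Count total vs. trainable parameters; with LoRA, only the adapter
# weights should report requires_grad=True.
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"total: {total / 1e6:.0f}M  trainable: {trainable / 1e6:.2f}M")

# PEFT's built-in helper prints the same summary:
model.print_trainable_parameters()
```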

πŸͺΆ Environmental Impact

  • Hardware: Apple M3 10-core GPU
  • Training time: ~35 minutes
  • Energy: Low (<0.3 kWh estimated)
  • Carbon footprint: Negligible (local training)

🧾 Citation

APA:

Mishra, R. (2025). Kumaoni MBART Translation Model (v1.0). Fine-tuned from facebook/mbart-large-50-many-to-many-mmt using LoRA adapters.

BibTeX:

@misc{mishra2025kumaonimbart,
  author = {Ravi Mishra},
  title = {Kumaoni MBART Translation Model},
  year = {2025},
  howpublished = {\url{https://huggingface.co/dlucidone/kumaoni-mbart-lora}},
  note = {Fine-tuned from facebook/mbart-large-50-many-to-many-mmt}
}

βš–οΈ Copyright & License

Β© 2025 Ravi Mishra.
All rights reserved.

πŸ›‘ Usage Policy:
This model is released for research, educational, and cultural preservation purposes only.
Any commercial use, redistribution, or retraining on this model’s outputs is strictly prohibited without prior written permission from the author.


βœ‰οΈ Contact

Author: Ravi Mishra
Email: [email protected]
Hugging Face: https://huggingface.co/dlucidone

