---
license: mit
language:
- vi
base_model:
- sail/Sailor2-8B-Chat
---

# 🌟 BloomVN-8B-chat 🌟
### A fine-tuned multilingual model for the Vietnamese language

## 📋 Overview

- A bilingual text generation model with strong capabilities in both Vietnamese and English.
- The model handles a wide range of text generation tasks while maintaining high-quality output in both languages, which makes it particularly useful for Vietnamese-English content creation and language processing applications.

## 🔧 Method

The training process consists of three main steps:

- Continued pre-training (CPT) from Sailor2-8B-Chat using [unsloth](https://github.com/unslothai/unsloth)
- Fine-tuning on the [Vietnamese instruction dataset](https://huggingface.co/datasets/BlossomsAI/reduced_vietnamese_instruction_dataset)
- Refusal direction tuning based on ["Refusal in LLMs is Mediated by a Single Direction"](https://www.alignmentforum.org/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction)

## 📊 VMLU Benchmark

| EVALUATION DATE | STEM 🔬 | SOCIAL SCIENCE 🌍 | HUMANITIES 📚 | OTHERS 🎯 | AVG ⭐ |
|-----------------|--------|------------------|---------------|-----------|--------|
| 07/02/2025      | 50.72  | 62.81            | 60.47         | 55.4      | 56.56  |

## 💫 Quantization

- Coming Soon!

## 🤝 Contributors

Developed with ❤️ by [BlossomAI](https://github.com/BlossomAI)
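
## 🧭 Refusal direction sketch

The third step in the Method section builds on the idea that refusal behavior is largely carried by one direction in activation space. The snippet below is a minimal sketch of that idea, not the procedure used to train this model: it estimates a direction as the normalized difference between mean activations on two prompt sets, then projects that direction out of a vector. The function names (`refusal_direction`, `ablate`) and the toy 3-dimensional activations are hypothetical and chosen only for illustration. In practice the activations would come from a chosen layer of the model.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful: List[Vector], harmless: List[Vector]) -> Vector:
    """Unit-norm difference of mean activations (harmful minus harmless)."""
    mu_harmful = mean_vector(harmful)
    mu_harmless = mean_vector(harmless)
    diff = [a - b for a, b in zip(mu_harmful, mu_harmless)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("mean activations are identical; no direction to extract")
    return [d / norm for d in diff]


def ablate(activation: Vector, direction: Vector) -> Vector:
    """Remove the component of `activation` along the unit vector `direction`."""
    coeff = sum(a * d for a, d in zip(activation, direction))
    return [a - coeff * d for a, d in zip(activation, direction)]


# Toy activations with made-up numbers (assumption: not taken from the model).
harmful_acts = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless_acts = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

r_hat = refusal_direction(harmful_acts, harmless_acts)  # [1.0, 0.0, 0.0]
cleaned = ablate([5.0, 2.0, 1.0], r_hat)                # [0.0, 2.0, 1.0]
print(r_hat, cleaned)
```

After ablation the vector has no component along `r_hat`, while the other coordinates are unchanged. The same projection can be applied to activations at inference time or folded into weight matrices. Which variant was used for this model is not stated here.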

---
Star ⭐️ this repo if you find it valuable!