AI & ML interests

CVPR Demo Track @ CVPR 2022

Abhaykoul posted an update 3 months ago
🚀 Ever dreamed of training your own Large Language Model from scratch? What if I told you it doesn't require a supercomputer or a PhD in ML? 🤯

Introducing LLM Trainer - the educational framework that makes LLM training accessible to EVERYONE! Whether you're on a CPU-only laptop or scaling to distributed GPUs, we've got you covered. 💻➡️🖥️

Why LLM Trainer? Because existing tools are either too simplistic (hiding the magic) or too complex (requiring expert knowledge). We bridge the gap with:

🎓 Educational transparency - every component built from scratch with clear code
💻 CPU-first approach - start training immediately, no GPU needed
🔧 Full customization - modify anything you want
📈 Seamless scaling - from laptop to cluster without code changes
🤝 HuggingFace integration - works with existing models & tokenizers

Key highlights:
✅ Built-in tokenizers (BPE, WordPiece, HF wrappers)
✅ Complete Transformer implementation from scratch
✅ Optimized for CPU training
✅ Advanced features: mixed precision, gradient checkpointing, multiple generation strategies
✅ Comprehensive monitoring & metrics

Perfect for:
- Students learning transformers
- Researchers prototyping new ideas
- Developers building domain-specific models

Ready to train your first LLM? It's easier than you think!
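
Curious what "from scratch on CPU" actually means in practice? Here is a minimal, generic PyTorch sketch of a character-level language model training loop; it is not the LLM Trainer API (see the repo docs for that), and every name, hyperparameter, and the tiny corpus in it are purely illustrative.

```python
# Illustrative character-level LM training loop in plain PyTorch, CPU-friendly.
# NOTE: this is NOT the llm-trainer API; it only sketches the core idea.
import torch
import torch.nn as nn

# Tiny toy corpus and character-level vocabulary (illustrative only).
text = "hello world, this is a tiny corpus for a tiny language model. " * 50
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

block_size, d_model = 64, 128

class TinyLM(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(block_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4,
                                           dim_feedforward=256, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        T = idx.size(1)
        x = self.tok(idx) + self.pos(torch.arange(T, device=idx.device))
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(T)
        return self.head(self.blocks(x, mask=mask))

model = TinyLM(len(vocab))
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(200):  # a few CPU-friendly steps
    # Sample random windows; targets are the inputs shifted by one token.
    ix = torch.randint(len(data) - block_size - 1, (16,))
    xb = torch.stack([data[i:i + block_size] for i in ix])
    yb = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    logits = model(xb)
    loss = nn.functional.cross_entropy(logits.view(-1, logits.size(-1)), yb.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```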

🔗 Check it out: https://github.com/HelpingAI/llm-trainer
📚 Docs: Getting Started Guide
💬 Join the community: GitHub Discussions

#AI #MachineLearning #LLM #DeepLearning #OpenSource #Python #HuggingFace #NLP

Special thanks to the HuggingFace and PyTorch teams for the amazing ecosystem! 🙏
  • 1 reply
Β·
Abhaykoul posted an update 4 months ago
🚀 Dhanishtha-2.0-preview-0825 Is Here

The Intermediate Thinking Model just leveled up again.

With sharper reasoning, better tool use, and expanded capabilities, Dhanishtha-2.0-preview-0825 is now live and ready to impress.

🧠 What Makes Dhanishtha Special?
Unlike typical CoT models that only think once, Dhanishtha thinks iteratively:

> Think → Answer → Rethink → Improve → Rethink again if needed.

🔗 Try it now: HelpingAI/Dhanishtha-2.0-preview-0825
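
Prefer to run it locally? Here is a minimal sketch using the Hugging Face transformers library; the prompt, generation settings, and loading options below are my assumptions, so check the model card for the recommended setup.

```python
# Minimal local-inference sketch (settings are assumptions; see the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HelpingAI/Dhanishtha-2.0-preview-0825"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" needs accelerate installed; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user",
             "content": "Plan a 3-day trip to Kyoto, then double-check your own plan."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, including the intermediate thinking passes.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```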

🔞 Dhanishtha NSFW Preview

For those exploring more expressive and immersive roleplay scenarios, we're also releasing:

HelpingAI/Dhanishtha-nsfw
A specialized version tuned for adult-themed interactions and character-driven roleplay.

🔗 Explore it here: HelpingAI/Dhanishtha-nsfw

💬 You can also try all of these live at chat.helpingai.co
AtAndDev posted an update 5 months ago
Qwen 3 Coder is a personal attack on K2, and I love it.
It achieves near-SOTA on LCB while not being a reasoning model.
Finally people are understanding that reasoning isn't necessary for high benchmark scores...

Qwen ftw!

DECENTRALIZE DECENTRALIZE DECENTRALIZE
Abhaykoul posted an update 5 months ago
🎉 Dhanishtha-2.0-preview-0725 is Now Live

The Intermediate Thinking Model just got even better.
With the new update, Dhanishtha is now sharper, smarter, and trained further on tool use.

🧠 What Makes Dhanishtha Different?
Unlike standard CoT models that give one-shot responses, Dhanishtha thinks in layers:

> Think → Answer → Rethink → Improve → Rethink again if needed.

HelpingAI/Dhanishtha-2.0-preview-0725