FlameF0X/i3-Series
Chat with the i3 model series
Note: The models are listed in the default order set by Hugging Face, so the latest model appears at the bottom.
Note: Our first usable i3 model (the first release with Transformers support and accompanying code).
Note: Smol, stable text generator that took over 14 hours to pre-train :) --- Changes --- Trained on over 1T tokens; LoRPt layers.
Note: Previous SOTA model. Pre-trained in around 2 to 4 hours, compared with over 14 hours for the previous version. --- Changes --- Trained on over 3T tokens; other details are available in the model card.
Note: Our newest SOTA model, with ~200M parameters. Trained on the same dataset as the previous model plus wikitext; training took ~1-2 hours. --- Training --- Loss: ~1, PPL: ~3.
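The reported loss and perplexity figures are consistent with each other if the loss is a mean cross-entropy in nats, since perplexity is then just the exponential of the loss (an assumption about how these numbers were computed, not stated in the model card). A quick sanity check:

```python
import math

def perplexity(mean_ce_loss: float) -> float:
    # Perplexity = exp(mean cross-entropy loss in nats).
    return math.exp(mean_ce_loss)

# A loss of ~1 gives a perplexity of ~2.7, matching the
# reported Loss ~1 / PPL ~3 figures above.
print(round(perplexity(1.0), 2))  # → 2.72
```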