Improving Hugging Face Training Efficiency Through Packing with Flash Attention 2 • RQlee, ArthurZ, achikundu, lwtr, rganti, mayank-mishra • Aug 21, 2024 • 41
Saving Memory Using Padding-Free Transformer Layers during Finetuning • mayank-mishra • Jun 11, 2024 • 21
Aurora-M: The First Open Source Biden-Harris Executive Order Red teamed Multilingual Language Model • mayank-mishra • Apr 2, 2024 • 7