---
tags:
- molecular
- chemistry
- tokenizer
---

# MoleBERT Tokenizer

This model is a molecular tokenizer trained on Nb-based TMC.

- Trained from scratch on 400k training molecules for 30 epochs
- Test loss: 0.22

Code from: https://github.com/Rich-XGK/GTMGC

## Usage

```python
from models.mole_bert_tokenizer import MoleBERTTokenizerCollator, MoleBERTTokenizer

tokenizer = MoleBERTTokenizer.from_pretrained(...)