Update pipeline tag

#3
by nielsr HF Staff - opened

Looks like "token classification" is appropriate here.

BIOSCAN org

Thanks for the suggestion @nielsr !

After some research, we believe feature-extraction is actually the more appropriate tag for BarcodeBERT. While the model uses BertForTokenClassification as its architecture class, the primary use case is extracting embeddings/representations from DNA barcode sequences for downstream tasks, not per-token prediction like NER.

The usage pattern in our documentation reflects this:

output = model(input_seq)["hidden_states"][-1]
features = output.mean(1)  # Pool to get embeddings
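
To make the pooling step concrete, here is a minimal sketch of what averaging over the sequence axis does to the shapes. It uses a NumPy array as a stand-in for the last hidden state, so it runs without loading the model. The `mean_pool` helper and its optional `attention_mask` argument are hypothetical names used for illustration; the mask-aware branch assumes padded batches, which the snippet above does not address.

```python
import numpy as np


def mean_pool(last_hidden, attention_mask=None):
    """Average token vectors over the sequence axis.

    last_hidden: array of shape (batch, seq_len, hidden_dim).
    attention_mask: optional (batch, seq_len) array of 0/1 values.
        Hypothetical extension for padded batches; with no mask this
        matches the plain `output.mean(1)` pattern.
    """
    if attention_mask is None:
        return last_hidden.mean(axis=1)
    # Broadcast the mask over the hidden dimension and zero out padding.
    mask = attention_mask[:, :, None].astype(last_hidden.dtype)
    summed = (last_hidden * mask).sum(axis=1)
    # Guard against division by zero for fully masked rows.
    counts = np.clip(mask.sum(axis=1), 1.0, None)
    return summed / counts


# Stand-in for model(input_seq)["hidden_states"][-1]: 2 sequences, 3 tokens, 4 dims.
hidden = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
features = mean_pool(hidden)

# Same batch, but treating the last token of the first sequence as padding.
mask = np.array([[1, 1, 0], [1, 1, 1]])
masked_features = mean_pool(hidden, mask)
```

Either way the result is one fixed-size vector per sequence, shape (batch, hidden_dim), which is the kind of output the feature-extraction tag describes, rather than one label per token.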

We also note that similar DNA foundation models like DNABERT-S use feature-extraction for the same reason.

Closing this PR, but appreciate you taking a look at our model!

gwtaylor changed pull request status to closed