Update pipeline tag

by nielsr HF Staff - opened Jul 15, 2025

base: refs/heads/main ← from: refs/pr/3


nielsr

Jul 15, 2025

Looks like "token classification" is appropriate here.

Update pipeline tag (commit dbc9bd5b)

gwtaylor

BIOSCAN org 2 days ago

Thanks for the suggestion @nielsr !

After some research, we believe feature-extraction is the more appropriate tag for BarcodeBERT. Although the model's architecture class is BertForTokenClassification, its primary use case is extracting embeddings from DNA barcode sequences for downstream tasks, not per-token prediction such as NER.

The usage pattern in our documentation reflects this:

output = model(input_seq)["hidden_states"][-1]
features = output.mean(1)  # Pool to get embeddings
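
For readers unfamiliar with the pooling step, here is a minimal pure-Python sketch of what averaging over dimension 1 does, assuming the last hidden state has shape (batch, sequence length, hidden size). The `mean_pool` helper and the toy values are hypothetical and only for illustration; they are not part of BarcodeBERT or its documentation.

```python
from typing import List

Vector = List[float]


def mean_pool(last_hidden: List[List[Vector]]) -> List[Vector]:
    """Average token vectors over the sequence axis (axis 1).

    last_hidden is a nested list shaped [batch][seq_len][hidden_size],
    a stand-in for the tensor returned as hidden_states[-1].
    """
    pooled = []
    for token_vectors in last_hidden:  # one sequence at a time
        seq_len = len(token_vectors)
        # zip(*token_vectors) groups values by hidden dimension across tokens
        pooled.append([sum(dim_values) / seq_len for dim_values in zip(*token_vectors)])
    return pooled


# Toy input (assumed values): 2 sequences, 3 tokens each, hidden size 2
toy_hidden = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]],
]
features = mean_pool(toy_hidden)  # one fixed-size vector per sequence
```

The result has one vector per input sequence regardless of sequence length, which is why the output behaves as a sequence-level embedding rather than a per-token prediction.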

We also note that similar DNA foundation models like DNABERT-S use feature-extraction for the same reason.

Closing this PR, but appreciate you taking a look at our model!

gwtaylor changed pull request status to closed 2 days ago
