
MicroFlow: A Pretrained Mixture of Experts Model for Microbial Community Analysis

Model Description

MicroFlow is a pretrained language model based on the Mixtral architecture, specifically designed for analyzing microbial community composition at the species level. This model has been trained on extensive taxonomic profile data to understand and generate microbial community structures, serving as a foundation for various downstream bioinformatics applications.

Key Features

1. Architecture Design

  • Base Architecture: Mixture of Experts (MoE) model based on Mixtral
  • Parameter Scale: Approximately 130 million parameters
  • Attention Mechanism: Bidirectional (non-causal) attention implemented via a custom SDPA (Scaled Dot-Product Attention) function with GQA (Grouped-Query Attention) support
  • Tokenization: BPE (Byte-Pair Encoding) with a vocabulary of 30,020 tokens optimized for microbial taxonomy
  • Position Encoding: RoPE (Rotary Position Embedding) with theta = 1e6
  • Expert System: 8 local experts with 2 experts activated per token
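
For orientation, these hyperparameters map onto a standard Hugging Face Mixtral configuration roughly as sketched below. The values marked "assumed" (`hidden_size`, `intermediate_size`, `num_hidden_layers`, `num_attention_heads`, `num_key_value_heads`) are not published above; they are illustrative choices that land near the ~130M parameter scale and show where each stated setting lives.

```python
# A minimal configuration sketch, assuming the standard Hugging Face Mixtral
# implementation; values marked "assumed" are illustrative, not published.
from transformers import MixtralConfig

config = MixtralConfig(
    vocab_size=30020,              # BPE vocabulary optimized for microbial taxonomy
    rope_theta=1e6,                # RoPE theta as stated above
    num_local_experts=8,           # 8 local experts
    num_experts_per_tok=2,         # 2 experts activated per token
    num_attention_heads=16,        # assumed; GQA pairs this with fewer KV heads
    num_key_value_heads=4,         # assumed
    hidden_size=512,               # assumed, to land near ~130M parameters
    intermediate_size=1024,        # assumed
    num_hidden_layers=8,           # assumed
    max_position_embeddings=8192,  # matches the longest pretraining sequence length
)
```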

2. Pretraining Strategy

  • Pretraining Data: 3,256,608 microbial community samples in Parquet format (with data augmentation by randomly shuffling the order of taxa within samples)
  • Taxonomic Level: Species-level classification (with 's__' prefix removed)
  • Training Objectives:
    • Masked Language Modeling (15% masking probability)
    • BERT-style pretraining strategy with bidirectional attention
    • Pretraining at multiple sequence lengths (3,072 and 8,192 tokens)
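
The snippet below sketches the masking objective and the shuffle augmentation described above, using the standard Hugging Face MLM collator. The tokenizer path, taxa names, and separator are placeholders, not the released artifacts.

```python
import random
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("path/to/microflow")  # placeholder path

# Augmentation as described above: randomly shuffle taxa order within a sample.
# The taxa names and the space separator are illustrative assumptions.
taxa = ["Escherichia coli", "Bacteroides fragilis", "Faecalibacterium prausnitzii"]
random.shuffle(taxa)
encoded = tokenizer(" ".join(taxa))

# Standard MLM collator with the stated 15% masking probability.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,
)
batch = collator([encoded])  # input_ids with masked positions, plus MLM labels
```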

Important:
This model requires proper setup of the custom bidirectional attention mechanism before loading. Ensure you follow the setup steps in the correct order:

  1) Define the custom attention function,
  2) Register it,
  3) Configure the model with `attn_implementation='custom'`,
  4) Load the model weights.
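
The snippet below sketches that order, assuming a recent transformers release that exposes the `AttentionInterface` registry; the attention function body and the model path are illustrative stand-ins, not the released implementation.

```python
import torch
import torch.nn.functional as F
from transformers import AttentionInterface, AutoModel

# 1) Define the custom attention function: plain SDPA with is_causal=False so
#    every token attends to every other token (bidirectional), plus GQA support
#    by expanding the key/value heads to match the query heads.
def bidirectional_sdpa(module, query, key, value, attention_mask,
                       scaling=None, dropout=0.0, **kwargs):
    n_rep = query.shape[1] // key.shape[1]          # query heads per KV head
    key = torch.repeat_interleave(key, n_rep, dim=1)
    value = torch.repeat_interleave(value, n_rep, dim=1)
    attn_output = F.scaled_dot_product_attention(
        query, key, value,
        attn_mask=attention_mask,  # padding mask only; no causal mask is applied
        dropout_p=dropout,
        scale=scaling,
        is_causal=False,
    )
    return attn_output.transpose(1, 2).contiguous(), None

# 2) Register it under the name the config will reference at load time.
AttentionInterface.register("custom", bidirectional_sdpa)

# 3) and 4) Configure and load the pretrained weights with that implementation.
model = AutoModel.from_pretrained("path/to/microflow", attn_implementation="custom")
```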

The extracted embeddings capture deep semantic information about microbial communities and can be used directly for various analysis tasks without further training.
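
As one example of direct use, the sketch below mean-pools the final hidden states into a single community embedding. The pooling strategy, input formatting, and paths are assumptions; it presumes the custom attention has been registered as shown above.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/microflow")  # placeholder path
model = AutoModel.from_pretrained("path/to/microflow", attn_implementation="custom")
model.eval()

sample = "Escherichia coli Bacteroides fragilis"  # illustrative species-level taxa
inputs = tokenizer(sample, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state    # (1, seq_len, hidden_size)

# Masked mean pooling over tokens; the pooling choice is an assumption.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (1, hidden_size)
```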

Citation

If you use this pretrained model in your research, please cite:

```bibtex
@software{microflow2025,
  title = {MicroFlow: A Pretrained Mixture of Experts Model for Microbial Community Analysis},
  author = {Zhang, Chao},
  year = {2025},
  url = {https://github.com/zhangchao162/microflow},
  note = {Pretrained language model with bidirectional attention and BPE tokenization for species-level microbial community data}
}
```

Contact

For questions about the pretrained model or fine-tuning guidance, please contact [email protected]
