---
license: bsd-3-clause
tags:
- meg
- brain-signals
- speech-detection
- conformer
- libribrain
datasets:
- pnpl/LibriBrain
metrics:
- f1
library_name: pytorch
model-index:
- name: megconformer-speech-detection
  results:
  - task:
      type: audio-classification
      name: Speech classification
    dataset:
      name: LibriBrain 2025 PNPL (Standard track, speech task)
      type: pnpl/LibriBrain
      split: holdout
    metrics:
    - name: F1-macro
      type: f1
      value: 0.8890 # 88.90 %
      args:
        average: macro
---

# MEGConformer for Speech Detection

Conformer-based MEG decoder for binary speech detection, trained with 10 different random seeds for reproducibility.

## Model Performance

| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | **87.06%** | `seed-0/pytorch_model.ckpt` |
| 6 | 86.80% | `seed-6/pytorch_model.ckpt` |
| 4 | 86.62% | `seed-4/pytorch_model.ckpt` |
| 1 | 86.54% | `seed-1/pytorch_model.ckpt` |
| 2 | 86.37% | `seed-2/pytorch_model.ckpt` |
| 5 | 86.29% | `seed-5/pytorch_model.ckpt` |
| 7 | 86.18% | `seed-7/pytorch_model.ckpt` |
| 3 | 86.13% | `seed-3/pytorch_model.ckpt` |
| 8 | 85.92% | `seed-8/pytorch_model.ckpt` |
| 9 | 85.18% | `seed-9/pytorch_model.ckpt` |

- **Holdout score of seed 0:** 88.90% (F1-macro)

## Quick Start

### Load Best Model

```python
import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download the best checkpoint (seed 0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection",
    filename="seed-0/pytorch_model.ckpt",
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the model onto the chosen device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference on a dummy input of shape (batch, channels, samples):
# one 2.5 s window at 250 Hz = 625 samples (see Model Details)
meg_signal = torch.randn(1, 306, 625, device=device)  # Created directly on device

with torch.no_grad():
    logits = model(meg_signal)
    prediction = torch.argmax(logits, dim=1)  # 0=silence, 1=speech

print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")
```

## Model Details

- **Architecture**: Conformer Small
  - Hidden size: 144
  - FFN dim: 576
  - Layers: 16
  - Attention heads: 4
  - Depthwise conv kernel: 31
- **Input**: 306-channel MEG signals
- **Window size**: 2.5 seconds (625 samples at 250 Hz)
- **Output**: Binary classification (silence/speech)
- **Training**: [LibriBrain](https://huggingface.co/datasets/pnpl/LibriBrain) 2025 Standard track

## Reproducibility

Checkpoints for all 10 random seeds are provided to support reproducibility.

## Citation

```bibtex
@misc{dezuazo2025megconformerconformerbasedmegdecoder,
  title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification},
  author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
  year={2025},
  eprint={2512.01443},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.01443},
}
```

## License

The 3-Clause BSD License

## Links

- **Paper**: [arXiv:2512.01443](https://arxiv.org/abs/2512.01443)
- **Code**: [GitHub](https://github.com/neural2speech/libribrain-experiments)
- **Competition**: [LibriBrain 2025](https://neural-processing-lab.github.io/2025-libribrain-competition/)
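
## Example: Window Arithmetic

The window size given in Model Details (2.5 seconds at 250 Hz, i.e. 625 samples) implies that a longer recording has to be cut into fixed-length windows before inference. The following is a minimal sketch of that arithmetic only, not the authors' preprocessing pipeline. The function `window_bounds` and its `hop_samples` stride parameter are hypothetical names introduced for illustration, and the default of non-overlapping windows is an assumption.

```python
SFREQ_HZ = 250        # sampling rate stated in Model Details
WINDOW_SECONDS = 2.5  # window size stated in Model Details


def window_bounds(n_samples, window_seconds=WINDOW_SECONDS,
                  sfreq_hz=SFREQ_HZ, hop_samples=None):
    """Return (start, stop) sample indices of every full window.

    hop_samples is a hypothetical stride; by default windows do not
    overlap. A trailing partial window is dropped.
    """
    window_samples = int(round(window_seconds * sfreq_hz))  # 2.5 * 250 = 625
    if hop_samples is None:
        hop_samples = window_samples
    if hop_samples <= 0:
        raise ValueError("hop_samples must be positive")
    last_start = n_samples - window_samples
    return [(start, start + window_samples)
            for start in range(0, last_start + 1, hop_samples)]


# An 8-second recording (2000 samples) yields three full windows.
print(window_bounds(n_samples=2000))
# [(0, 625), (625, 1250), (1250, 1875)]
```

Each `(start, stop)` pair could then be used to slice a `(306, n_samples)` recording as `recording[:, start:stop]`, adding a leading batch dimension before passing it to the model. How windows were strided and labeled during training is not specified here, so treat the stride as a free choice.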