---
license: apache-2.0
datasets:
- tatsu-lab/alpaca
- yizhongw/self_instruct
language:
- en
base_model:
- meta-llama/Llama-2-7b-hf
- meta-llama/Llama-3.1-8B-Instruct
- mistralai/Mistral-7B-Instruct-v0.2
---

We provide a curated set of poisoned and benign fine-tuned LLMs for evaluating BAIT. The model zoo follows this file structure:

```
BAIT-ModelZoo/
├── base_models/
│   ├── BASE/MODEL/1/FOLDER
│   ├── BASE/MODEL/2/FOLDER
│   └── ...
├── models/
│   ├── id-0001/
│   │   ├── model/
│   │   │   └── ...
│   │   └── config.json
│   ├── id-0002/
│   └── ...
└── METADATA.csv
```

```base_models``` stores the pretrained LLMs downloaded from Hugging Face. We evaluate BAIT on the following three LLM architectures:

- [Llama-2-7B-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
- [Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
- [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

The ```models``` directory contains the fine-tuned models, both benign and backdoored, organized by unique identifiers. Each model folder includes:

- The model files
- A ```config.json``` file with metadata about the model, including:
  - Fine-tuning hyperparameters
  - Fine-tuning dataset
  - Whether it is backdoored or benign
  - Backdoor attack type, injected trigger, and target (if applicable)

The ```METADATA.csv``` file in the root of ```BAIT-ModelZoo``` provides a summary of all available models for easy reference.

The model zoo currently contains 91 models, and we will keep updating it with new models.
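
To make the layout concrete, below is a minimal Python sketch of how one might browse ```METADATA.csv``` and load one fine-tuned model together with its per-model ```config.json```. The paths follow the directory structure above; the ```id-0001``` folder is just an example, the CSV columns and JSON fields are whatever each file actually contains (no fixed schema is assumed), and the sketch assumes the ```model/``` subfolder holds full weights in standard Hugging Face format.

```python
# Minimal sketch for browsing the model zoo; not an official loading script.
import json
from pathlib import Path

import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

ZOO_ROOT = Path("BAIT-ModelZoo")

# Summary of all available models (columns are whatever METADATA.csv provides).
metadata = pd.read_csv(ZOO_ROOT / "METADATA.csv")
print(metadata.head())

# Pick one fine-tuned model by its identifier (example folder from the tree above).
model_dir = ZOO_ROOT / "models" / "id-0001"

# Per-model metadata: hyperparameters, dataset, benign/backdoored flag,
# and attack details if applicable.
with open(model_dir / "config.json") as f:
    model_config = json.load(f)
print(model_config)

# Load the fine-tuned weights and tokenizer with Hugging Face transformers,
# assuming full weights are stored under model_dir / "model".
tokenizer = AutoTokenizer.from_pretrained(model_dir / "model")
model = AutoModelForCausalLM.from_pretrained(model_dir / "model", torch_dtype="auto")
```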