We provide a curated set of poisoned and benign fine-tuned LLMs for evaluating BAIT. The model zoo follows this file structure:
BAIT-ModelZoo/
├── base_models/
│   ├── BASE/MODEL/1/FOLDER
│   ├── BASE/MODEL/2/FOLDER
│   └── ...
├── models/
│   ├── id-0001/
│   │   ├── model/
│   │   │   └── ...
│   │   └── config.json
│   ├── id-0002/
│   └── ...
└── METADATA.csv
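As a rough illustration of how this layout can be traversed, the sketch below lists every model folder under models/. It assumes the zoo has already been downloaded to a local directory; the helper name list_model_dirs and the local path BAIT-ModelZoo are hypothetical and not part of the BAIT codebase.

```python
from pathlib import Path


def list_model_dirs(zoo_root):
    """Return the model folders under <zoo_root>/models, sorted by name.

    Hypothetical helper: it relies only on the directory layout shown above.
    """
    models_dir = Path(zoo_root) / "models"
    if not models_dir.is_dir():
        return []
    # Keep only sub-directories that contain a config.json, per the layout.
    return sorted(
        p for p in models_dir.iterdir()
        if p.is_dir() and (p / "config.json").is_file()
    )


if __name__ == "__main__":
    # "BAIT-ModelZoo" is an assumed local download location.
    for model_dir in list_model_dirs("BAIT-ModelZoo"):
        print(model_dir.name, "->", model_dir / "model")
```

Filtering on the presence of config.json skips stray folders that do not follow the layout.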
The base_models directory stores pretrained LLMs downloaded from Hugging Face. We evaluate BAIT on 3 LLM architectures.
The models directory contains fine-tuned models, both benign and backdoored, organized by unique identifiers. Each model folder includes:
- The model files
- A config.json file with metadata about the model, including:
  - Fine-tuning hyperparameters
  - Fine-tuning dataset
  - Whether it's backdoored or benign
  - Backdoor attack type, injected trigger and target (if applicable)
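A minimal sketch of reading one model's config.json is shown below. The exact field names inside the file are not documented here, so the code simply prints whatever keys are present rather than assuming any. The helper name load_model_config and the local path are hypothetical.

```python
import json
from pathlib import Path


def load_model_config(model_dir):
    """Parse <model_dir>/config.json and return its contents.

    Hypothetical helper: assumes the file holds a single JSON object.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    # Assumed local path to one model folder in the downloaded zoo.
    config = load_model_config("BAIT-ModelZoo/models/id-0001")
    # Print every field the file defines; no key names are assumed.
    for key, value in config.items():
        print(f"{key}: {value}")
```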
The METADATA.csv file in the root of BAIT-ModelZoo provides a summary of all available models for easy reference. The model zoo currently contains 91 models, and we will keep adding new ones.
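The summary file can be inspected with Python's standard csv module, as in the sketch below. It assumes METADATA.csv starts with a header row; the column names are not listed here, so the code reports them rather than hard-coding any. The helper name read_metadata and the local path are hypothetical.

```python
import csv
from pathlib import Path


def read_metadata(zoo_root):
    """Return (column_names, rows) read from <zoo_root>/METADATA.csv.

    Hypothetical helper: assumes the first line of the file is a header row.
    """
    csv_path = Path(zoo_root) / "METADATA.csv"
    with csv_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)  # each row is a dict keyed by column name
        return reader.fieldnames or [], rows


if __name__ == "__main__":
    # "BAIT-ModelZoo" is an assumed local download location.
    columns, rows = read_metadata("BAIT-ModelZoo")
    print("columns:", columns)
    print("number of models listed:", len(rows))
```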