Model Overview
This is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0, trained with ORPO (Odds Ratio Preference Optimization) on the mlabonne/orpo-dpo-mix-40k preference dataset to improve conversational and preference-aligned response generation. Fine-tuning uses LoRA (Low-Rank Adaptation), which learns task-specific behavior through a small set of additional low-rank parameters instead of updating the full model, keeping compute and memory requirements modest.
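To make the LoRA idea concrete, here is a minimal NumPy sketch (dimensions and weights are illustrative, not the model's actual shapes): the pretrained weight W stays frozen, and only two small matrices A and B are trained, with the update scaled by lora_alpha / r as in the configuration below.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 64, 64
r, lora_alpha = 8, 16          # matches the LoRA configuration below
scaling = lora_alpha / r       # LoRA scales the low-rank update by alpha / r

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, low-rank
B = np.zeros((d_out, r))               # trainable, initialized to zero

def lora_forward(x):
    # Adapted forward pass: W x + (alpha / r) * B (A x)
    return W @ x + scaling * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero the adapter is a no-op,
# so the adapted model starts out identical to the base model.
assert np.allclose(lora_forward(x), W @ x)

# Parameter count per adapted matrix: r * (d_in + d_out) vs d_in * d_out
print(r * (d_in + d_out), "adapter params vs", d_in * d_out, "full params")
```

With r=8, the adapter adds only r * (d_in + d_out) parameters per adapted matrix, a small fraction of the full d_in * d_out weight it sits beside.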
Hyperparameters
LoRA configuration:
- r: 8
- lora_alpha: 16
- lora_dropout: 0.1

ORPO trainer configuration:
- Learning rate: 1e-5
- Max length: 2048
- Batch size: 1
- Epochs: 1
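The hyperparameters above correspond to a setup along these lines with the peft and trl libraries. This is a hedged sketch, not the card's actual training script: the output directory is illustrative, and the model, tokenizer, and dataset objects (shown commented out) must be loaded separately.

```python
from peft import LoraConfig
from trl import ORPOConfig, ORPOTrainer

# LoRA adapter settings from the list above
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# ORPO trainer settings from the list above
orpo_config = ORPOConfig(
    learning_rate=1e-5,
    max_length=2048,
    per_device_train_batch_size=1,
    num_train_epochs=1,
    output_dir="./orpo-out",  # illustrative path
)

# trainer = ORPOTrainer(
#     model=model,            # TinyLlama/TinyLlama-1.1B-Chat-v1.0, loaded separately
#     args=orpo_config,
#     train_dataset=dataset,  # mlabonne/orpo-dpo-mix-40k, preprocessed into
#                             # prompt / chosen / rejected columns
#     peft_config=peft_config,
# )
# trainer.train()
```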
Model Performance
The model was evaluated on the HellaSwag benchmark, yielding the following metrics:
- Accuracy: 46.59%
- Normalized Accuracy: 60.43%
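For context on the two metrics: on HellaSwag the evaluation harness picks the answer ending the model assigns the highest log-likelihood (Accuracy), while Normalized Accuracy divides each log-likelihood by the ending's length, removing the bias toward short endings. A toy sketch (the scores below are invented, not model outputs):

```python
# Toy illustration: raw vs length-normalized log-likelihood selection.
# The log-likelihoods are invented; real scores come from the model.
endings = {
    "a short ending": -8.0,
    "a much longer but likelier ending": -12.0,
}

# Accuracy: argmax of raw log-likelihood favors the short ending.
raw_pick = max(endings, key=endings.get)

# Normalized accuracy: argmax of log-likelihood per character of the
# ending (the harness normalizes by the ending's byte length).
norm_pick = max(endings, key=lambda e: endings[e] / len(e))

print(raw_pick)   # short ending wins on raw score
print(norm_pick)  # longer ending wins after normalization
```

The gap between the two numbers above (46.59% vs 60.43%) is typical: normalization usually helps on multiple-choice tasks with variable-length endings.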
Model tree for laurencassidy/lauren-tinyllama-1.1b-chat
Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0