Model Overview

This is a fine-tuned version of the Llama model, trained with ORPO (Odds Ratio Preference Optimization) on the mlabonne/orpo-dpo-mix-40k preference dataset to improve conversational and preference-aligned response generation. Fine-tuning uses LoRA (Low-Rank Adaptation), which learns task-specific behavior through a small number of additional low-rank parameters, keeping the computational demands of adaptation modest.
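At its core, ORPO adds an odds-ratio penalty that rewards the model for assigning higher odds to the chosen response than to the rejected one. A minimal numerical sketch of that penalty term (the function name and probability values are illustrative, not from this model's training code):

```python
import math

def odds(p):
    # Odds of an event with probability p.
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # Odds-ratio term of the ORPO loss: -log sigmoid(log odds ratio).
    log_or = math.log(odds(p_chosen) / odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_or)))

# The penalty shrinks as the model prefers the chosen response more strongly.
print(orpo_penalty(0.9, 0.1))  # small: chosen response much more likely
print(orpo_penalty(0.1, 0.9))  # large: rejected response preferred
```

When the two responses are equally likely, the penalty sits at log 2 and decreases monotonically as the chosen response's odds pull ahead.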

Hyperparameters

LoRA Configuration:

  • r=8
  • lora_alpha=16
  • lora_dropout=0.1
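The parameter savings behind these settings can be sketched numerically: LoRA freezes a base weight matrix W and trains only a rank-r update B·A, scaled by lora_alpha / r. The matrix dimensions below are toy values chosen for illustration, not the model's actual layer sizes:

```python
import numpy as np

d, k, r = 512, 512, 8  # toy layer dims; r=8 matches the LoRA rank above
alpha = 16             # lora_alpha; the update is scaled by alpha / r

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))         # frozen base weight
A = rng.standard_normal((r, k)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection (zero init)

# Effective weight after adaptation; identical to W until B is trained.
W_adapted = W + (alpha / r) * (B @ A)

full_params = d * k              # parameters in the full matrix
lora_params = A.size + B.size    # parameters LoRA actually trains
print(full_params, lora_params)  # 262144 vs 8192
```

Here LoRA trains roughly 3% of the layer's parameters, which is why adaptation fits in modest compute budgets.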

ORPO Trainer Configuration:

  • Learning Rate: 1e-5
  • Max Length: 2048
  • Batch Size: 1
  • Epochs: 1
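Assuming the TRL + PEFT stack (the card does not name the training framework, so treat this as an illustrative sketch), the hyperparameters above map onto a trainer setup roughly like this:

```python
# Sketch only: the exact training script is not stated on this card,
# so names, paths, and the commented-out wiring are assumptions.
from peft import LoraConfig
from trl import ORPOConfig, ORPOTrainer

peft_config = LoraConfig(
    r=8,               # LoRA rank
    lora_alpha=16,     # scaling factor
    lora_dropout=0.1,  # dropout on the LoRA layers
    task_type="CAUSAL_LM",
)

training_args = ORPOConfig(
    learning_rate=1e-5,
    max_length=2048,
    per_device_train_batch_size=1,
    num_train_epochs=1,
    output_dir="orpo-llama",  # hypothetical output path
)

# trainer = ORPOTrainer(
#     model=model,            # base Llama model loaded elsewhere
#     args=training_args,
#     train_dataset=dataset,  # e.g. mlabonne/orpo-dpo-mix-40k
#     peft_config=peft_config,
# )
# trainer.train()
```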

Model Performance

The model was evaluated on the HellaSwag benchmark, yielding the following metrics:

  • Accuracy: 46.59%
  • Normalized Accuracy: 60.43%
Model Details

  • Format: Safetensors
  • Model size: 1B params
  • Tensor type: F16

Model: laurencassidy/lauren-tinyllama-1.1b-chat