Q-Solv LoRA Adapter (Qwen2.5-Coder-7B) - GSM8K Python-of-Thought

This repository contains a LoRA adapter fine-tuned on an execution-verified GSM8K Python-of-Thought dataset.
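A minimal sketch of what "execution-verified" means here, under the common Python-of-Thought convention that each sample's solution code is run and kept only if its computed answer matches the GSM8K gold answer. The `answer` variable name and the `verify_solution` helper are illustrative assumptions, not the actual pipeline code.

```python
# Hedged sketch: execution verification for a Python-of-Thought sample.
# The solution snippet is executed; the sample is accepted only if the
# resulting `answer` matches the gold answer (names are illustrative).
def verify_solution(code: str, gold_answer: float) -> bool:
    scope = {}
    try:
        exec(code, scope)  # solution code is expected to define `answer`
    except Exception:
        return False  # crashing solutions are filtered out
    return abs(float(scope.get("answer", float("nan"))) - gold_answer) < 1e-6

sample = "muffins = 3 * 12\nanswer = muffins - 7"
print(verify_solution(sample, 29.0))  # → True
```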

Base Model

  • unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit

Training

  • Method: QLoRA (4-bit) + LoRA
  • Samples: 1738
  • Epochs: 2
  • GPU: RTX 4060 Ti 16GB
  • Training framework: Unsloth + TRL SFTTrainer
  • Loss masked to train only on assistant responses (chat SFT)
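The response-only loss masking above can be sketched as follows: prompt tokens get the label `-100` (the ignore index used by PyTorch cross-entropy), so only assistant-response tokens contribute to the loss. The `mask_prompt_labels` helper and the token values are illustrative, not the actual training code.

```python
# Hedged sketch: response-only loss masking for chat SFT.
# Labels set to -100 are ignored by the cross-entropy loss, so the model
# is trained only on the assistant's reply, not the prompt.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids into labels, ignoring everything before response_start."""
    labels = list(input_ids)
    for i in range(min(response_start, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: a 6-token sequence whose assistant reply begins at index 4.
ids = [101, 202, 303, 404, 505, 606]
print(mask_prompt_labels(ids, 4))  # → [-100, -100, -100, -100, 505, 606]
```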

Inference (PEFT)

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit"
adapter_id = "dainlieu/qsolv-qwen2.5-coder-7b-lora-gsm8k"

# Load the 4-bit base model, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)