# Q-Solv LoRA Adapter (Qwen2.5-Coder-7B) - GSM8K Python-of-Thought
This repository contains a LoRA adapter fine-tuned on an execution-verified GSM8K Python-of-Thought dataset.
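"Execution-verified" means each sample's solution program was actually run and its result checked against the gold GSM8K answer before being admitted to the training set. A minimal sketch of that check (the field names and the sample itself are illustrative, not the dataset's actual schema):

```python
# Illustrative shape of one execution-verified Python-of-Thought sample.
# Field names ("question", "solution_code", "gold_answer") are assumptions,
# not the dataset's actual schema.
sample = {
    "question": (
        "Natalia sold clips to 48 friends in April, and half as many "
        "in May. How many clips did she sell altogether?"
    ),
    "solution_code": (
        "april = 48\n"
        "may = april // 2\n"
        "answer = april + may\n"
    ),
    "gold_answer": 72,
}

# Verification: run the program and compare its result to the gold answer.
namespace = {}
exec(sample["solution_code"], namespace)
assert namespace["answer"] == sample["gold_answer"]
print(namespace["answer"])  # 72
```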
## Base Model

`unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit`
## Training
- Method: QLoRA (4-bit) + LoRA
- Samples: 1738
- Epochs: 2
- GPU: RTX 4060 Ti 16GB
- Training framework: Unsloth + TRL SFTTrainer
- Loss masked to train only on assistant responses (chat SFT)
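The assistant-only loss masking in the last bullet can be sketched as follows. This hand-rolled version uses the `-100` ignore index that PyTorch cross-entropy (and hence `transformers`) skips by convention; the actual run used TRL's built-in collation rather than this function:

```python
# Minimal sketch of completion-only (assistant-only) loss masking.
# -100 is the label value PyTorch's cross-entropy ignores by default.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids to labels, masking every token before the assistant reply."""
    labels = list(input_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: tokens 0..4 are the user prompt, the assistant reply
# starts at index 5, so only the last three tokens contribute to the loss.
input_ids = [101, 7, 8, 9, 102, 42, 43, 44]
labels = mask_prompt_labels(input_ids, response_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 42, 43, 44]
```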
## Inference (PEFT)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model = "unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit"
adapter_id = "dainlieu/qsolv-qwen2.5-coder-7b-lora-gsm8k"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
```
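Because the model emits a Python program rather than a bare number, the final answer is recovered by executing that program. A minimal, illustrative harness (the completion shown is hypothetical model output; a real deployment should sandbox execution in a subprocess with a timeout):

```python
import io
from contextlib import redirect_stdout

def run_pot_program(code: str) -> str:
    """Execute a generated Python-of-Thought program and capture its printed answer.

    Illustrative only: real harnesses should sandbox model output
    (subprocess, timeout, restricted builtins) before exec'ing it.
    """
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

# Hypothetical completion the adapter might emit for a GSM8K-style question.
completion = """
apples = 5
bought = 2 * apples
print(apples + bought)
"""
answer = run_pot_program(completion)
print(answer)  # "15"
```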