# Qwen2.5-1.5B-SQL-Assistant
Qwen2.5-1.5B-SQL-Assistant is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct, specialized in translating natural language questions into syntactically correct SQL queries based on a provided database schema.
This model was trained using PEFT (Parameter-Efficient Fine-Tuning) and LoRA (Low-Rank Adaptation) techniques to ensure efficient training on consumer hardware while retaining the reasoning capabilities of the base model.
## Model Description & Purpose
- Model Type: Causal Language Model (Fine-tuned with LoRA adapters)
- Base Model: Qwen 2.5 (1.5 Billion Parameters) - Instruct Version
- Primary Task: Text-to-SQL Generation
- Language: English
- Intended Use: This model is designed to act as a technical assistant. Users provide a `CREATE TABLE` statement (context) and a question, and the model generates the corresponding SQL query.
## Training Data
The model was fine-tuned on the b-mc2/sql-create-context dataset.
- Dataset Structure: Each example consists of three fields:
  - `context`: the SQL schema definition (e.g., `CREATE TABLE ...`).
  - `question`: a natural language query (e.g., "How many users are active?").
  - `answer`: the correct SQL query corresponding to the question.
- Preprocessing: The data was formatted into the standard Qwen Chat Template (`system`, `user`, `assistant` roles) to leverage the instruction-following capabilities of the base model; a minimal formatting sketch is shown below.
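The snippet below sketches this formatting step. The system prompt mirrors the one in the usage example further down; the exact prompt used during training is not documented here, so treat it as an assumption.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

def format_example(example):
    """Render one b-mc2/sql-create-context row into the Qwen chat format."""
    messages = [
        # System prompt is an assumption, mirroring the inference example below.
        {"role": "system", "content": "You are a SQL expert."},
        {"role": "user", "content": f"{example['context']}\nQuestion: {example['question']}"},
        {"role": "assistant", "content": example["answer"]},
    ]
    # Renders the messages as a single training string in Qwen's chat format
    return tokenizer.apply_chat_template(messages, tokenize=False)
```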
## Training Methodology & Hyperparameters
The model was trained using QLoRA (Quantized LoRA) to minimize memory usage.
- Technique: LoRA (Low-Rank Adaptation) with 4-bit quantization (NF4).
- Frameworks: `transformers`, `peft`, `bitsandbytes`, `trl`.
- Hardware: Trained on a single NVIDIA T4 GPU.
Hyperparameters:
- Learning Rate: 2e-4
- Batch Size: 4 per device (effective batch size increased via gradient accumulation)
- Epochs: 1
- Optimizer: `paged_adamw_32bit`
- LoRA Rank (r): 16
- LoRA Alpha: 16
- LoRA Dropout: 0.05
- Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj` (see the configuration sketch below)
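As a reference for reproducing the setup, the following is a minimal sketch of how these hyperparameters map onto `peft` and `bitsandbytes` configuration objects. The original training script is not published with this card, so the exact arguments are assumptions.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization, as used for QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter configuration matching the hyperparameters listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```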
## Evaluation Results
The model was evaluated qualitatively on a hold-out test set.
### Baseline vs. Fine-Tuned Comparison
| Feature | Base Model (Qwen 2.5-1.5B-Instruct) | Fine-Tuned Model (SQL-Assistant) |
|---|---|---|
| Response Format | Often chatty; explains the code before/after. | Concise; outputs strictly the SQL query. |
| Schema Adherence | Sometimes hallucinates column names not in the schema. | Strongly adheres to the provided `CREATE TABLE` context. |
| Syntax Accuracy | Good, but prone to minor syntax errors in complex joins. | Improved syntax accuracy on standard SQL queries. |
### Sample Test Case
- Context: `CREATE TABLE employees (name VARCHAR, dept VARCHAR, salary INT)`
- Question: "Who works in Sales and earns more than 50k?"
- Model Output: `SELECT name FROM employees WHERE dept = 'Sales' AND salary > 50000`
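As a complement to the qualitative review above, one lightweight automated check (not part of the original evaluation; sketched here as a suggestion) is to execute the generated query against an in-memory SQLite database built from the schema context, which catches syntax errors and references to non-existent columns:

```python
import sqlite3

def sql_is_valid(context: str, query: str) -> bool:
    """Return True if `query` parses and runs against the schema in `context`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(context)   # create the table(s) from the schema
        conn.execute(query)           # raises sqlite3.Error on invalid SQL
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(sql_is_valid(
    "CREATE TABLE employees (name VARCHAR, dept VARCHAR, salary INT)",
    "SELECT name FROM employees WHERE dept = 'Sales' AND salary > 50000",
))  # True
```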
## Limitations & Known Issues
- Scope: The model is specialized for SQL generation. It may not perform as well on general creative writing or open-ended chat tasks compared to the base model.
- Context Dependency: The model relies heavily on the provided schema context. If column names are ambiguous or missing from the context, the model may fail or hallucinate.
- Complexity: While effective for standard queries (`SELECT`, `JOIN`, `WHERE`, `GROUP BY`), it may struggle with extremely complex nested sub-queries or database-specific proprietary functions (e.g., specific Oracle/Postgres extensions).
## How to Use (Code Example)
You can load this model using the `peft` and `transformers` libraries. Since this is an adapter, you need to load the base model first.
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# 1. Load the Base Model
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16,
)

# 2. Load the Fine-Tuned Adapters
adapter_model_id = "manuelaschrittwieser/Qwen2.5-1.5B-SQL-Assistant"
model = PeftModel.from_pretrained(base_model, adapter_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# 3. Define Context and Question
context = "CREATE TABLE students (id INT, name VARCHAR, grade INT, subject VARCHAR)"
question = "List the names of students in grade 10 who study Math."

# 4. Format Prompt
messages = [
    {"role": "system", "content": "You are a SQL expert."},
    {"role": "user", "content": f"{context}\nQuestion: {question}"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# 5. Generate SQL
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print("Generated SQL:")
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())
```
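If you plan to serve the model without the `peft` dependency at inference time, the adapters can be merged into the base weights using the standard `merge_and_unload` pattern. A minimal sketch follows; the output directory name is illustrative:

```python
# Merge the LoRA adapters into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("Qwen2.5-1.5B-SQL-Assistant-merged")  # illustrative path
tokenizer.save_pretrained("Qwen2.5-1.5B-SQL-Assistant-merged")
```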