---
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
license: apache-2.0
language:
  - en
tags:
  - trading
  - finance
  - hyperliquid
  - perpetuals
  - defi
  - lora
  - dpo
  - sft
  - trl
  - base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
model_name: HyperLLM-4b
pipeline_tag: text-generation
---

# HyperLLM-4b v0.6

A specialized 4B-parameter language model fine-tuned for Hyperliquid perpetual DEX trading assistance. Built on Qwen3-4B-Instruct-2507 with LoRA adapters trained via SFT followed by DPO.

## Model Description

HyperLLM is designed to assist with:
- **Position sizing calculations** - Risk-based position sizing with proper decimal handling
- **API structure understanding** - Hyperliquid exchange API request/response formats
- **Trading mechanics** - Perpetual futures concepts, margin modes, order types
- **Parameter validation** - Validating trade parameters against exchange constraints
- **Edge case handling** - Boundary conditions and unusual trading scenarios

## Version History

### v0.6 (Current - March 18, 2026)

**Training Pipeline:** SFT (~6,700 examples) + DPO (~1,800 preference pairs)

v0.6 is a **recovery release** that fixes evaluation extraction bugs and includes targeted training improvements.

**Key Changes from v0.5:**

| Change | v0.5 | v0.6 | Impact |
|--------|------|------|--------|
| SFT Dataset Size | 14,260 | ~6,700 | Less dilution, more focused |
| General Instructions | 5,711 | 1,200 | Reduced interference |
| Adversarial DPO Pairs | Diluted 2:1 | Doubled (400) | Better % handling |
| Market Knowledge | Added | Removed | Cleaner, more precise |
| Answer Format | None | Enforced | Better extraction |

**Major Improvements over v0.4:**

| Category | v0.4 | v0.6 | Change (percentage points) |
|----------|------|------|--------|
| **Overall** | 75.0% | **90.2%** | **+15.2** |
| Adversarial % | 71.0% | **93.0%** | **+22.0** |
| Multi-step | 32.0% | **92.3%** | **+60.3** |
| Position Sizing | 81.7% | **98.3%** | **+16.6** |
| Edge Cases | 90.0% | **95.0%** | **+5.0** |
| General Capability | 96.4% | **98.2%** | **+1.8** |
| Trading Mechanics | 80.0% | **90.0%** | **+10.0** |
| Parameter Validation | 100% | **100%** | Maintained |

*Note: v0.6 results reflect corrected evaluation scoring after fixing an extraction bug that was grabbing question values instead of computed answers.*

### v0.5 (March 16, 2026)

**Training Pipeline:** SFT (14,260 examples) + DPO (3,057 pairs)

**Issues:** Dataset dilution caused a 4.4-point regression relative to v0.4. The doubled general-instruction set interfered with the specialized training.

### v0.4 (March 11, 2026)

**Training Pipeline:** SFT (6,782 examples) + DPO (1,400 pairs)

Established the baseline, with strong adversarial percentage handling (71%) and 100% on parameter validation.

### v0.3 (March 6, 2026)

**Training Pipeline:** SFT (7,028 examples) + DPO (1,400 pairs)

First stable release with comprehensive evaluation across 9 categories.

## Evaluation Results (v0.6)

Evaluated on 337 questions across 9 categories:

| Category | Questions | Score | Accuracy |
|----------|-----------|-------|----------|
| Parameter Validation | 15 | 15.0/15 | **100%** |
| Position Sizing Math | 60 | 59.0/60 | **98.3%** |
| General Capability | 55 | 54.0/55 | **98.2%** |
| Edge Cases | 40 | 38.0/40 | **95.0%** |
| Adversarial Percentage | 100 | 93.0/100 | **93.0%** |
| Multi-step Reasoning | 30 | 27.7/30 | **92.3%** |
| Trading Mechanics | 10 | 9.0/10 | **90.0%** |
| Factual | 15 | 5.0/15 | 33.3% |
| API Structure | 12 | 3.3/12 | 27.5% |
| **Overall** | **337** | **304.0/337** | **90.2%** |
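
The overall figure is a question-weighted aggregate, not a mean of the per-category percentages. A minimal sketch that recomputes it from the table above; the `results` dictionary and the function name are illustrative and not part of the evaluation harness:

```python
# (questions, score) per category, copied from the table above.
results = {
    "Parameter Validation": (15, 15.0),
    "Position Sizing Math": (60, 59.0),
    "General Capability": (55, 54.0),
    "Edge Cases": (40, 38.0),
    "Adversarial Percentage": (100, 93.0),
    "Multi-step Reasoning": (30, 27.7),
    "Trading Mechanics": (10, 9.0),
    "Factual": (15, 5.0),
    "API Structure": (12, 3.3),
}

def overall_accuracy(results):
    """Return (total_score, total_questions, accuracy_percent)."""
    total_questions = sum(q for q, _ in results.values())
    total_score = sum(s for _, s in results.values())
    return total_score, total_questions, 100.0 * total_score / total_questions

score, questions, pct = overall_accuracy(results)
print(f"{score:.1f}/{questions} = {pct:.1f}%")
```

An unweighted mean of the nine category percentages would be lower (about 80.8%), because the two weakest categories contain relatively few questions.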

### Evaluation Methodology

v0.6 introduces a robust evaluation system with question-aware extraction:

1. **Question Value Exclusion** - Parser identifies all numeric values in the question (dollar amounts, percentages, leverage) and excludes them from answer extraction
2. **Multi-Stage Extraction** - Prioritizes JSON blocks > Final Answer sections > Explicit markers > Context-aware patterns
3. **Confidence Scoring** - Each extraction includes confidence scores for quality assurance

This fixes the 17% false negative rate (53 extraction bugs) that affected earlier evaluations.
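
The idea behind question value exclusion can be sketched as follows. This is a simplified illustration, not the actual evaluator: it implements only two of the stages listed above (an explicit "Final Answer" marker, then a fallback), handles unsigned numbers only, and uses placeholder confidence values. All names are hypothetical.

```python
import re

# Unsigned numbers, with or without thousands separators (e.g. 50,000 or 3450.5).
NUMBER = re.compile(r"(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")

def find_numbers(text):
    """Return every numeric literal in text as a float."""
    return [float(m.replace(",", "")) for m in NUMBER.findall(text)]

def extract_answer(question, response):
    """Return (value, confidence), or (None, 0.0) if nothing is found."""
    # Stage 1: an explicit "Final Answer" marker is trusted most.
    marker = re.search(r"final answer[:\s*]*([^\n]*)", response, re.IGNORECASE)
    if marker:
        values = find_numbers(marker.group(1))
        if values:
            return values[0], 0.9  # placeholder confidence
    # Stage 2: take the last number that does not also appear in the question.
    question_values = set(find_numbers(question))
    candidates = [v for v in find_numbers(response) if v not in question_values]
    if candidates:
        return candidates[-1], 0.5  # placeholder confidence
    return None, 0.0

question = ("I have a $50,000 account and want to risk 2% on an ETH trade. "
            "Entry at $3,450, stop loss at $3,400. What's my position size?")
response = ("Risk amount = $50,000 * 2% = $1,000\n"
            "Stop distance = $3,450 - $3,400 = $50\n"
            "Position size = $1,000 / $50 = 20 ETH")
print(extract_answer(question, response))
```

One trade-off of this approach is that a correct answer which happens to equal a value stated in the question is skipped by the fallback stage, which is one reason explicit answer markers are given priority.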

## Training Configuration

### LoRA Parameters
```python
{
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "use_rslora": True,
    "use_dora": True
}
```
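
With `use_rslora` enabled, PEFT scales the adapter update by `lora_alpha / sqrt(r)` rather than the standard `lora_alpha / r`. A small sketch of what that implies for the values above; the helper name is illustrative, and DoRA's magnitude/direction decomposition is not modeled here:

```python
import math

def lora_scaling(lora_alpha, r, use_rslora):
    """Scaling factor applied to the low-rank update."""
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized LoRA
    return lora_alpha / r                 # standard LoRA

print(lora_scaling(128, 64, use_rslora=True))   # 16.0
print(lora_scaling(128, 64, use_rslora=False))  # 2.0
```

At rank 64 the rank-stabilized factor is eight times larger than the standard one, which is worth keeping in mind when comparing this configuration against non-rsLoRA setups.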

### SFT Hyperparameters
```python
{
    "learning_rate": 1e-5,
    "epochs": 5,  # With early stopping
    "batch_size": 4,
    "gradient_accumulation_steps": 2,
    "warmup_ratio": 0.10,
    "max_length": 4096
}
```
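
For orientation, these settings imply an effective batch size of 8 per optimizer step. The sketch below derives a rough step budget, assuming a single GPU, exactly 6,700 training examples (the card only gives an approximate count), and no early stopping; the function is illustrative and trainer implementations may round slightly differently:

```python
import math

def training_schedule(num_examples, batch_size, grad_accum, epochs,
                      warmup_ratio, num_devices=1):
    """Estimate optimizer steps for a run; an upper bound if early stopping triggers."""
    effective_batch = batch_size * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": round(total_steps * warmup_ratio),  # approximate
    }

print(training_schedule(6700, batch_size=4, grad_accum=2, epochs=5, warmup_ratio=0.10))
```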

### DPO Hyperparameters
```python
{
    "beta": 0.05,
    "learning_rate": 5e-7,
    "epochs": 2,
    "batch_size": 4,
    "gradient_accumulation_steps": 2
}
```
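
To illustrate the role of `beta`, here is a sketch of the standard sigmoid DPO loss for a single preference pair. The card does not state which loss variant was used, so this is an assumption, and the log-probabilities in the example are made up:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.05):
    """Sigmoid DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio)))."""
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Hypothetical sequence log-probabilities for one pair.
print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
```

A small `beta` such as 0.05 keeps the margin small for a given log-ratio difference, so the policy is penalized less sharply for drifting from the reference model than it would be with a larger value.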

### Training Data Distribution

**SFT (~6,700 examples):**

| Category | Examples | % |
|----------|----------|---|
| General Instruction | 1,200 | 17.9% |
| Position Sizing | 800 | 11.9% |
| Parameter Validation | 700 | 10.4% |
| Edge Cases | 600 | 9.0% |
| API Structure (Enhanced) | 500 | 7.5% |
| Adversarial Percentages | 500 | 7.5% |
| Answer Format | 400 | 6.0% |
| Trading Mechanics | 400 | 6.0% |
| Leverage/Percentage | 300 | 4.5% |
| Factual | 300 | 4.5% |
| Cod3x Reasoning | 300 | 4.5% |
| Other | 700 | 10.4% |

**DPO (~1,800 pairs):**

| Failure Mode | Pairs | % |
|--------------|-------|---|
| Percentage Confusion | 400 | 22.2% |
| API Structure | 250 | 13.9% |
| Excessive Leverage | 250 | 13.9% |
| Final Answer | 200 | 11.1% |
| Risk Violation | 200 | 11.1% |
| Leverage Calculation | 150 | 8.3% |
| Position Sizing | 150 | 8.3% |
| Policy Bypass | 100 | 5.6% |
| Uncertainty Caution | 100 | 5.6% |

## Usage

### With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

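# Generate (add_generation_prompt=True appends the assistant turn header)
messages = [{"role": "user", "content": "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))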
```

### With 4-bit Quantization (Low VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
```

## Example Use Cases

### Position Sizing
```
User: I have a $50,000 account and want to risk 2% on an ETH trade.
      Entry at $3,450, stop loss at $3,400. What's my position size?

Model: Risk amount = $50,000 * 2% = $1,000
       Stop distance = $3,450 - $3,400 = $50
       Position size = $1,000 / $50 = 20 ETH

       **Final Answer: 20 ETH**
```
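
The arithmetic in this example can be reproduced with a few lines of Python. This is a reference sketch for checking model output, not code shipped with the model; the function name is illustrative, and inputs are passed as strings so that `Decimal` keeps them exact:

```python
from decimal import Decimal

def position_size(account, risk_pct, entry, stop):
    """Risk-based size: (account * risk%) / |entry - stop|, in units of the asset."""
    risk_amount = Decimal(account) * Decimal(risk_pct) / Decimal(100)
    stop_distance = abs(Decimal(entry) - Decimal(stop))
    if stop_distance == 0:
        raise ValueError("entry and stop loss must differ")
    return risk_amount / stop_distance

# $50,000 account, 2% risk, entry $3,450, stop $3,400
print(position_size("50000", "2", "3450", "3400"))
```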

### Leverage Calculation
```
User: Account $10,000. Position $50,000. What's the leverage?

Model: Leverage = Position Value / Account Balance
       Leverage = $50,000 / $10,000 = 5x

       **Final Answer: 5x**
```

### Parameter Validation
```
User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2).
       Valid sizes would be 1.23 or 1.24.
```
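
The decimal-place rule in this example can be checked deterministically. The sketch below covers only the `szDecimals` constraint and assumes no other exchange limits apply; the function name is hypothetical:

```python
from decimal import Decimal, ROUND_DOWN, ROUND_UP

def validate_size(size, sz_decimals):
    """Return (is_valid, nearest_valid_below, nearest_valid_above)."""
    value = Decimal(size)
    step = Decimal(1).scaleb(-sz_decimals)  # e.g. 0.01 for sz_decimals=2
    lower = value.quantize(step, rounding=ROUND_DOWN)
    upper = value.quantize(step, rounding=ROUND_UP)
    return value == lower, lower, upper

print(validate_size("1.234", 2))
print(validate_size("1.23", 2))
```

Passing the size as a string avoids binary floating-point artifacts that would otherwise make a value such as 1.23 appear to have extra digits.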

### Adversarial Percentage Handling
```
User: Increase my position by 50%

Model: I need clarification. "Increase by 50%" could mean:
       1. Add 50% of current position (1.5x total)
       2. Increase to 50% of account (absolute)
       Which do you mean?
```
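
The two readings can lead to quite different orders, which is why the model asks rather than guesses. A sketch with made-up numbers, assuming the "absolute" reading means a position whose notional value equals the given percentage of the account; all names and inputs are hypothetical:

```python
def increase_interpretations(current_size, account_value, mark_price, pct):
    """Resulting position size under both readings of 'increase by pct%'."""
    relative = current_size * (1 + pct / 100)            # add pct% of the current position
    absolute = account_value * (pct / 100) / mark_price  # notional equal to pct% of account
    return {"relative": relative, "absolute": absolute}

# Hypothetical: 2 ETH open, $20,000 account, ETH at $2,500
print(increase_interpretations(2.0, 20_000.0, 2_500.0, 50))
```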

## Limitations

- **Factual Knowledge:** 33.3% accuracy - specific Hyperliquid URLs and fee structures need improvement
- **API Structure:** 27.5% accuracy - exact JSON field names still challenging

## Hardware Requirements

| Mode | VRAM | Notes |
|------|------|-------|
| bfloat16 | ~10GB | Half-precision inference (no quantization) |
| 4-bit | ~4GB | Quantized inference |
| 8-bit | ~6GB | INT8 quantization |
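
As a rough cross-check, the weights alone account for most of these figures. The sketch below assumes a nominal 4 billion parameters and ignores activations, the KV cache, the adapter, and quantization metadata, which is roughly what the remaining headroom in the table covers:

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for model weights only, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

for mode, bits in [("bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{mode}: {weight_memory_gb(4e9, bits):.1f} GB")
```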

## Training Hardware

- **Hardware:** NVIDIA A100 80GB SXM
- **SFT Duration:** ~25 minutes
- **DPO Duration:** ~20 minutes
- **Total Cost:** ~$2.00 (RunPod)

## Framework Versions

- PEFT: 0.18.1
- TRL: 0.29.0
- Transformers: 5.2.0
- PyTorch: 2.10.0

## License

Apache 2.0

## Citation

```bibtex
@misc{hyperllm2026,
  title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}
```