---
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
license: apache-2.0
language:
- en
tags:
- trading
- finance
- hyperliquid
- perpetuals
- defi
- lora
- dpo
- sft
- trl
- base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
model_name: HyperLLM-4b
pipeline_tag: text-generation
---

# HyperLLM-4b v0.6

A specialized 4B parameter language model fine-tuned for Hyperliquid perpetual DEX trading assistance. Built on Qwen3-4B-Instruct-2507 using LoRA adapters trained with SFT followed by DPO.

## Model Description

HyperLLM is designed to assist with:

- **Position sizing calculations** - Risk-based position sizing with proper decimal handling
- **API structure understanding** - Hyperliquid exchange API request/response formats
- **Trading mechanics** - Perpetual futures concepts, margin modes, order types
- **Parameter validation** - Validating trade parameters against exchange constraints
- **Edge case handling** - Boundary conditions and unusual trading scenarios

## Version History

### v0.6 (Current - March 18, 2026)

**Training Pipeline:** SFT (~6,700 examples) + DPO (~1,800 preference pairs)

v0.6 is a **recovery release** that fixes evaluation extraction bugs and includes targeted training improvements.

**Key Changes from v0.5:**

| Change | v0.5 | v0.6 | Impact |
|--------|------|------|--------|
| SFT Dataset Size | 14,260 | ~6,700 | Less dilution, more focused |
| General Instructions | 5,711 | 1,200 | Reduced interference |
| Adversarial DPO Pairs | Diluted 2:1 | Doubled (400) | Better % handling |
| Market Knowledge | Added | Removed | Cleaner, more precise |
| Answer Format | None | Enforced | Better extraction |

**Major Improvements over v0.4:**

| Category | v0.4 | v0.6 | Change |
|----------|------|------|--------|
| **Overall** | 75.0% | **90.2%** | **+15.2%** |
| Adversarial % | 71.0% | **93.0%** | **+22.0%** |
| Multi-step | 32.0% | **92.3%** | **+60.3%** |
| Position Sizing | 81.7% | **98.3%** | **+16.6%** |
| Edge Cases | 90.0% | **95.0%** | **+5.0%** |
| General Capability | 96.4% | **98.2%** | **+1.8%** |
| Trading Mechanics | 80.0% | **90.0%** | **+10.0%** |
| Parameter Validation | 100% | **100%** | Maintained |

*Note: v0.6 results reflect corrected evaluation scoring after fixing an extraction bug that was grabbing question values instead of computed answers.*

### v0.5 (March 16, 2026)

**Training Pipeline:** SFT (14,260 examples) + DPO (3,057 pairs)

**Issues:** Dataset dilution caused a 4.4% regression from v0.4. The doubled general-instruction set interfered with the specialized training.

### v0.4 (March 11, 2026)

**Training Pipeline:** SFT (6,782 examples) + DPO (1,400 pairs)

Established a baseline with strong adversarial percentage handling (71%) and 100% parameter validation.

### v0.3 (March 6, 2026)

**Training Pipeline:** SFT (7,028 examples) + DPO (1,400 pairs)

First stable release with comprehensive evaluation across 9 categories.

## Evaluation Results (v0.6)

Evaluated on 337 questions across 9 categories:

| Category | Questions | Score | Accuracy |
|----------|-----------|-------|----------|
| Parameter Validation | 15 | 15.0/15 | **100%** |
| Position Sizing Math | 60 | 59.0/60 | **98.3%** |
| General Capability | 55 | 54.0/55 | **98.2%** |
| Edge Cases | 40 | 38.0/40 | **95.0%** |
| Adversarial Percentage | 100 | 93.0/100 | **93.0%** |
| Multi-step Reasoning | 30 | 27.7/30 | **92.3%** |
| Trading Mechanics | 10 | 9.0/10 | **90.0%** |
| Factual | 15 | 5.0/15 | 33.3% |
| API Structure | 12 | 3.3/12 | 27.5% |
| **Overall** | **337** | **304.0/337** | **90.2%** |

### Evaluation Methodology

v0.6 introduces a robust evaluation system with question-aware extraction:

1. **Question Value Exclusion** - Parser identifies all numeric values in the question (dollar amounts, percentages, leverage) and excludes them from answer extraction
2. **Multi-Stage Extraction** - Prioritizes JSON blocks > Final Answer sections > Explicit markers > Context-aware patterns
3. **Confidence Scoring** - Each extraction includes confidence scores for quality assurance

This fixes the 17% false negative rate (53 extraction bugs) that affected earlier evaluations.

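
The three steps above can be illustrated with a minimal sketch. This is not the actual evaluation harness: the function names (`numbers_in`, `extract_answer`), the `"answer"` JSON key, the `Answer:`/`Result:` markers, and the confidence values are all assumptions chosen for illustration. The sketch also ignores signs, so it only handles non-negative numbers.

```python
import json
import re

# Unsigned integers and decimals, with optional thousands separators (50,000 or 3.5).
NUMBER = re.compile(r"(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")


def numbers_in(text):
    """Return every numeric value found in text as a float."""
    return [float(tok.replace(",", "")) for tok in NUMBER.findall(text)]


def extract_answer(question, response):
    """Return (value, confidence, stage); value is None if nothing is found.

    Confidence values are illustrative placeholders, not the real harness's.
    """
    question_values = set(numbers_in(question))

    def first_new_number(text):
        # Skip numbers that merely repeat a value from the question.
        for value in numbers_in(text):
            if value not in question_values:
                return value
        return None

    # Stage 1: a flat JSON object with a numeric "answer" field (hypothetical key).
    for block in re.findall(r"\{[^{}]*\}", response):
        try:
            payload = json.loads(block)
        except json.JSONDecodeError:
            continue
        value = payload.get("answer")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return float(value), 0.95, "json"

    # Stage 2: a "Final Answer:" line.
    match = re.search(r"Final Answer:\s*(.+)", response, flags=re.IGNORECASE)
    if match:
        value = first_new_number(match.group(1))
        if value is not None:
            return value, 0.9, "final_answer"

    # Stage 3: other explicit markers (hypothetical marker names).
    match = re.search(r"\b(?:Answer|Result):\s*(.+)", response, flags=re.IGNORECASE)
    if match:
        value = first_new_number(match.group(1))
        if value is not None:
            return value, 0.7, "marker"

    # Stage 4: fall back to the last number that does not appear in the question.
    candidates = [v for v in numbers_in(response) if v not in question_values]
    if candidates:
        return candidates[-1], 0.5, "fallback"
    return None, 0.0, "none"


question = (
    "I have a $50,000 account and want to risk 2% on an ETH trade. "
    "Entry at $3,450, stop loss at $3,400. What's my position size?"
)
response = (
    "Risk amount = $50,000 * 2% = $1,000\n"
    "Stop distance = $3,450 - $3,400 = $50\n"
    "Position size = $1,000 / $50 = 20 ETH\n"
    "**Final Answer: 20 ETH**"
)
print(extract_answer(question, response))  # (20.0, 0.9, 'final_answer')
```

A naive parser that takes the first number in the response would return 50000 here, which is a value copied from the question. Excluding question values has a cost this sketch does not address: a correct answer that happens to equal a number in the question would be skipped in the text-based stages.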
## Training Configuration

### LoRA Parameters

```python
{
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "use_rslora": True,
    "use_dora": True
}
```

### SFT Hyperparameters

```python
{
    "learning_rate": 1e-5,
    "epochs": 5,  # With early stopping
    "batch_size": 4,
    "gradient_accumulation_steps": 2,
    "warmup_ratio": 0.10,
    "max_length": 4096
}
```

### DPO Hyperparameters

```python
{
    "beta": 0.05,
    "learning_rate": 5e-7,
    "epochs": 2,
    "batch_size": 4,
    "gradient_accumulation_steps": 2
}
```

### Training Data Distribution

**SFT (~6,700 examples):**

| Category | Examples | % |
|----------|----------|---|
| General Instruction | 1,200 | 17.9% |
| Position Sizing | 800 | 11.9% |
| Parameter Validation | 700 | 10.4% |
| Edge Cases | 600 | 9.0% |
| API Structure (Enhanced) | 500 | 7.5% |
| Adversarial Percentages | 500 | 7.5% |
| Answer Format | 400 | 6.0% |
| Trading Mechanics | 400 | 6.0% |
| Leverage/Percentage | 300 | 4.5% |
| Factual | 300 | 4.5% |
| Cod3x Reasoning | 300 | 4.5% |
| Other | 700 | 10.4% |

**DPO (~1,800 pairs):**

| Failure Mode | Pairs | % |
|--------------|-------|---|
| Percentage Confusion | 400 | 22.2% |
| API Structure | 250 | 13.9% |
| Excessive Leverage | 250 | 13.9% |
| Final Answer | 200 | 11.1% |
| Risk Violation | 200 | 11.1% |
| Leverage Calculation | 150 | 8.3% |
| Position Sizing | 150 | 8.3% |
| Policy Bypass | 100 | 5.6% |
| Uncertainty Caution | 100 | 5.6% |

## Usage

### With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

# Generate
messages = [{"role": "user", "content":
    "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn so the model answers
    return_tensors="pt",
    return_dict=True,  # return input_ids and attention_mask together
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With 4-bit Quantization (Low VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
```

## Example Use Cases

### Position Sizing

```
User: I have a $50,000 account and want to risk 2% on an ETH trade. Entry at $3,450, stop loss at $3,400. What's my position size?

Model: Risk amount = $50,000 * 2% = $1,000
Stop distance = $3,450 - $3,400 = $50
Position size = $1,000 / $50 = 20 ETH

**Final Answer: 20 ETH**
```

### Leverage Calculation

```
User: Account $10,000. Position $50,000. What's the leverage?

Model: Leverage = Position Value / Account Balance
Leverage = $50,000 / $10,000 = 5x

**Final Answer: 5x**
```

### Parameter Validation

```
User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2). Valid sizes would be 1.23 or 1.24.
```

### Adversarial Percentage Handling

```
User: Increase my position by 50%

Model: I need clarification. "Increase by 50%" could mean:
1. Add 50% of current position (1.5x total)
2. Increase to 50% of account (absolute)

Which do you mean?
```

## Limitations

- **Factual Knowledge:** 33.3% accuracy - specific Hyperliquid URLs and fee structures need improvement
- **API Structure:** 27.5% accuracy - exact JSON field names still challenging

## Hardware Requirements

| Mode | VRAM | Notes |
|------|------|-------|
| bfloat16 | ~10GB | Unquantized half-precision inference |
| 4-bit | ~4GB | Quantized inference |
| 8-bit | ~6GB | INT8 quantization |

## Training Hardware

- **Hardware:** NVIDIA A100 80GB SXM
- **SFT Duration:** ~25 minutes
- **DPO Duration:** ~20 minutes
- **Total Cost:** ~$2.00 (RunPod)

## Framework Versions

- PEFT: 0.18.1
- TRL: 0.29.0
- Transformers: 5.2.0
- PyTorch: 2.10.0

## License

Apache 2.0

## Citation

```bibtex
@misc{hyperllm2026,
  title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}
```