Realistic Vision V6.0 B1 (Axera Optimized)
This repository contains an optimized derivative of Realistic Vision V6.0 B1, fused with LCM-LoRA for high-performance inference on Axera hardware (AX650N / LLM8850).
Note: This repository is a compilation of existing open-source works. The maintainer of this repository is not the original creator of the base models or the conversion tools, but has performed the compilation, graph surgery, and quantization for the Axera NPU platform.
Model Description
- Base Model: Realistic Vision V6.0 B1 (noVAE)
- VAE: sd-vae-ft-mse
- Acceleration: LCM-LoRA (Fused)
- Target Hardware: Axera AX650N / LLM8850 (NPU3)
- Primary Hardware Targets:
  - Raspberry Pi 5 (via Axera M.2 Accelerator Card)
  - M5Stack LLM-8850 Card (Documentation)
- Format: Axera .axmodel (compiled via Pulsar2 v5.1)
Performance Metrics
Benchmarks performed on a Raspberry Pi 5 (8GB) with an Axera AX650N M.2 Accelerator:
| Component | Latency |
|---|---|
| Text Encoder | ~14 ms |
| U-Net (4 steps) | ~1716 ms |
| VAE Decoder | ~936 ms |
| Total Inference | ~2.7 seconds |
| Model Loading | ~15.7 seconds |
Note: Total inference time is the sum of the three stages above for a single 512x512 image with 4 denoising steps. It does not include model loading.
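The arithmetic behind the table can be checked with a short sketch. It assumes the three stages run sequentially and that model loading is a one-time cost per process; the function names and the 10-image example are illustrative only, not part of this repository.

```python
# Stage latencies in milliseconds, copied from the table above.
STAGE_LATENCY_MS = {
    "text_encoder": 14,
    "unet_4_steps": 1716,
    "vae_decoder": 936,
}
UNET_STEPS = 4
MODEL_LOAD_MS = 15_700  # ~15.7 seconds, from the table


def total_inference_ms(stages):
    """Sum of all stage latencies (assumes the stages run one after another)."""
    return sum(stages.values())


def unet_ms_per_step(stages, steps=UNET_STEPS):
    """Average U-Net latency for a single denoising step."""
    return stages["unet_4_steps"] / steps


def amortized_ms_per_image(stages, load_ms, n_images):
    """Average time per image when one model load is spread over n_images."""
    return total_inference_ms(stages) + load_ms / n_images


if __name__ == "__main__":
    total = total_inference_ms(STAGE_LATENCY_MS)
    print(f"total inference: {total} ms (~{total / 1000:.1f} s)")  # 2666 ms (~2.7 s)
    print(f"U-Net per step: {unet_ms_per_step(STAGE_LATENCY_MS):.0f} ms")  # 429 ms
    per_image = amortized_ms_per_image(STAGE_LATENCY_MS, MODEL_LOAD_MS, 10)
    print(f"per image over 10 images, load included: {per_image:.0f} ms")  # 4236 ms
```

The U-Net accounts for roughly two thirds of the per-image time. If the models stay loaded across images, the load cost becomes small relative to generation after a few dozen images.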
Sample Gallery
Prompt: A serene portrait of an elderly man with silver hair and a warm smile, wearing traditional embroidered clothing, sitting in a lush greenhouse surrounded by tropical plants, soft morning sunlight, 8k, highly detailed, realistic
Conversion Workflow Summary
- Export: Components exported from PyTorch/Diffusers to ONNX with LoRA fusion.
- Graph Surgery: Modified U-Net to expose the 1280-dim time embedding input (/down_blocks.0/resnets.0/act_1/Mul_output_0).
- Quantization: Compiled using Pulsar2 with u16/int8 mixed precision and representative calibration data.
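The node name in the Graph Surgery step ends in act_1/Mul_output_0, which is consistent with the output of a SiLU activation (x * sigmoid(x)) applied to the time embedding. Assuming the standard Stable Diffusion 1.5 U-Net layout in Diffusers (a 320-dim sinusoidal timestep projection followed by a two-layer MLP to 1280 dims), the host would compute that vector itself and feed it to the compiled U-Net. The sketch below illustrates this reading in plain Python. All function names are hypothetical, the MLP weights are placeholders that would come from the checkpoint, and the script in this repository may prepare the input differently (for example, from precomputed values).

```python
import math


def timestep_sinusoid(t, dim=320, max_period=10000.0):
    """Sinusoidal timestep features in [cos, sin] order.

    Assumption: this mirrors the SD 1.5 convention in Diffusers
    (cosine half first, no frequency shift).
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    args = [t * f for f in freqs]
    return [math.cos(a) for a in args] + [math.sin(a) for a in args]


def silu(x):
    """SiLU activation, x * sigmoid(x), applied elementwise."""
    return [v / (1.0 + math.exp(-v)) for v in x]


def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one row per output feature."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def time_embedding_input(t, w1, b1, w2, b2, dim=320):
    """Assumed host-side computation of the exposed U-Net input.

    w1/b1 (dim -> 1280) and w2/b2 (1280 -> 1280) are placeholders for the
    time-embedding MLP weights of the checkpoint.
    """
    h = linear(timestep_sinusoid(t, dim), w1, b1)
    h = linear(silu(h), w2, b2)
    return silu(h)  # the activation whose output the node name suggests


if __name__ == "__main__":
    # Toy dimensions so the example runs instantly; real sizes are 320 and 1280.
    w1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    w2 = [[1.0, 0.0], [0.0, 1.0]]
    zero_bias = [0.0, 0.0]
    print(time_embedding_input(0, w1, zero_bias, w2, zero_bias, dim=4))
```

Because the vector depends only on the timestep, it can be computed once per step value and reused for every image.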
Usage Instructions
To run this model on Axera hardware, you will need the axengine Python library.
1. Install Dependencies
pip install axengine transformers torch pillow numpy
2. Run Inference
Use the provided sample_inference.py script:
python sample_inference.py --prompt "A beautiful landscape, oil painting"
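To generate images for several prompts, the command above can be wrapped in a small driver. This is a sketch that relies only on the --prompt flag shown above. The helper names are hypothetical, and it assumes the script is in the current directory. How sample_inference.py names its output file is not documented here, so successive runs may overwrite each other.

```python
import subprocess
import sys


def build_command(prompt, script="sample_inference.py"):
    """Build the argument list for one run; only --prompt is known to exist."""
    return [sys.executable, script, "--prompt", prompt]


def run_prompts(prompts, script="sample_inference.py"):
    """Run the sample script once per prompt, stopping at the first failure."""
    for prompt in prompts:
        subprocess.run(build_command(prompt, script), check=True)


if __name__ == "__main__":
    run_prompts([
        "A beautiful landscape, oil painting",
        "A lighthouse on a cliff at sunset, realistic",
    ])
```

Each invocation starts a new process, so it presumably pays the model loading cost again. For larger batches, keeping the models loaded inside a single process would avoid that overhead.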
Credits and Citations
This work is made possible by the following upstream projects:
- Realistic Vision V6.0 B1: Created by SG161222.
- LCM-LoRA: Developed by latent-consistency.
- Stable Diffusion 1.5: Developed by CompVis and Stability AI.
- Conversion Tools: The AXERA-TECH/sd1.5-lcm.axera project (forked from BUG1989).
Optimization & Compilation: The artifacts in this repository were compiled and optimized for the Axera NPU using the Pulsar2 toolchain.
Additional Resources
- Project Repository: AXERA-TECH/sd1.5-lcm.axera - Tools and scripts for Axera SD1.5 LCM deployment.
- Toolchain Documentation: Axera Pulsar2 Docs - Official documentation for the Pulsar2 compilation toolchain.
- Hardware Guide: M5Stack LLM-8850 Card - Specifications and quick start for the AX650N-based M.2 card.
Licensing and Restrictions
This model is subject to the CreativeML Open RAIL-M license. Please review the license for usage restrictions and safety guidelines.