Realistic Vision V6.0 B1 (Axera Optimized)

This repository contains an optimized derivative of Realistic Vision V6.0 B1, fused with LCM-LoRA for high-performance inference on Axera hardware (AX650N / LLM-8850).

Note: This repository compiles existing open-source work. The maintainer did not create the base models or the conversion tools; their contribution is the compilation, graph surgery, and quantization for the Axera NPU platform.

Model Description

Base Model: Realistic Vision V6.0 B1 (noVAE)
VAE: sd-vae-ft-mse
Acceleration: LCM-LoRA (Fused)
Target Hardware: Axera AX650N / LLM-8850 (NPU3)
Primary Hardware Targets:
- Raspberry Pi 5 (via Axera M.2 Accelerator Card)
- M5Stack LLM-8850 Card
Format: Axera .axmodel (compiled via Pulsar2 v5.1)
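
The "(Fused)" label on the LCM-LoRA entry means the low-rank update has been merged into the base weights, so no separate adapter runs at inference time. As a rough illustration only (a minimal NumPy sketch with toy shapes and a hypothetical `fuse_lora` helper, not the conversion script used for this repository), merging the update `B @ A` into a weight matrix gives the same output as running the base layer and the adapter side by side:

```python
import numpy as np

def fuse_lora(weight, lora_a, lora_b, scale=1.0):
    """Return W + scale * (B @ A): the weight with the low-rank update merged in."""
    return weight + scale * (lora_b @ lora_a)

rng = np.random.default_rng(0)
out_dim, in_dim, rank = 8, 6, 2              # toy sizes, chosen only for this demo
weight = rng.normal(size=(out_dim, in_dim))  # base layer weight, (out, in) layout
lora_a = rng.normal(size=(rank, in_dim))     # LoRA "down" projection
lora_b = rng.normal(size=(out_dim, rank))    # LoRA "up" projection
x = rng.normal(size=(3, in_dim))             # a small batch of inputs

# Base layer plus a separate adapter path.
unfused = x @ weight.T + (x @ lora_a.T) @ lora_b.T
# Single layer with the adapter folded into the weight.
fused = x @ fuse_lora(weight, lora_a, lora_b).T

assert np.allclose(unfused, fused)
```

Because the fused layer is an ordinary dense layer, the exported graph needs no LoRA-specific operators.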

Performance Metrics

Benchmarks performed on a Raspberry Pi 5 (8GB) with an Axera AX650N M.2 Accelerator:

Component	Latency
Text Encoder	~14 ms
U-Net (4 steps)	~1716 ms
VAE Decoder	~936 ms
Total Inference	~2.7 seconds
Model Loading	~15.7 seconds

Note: Inference times are for one 512x512 image generated with 4 steps. The total is the sum of the three component latencies and excludes model loading.
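
As a quick sanity check on the table, the snippet below sums the reported component latencies and derives a per-step U-Net figure. The numbers are copied from the table; the variable names are illustrative only.

```python
# Component latencies in milliseconds, as reported in the table above.
latency_ms = {
    "text_encoder": 14,
    "unet_4_steps": 1716,
    "vae_decoder": 936,
}

total_ms = sum(latency_ms.values())
unet_per_step_ms = latency_ms["unet_4_steps"] / 4
unet_share = latency_ms["unet_4_steps"] / total_ms

print(f"Total inference: {total_ms} ms (~{total_ms / 1000:.1f} s)")  # 2666 ms (~2.7 s)
print(f"U-Net per step:  {unet_per_step_ms:.0f} ms")                 # 429 ms
print(f"U-Net share of total: {unet_share:.0%}")
```

The U-Net accounts for most of the per-image time, so the step count is the main lever on latency.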

Sample Gallery

Prompt: A serene portrait of an elderly man with silver hair and a warm smile, wearing traditional embroidered clothing, sitting in a lush greenhouse surrounded by tropical plants, soft morning sunlight, 8k, highly detailed, realistic

Conversion Workflow Summary

Export: Components exported from PyTorch/Diffusers to ONNX with LoRA fusion.
Graph Surgery: Modified the U-Net so that the 1280-dim time embedding becomes a graph input (at node /down_blocks.0/resnets.0/act_1/Mul_output_0).
Quantization: Compiled using Pulsar2 with u16/int8 mixed precision and representative calibration data.
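
With the time embedding exposed as an input, the host has to supply that 1280-dim vector for each denoising step. The sketch below shows one way it could be computed in NumPy. It rests on assumptions not stated in this repository: that the exposed tensor is the SiLU-activated output of the standard SD 1.5 timestep MLP (which the node name suggests), and that the sinusoidal embedding uses 320 dimensions with cosine terms first. The function names are hypothetical, and the random weights stand in for the real `time_embedding` layers of the U-Net.

```python
import numpy as np

def sinusoidal_embedding(timestep, dim=320, max_period=10000.0):
    """Sinusoidal timestep embedding, cosine half first (assumed SD 1.5 layout)."""
    half = dim // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half, dtype=np.float32) / half)
    args = np.float32(timestep) * freqs
    return np.concatenate([np.cos(args), np.sin(args)]).astype(np.float32)

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def time_input(timestep, w1, b1, w2, b2):
    """Assumed host-side path: sinusoid -> linear -> SiLU -> linear -> SiLU.

    Weights use (in, out) layout; PyTorch Linear weights would need transposing.
    """
    emb = sinusoidal_embedding(timestep)
    emb = silu(emb @ w1 + b1) @ w2 + b2
    return silu(emb)[None, :]  # add a batch dimension: shape (1, 1280)

# Placeholder weights for demonstration only, not the model's real parameters.
rng = np.random.default_rng(0)
w1 = rng.normal(0.0, 0.02, size=(320, 1280)).astype(np.float32)
b1 = np.zeros(1280, dtype=np.float32)
w2 = rng.normal(0.0, 0.02, size=(1280, 1280)).astype(np.float32)
b2 = np.zeros(1280, dtype=np.float32)

vec = time_input(999, w1, b1, w2, b2)
print(vec.shape)  # (1, 1280)
```

Since a fixed step schedule uses the same timesteps on every run, these vectors could be computed once and cached rather than recomputed per image.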

Usage Instructions

To run this model on Axera hardware, you will need the axengine Python library.

1. Install Dependencies

pip install axengine transformers torch pillow numpy

2. Run Inference

Use the provided sample_inference.py script:

python sample_inference.py --prompt "A beautiful landscape, oil painting"

Credits and Citations

This work is made possible by the following upstream projects:

Realistic Vision V6.0 B1: Created by SG161222.
LCM-LoRA: Developed by latent-consistency.
Stable Diffusion 1.5: Developed by CompVis and Stability AI.
Conversion Tools: The AXERA-TECH/sd1.5-lcm.axera project (forked from BUG1989).

Optimization & Compilation: The artifacts in this repository were compiled and optimized for the Axera NPU using the Pulsar2 toolchain.

Additional Resources

Project Repository: AXERA-TECH/sd1.5-lcm.axera - Tools and scripts for Axera SD1.5 LCM deployment.
Toolchain Documentation: Axera Pulsar2 Docs - Official documentation for the Pulsar2 compilation toolchain.
Hardware Guide: M5Stack LLM-8850 Card - Specifications and quick start for the AX650N-based M.2 card.

Licensing and Restrictions

This model is subject to the CreativeML Open RAIL-M license. Please review the license for usage restrictions and safety guidelines.
