BECoach — Qwen3.5-4B (4-bit MLX)

The first Qwen3.5-4B model quantized for on-device inference on iPhone via mlx-swift.

Built by APEX LEARN for the BECoach iOS app.

Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3.5-4B |
| Quantization | 4-bit (4.503 bits/weight) |
| Framework | MLX (Apple Silicon) |
| Size on disk | ~2.4 GB |
| RAM at runtime | ~2.5 GB |
| License | Apache 2.0 |
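As a sanity check, the on-disk size follows from the parameter count and the effective bits per weight in the table. A back-of-envelope sketch (the ~4.0B parameter count is an assumption based on the base model's name, not stated in this card):

```python
# Estimate quantized model size from the figures in the table above.
params = 4.0e9            # assumed ~4.0B parameters (from "4B" in the model name)
bits_per_weight = 4.503   # effective bits/weight reported above

size_gb = params * bits_per_weight / 8 / 1e9
print(f"{size_gb:.2f} GB")  # ≈ 2.25 GB
```

This lands slightly under the reported ~2.4 GB on disk; the remainder is plausibly tokenizer files, metadata, and tensors kept at higher precision.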

What is BECoach?

BECoach is an on-device AI coaching assistant for iPhone. No cloud. No API calls. Pure edge inference powered by Apple's MLX framework, running directly on the device's GPU via Metal.

Usage (iOS / mlx-swift)

```swift
import MLXLLM
import MLXLMCommon

let modelID = "apexlearn/BECoach-Qwen3.5-4B-4bit-mlx"
let config = ModelConfiguration(id: modelID)
let model = try await LLMModelFactory.shared.loadContainer(configuration: config)
```

Usage (Python / mlx-lm)

```python
from mlx_lm import load, generate

model, tokenizer = load("apexlearn/BECoach-Qwen3.5-4B-4bit-mlx")
response = generate(model, tokenizer, prompt="Hello, how can you help me?")
print(response)
```
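`generate` takes a raw prompt string; for a chat-tuned model you normally format the conversation with the tokenizer's chat template first (e.g. `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`). As an illustration only, this sketch builds the ChatML-style string that Qwen-family templates conventionally produce, by hand:

```python
# Illustrative only: Qwen-family chat templates use ChatML-style markers.
# In real code, prefer tokenizer.apply_chat_template(...) so the template
# shipped with the model is used verbatim.
def to_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant")  # cue the model to respond
    return "\n".join(parts) + "\n"

prompt = to_chatml([{"role": "user", "content": "Hello, how can you help me?"}])
# response = generate(model, tokenizer, prompt=prompt)
```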

Device Compatibility

| Device | RAM | Status |
|---|---|---|
| iPhone 16 Pro / Pro Max | 8 GB | ✅ Recommended |
| iPhone 15 Pro / Pro Max | 8 GB | ✅ Supported |
| iPhone 15 / 16 (standard) | 6 GB | ⚠️ May hit memory pressure |
| iPhone 14 and earlier | ≤6 GB | ❌ Not recommended |
| iPad Pro (M1 and later) | 8–16 GB | ✅ Works well |
| Mac (Apple Silicon) | Any | ✅ Full performance |

Conversion

Converted from Qwen/Qwen3.5-4B using mlx-lm:

```bash
python3 -m mlx_lm.convert \
  --hf-path Qwen/Qwen3.5-4B \
  --mlx-path ./BECoach-Qwen3.5-4B-4bit-mlx \
  --quantize \
  --q-bits 4
```
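The reported 4.503 bits/weight is higher than a flat 4 because mlx-lm's affine quantization stores a per-group scale and bias alongside the 4-bit values. A sketch of the arithmetic, assuming the mlx-lm default group size of 64 and fp16 scale/bias (neither is stated in this card):

```python
# Effective bits per weight under grouped affine quantization.
group_size = 64   # assumed mlx-lm default (--q-group-size)
q_bits = 4        # from --q-bits 4
scale_bits = 16   # one fp16 scale per group (assumption)
bias_bits = 16    # one fp16 bias per group (assumption)

effective = (group_size * q_bits + scale_bits + bias_bits) / group_size
print(effective)  # 4.5
```

Under these assumptions the overhead works out to exactly 4.5 bits/weight; the small remaining gap to 4.503 would come from tensors left unquantized.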

License

Apache 2.0 — same as the base Qwen3.5 model.
