---
license: apache-2.0
language:
- en
base_model: Menlo/Lucy-128k
pipeline_tag: text-generation
library_name: mlx
tags:
- mlx
---

# Lucy-128k-dwq6-mlx

175 tok/sec on an M4 Mac.

## Performance evaluation

```bash
21194/21194 [31:47<00:00, 11.11it/s]
arc_challenge  acc 0.34, norm 0.35, stderr 0.014
arc_easy       acc 0.46, norm 0.39, stderr 0.010
boolq          acc 0.62, norm 0.62, stderr 0.008
hellaswag      acc 0.44, norm 0.53, stderr 0.004
openbookqa     acc 0.23, norm 0.38, stderr 0.021
piqa           acc 0.70, norm 0.69, stderr 0.010
winogrande     acc 0.57, norm 0.57, stderr 0.013
```

## Performance evaluation of the base model

```bash
21194/21194 [39:58<00:00, 8.84it/s]
arc_challenge  acc 0.34, norm 0.35, stderr 0.013
arc_easy       acc 0.46, norm 0.39, stderr 0.010
boolq          acc 0.62, norm 0.62, stderr 0.008
hellaswag      acc 0.44, norm 0.53, stderr 0.004
openbookqa     acc 0.23, norm 0.39, stderr 0.021
piqa           acc 0.70, norm 0.69, stderr 0.010
winogrande     acc 0.56, norm 0.55, stderr 0.013
```

The quantized model's scores track the base model's to within the reported standard errors, while evaluation throughput rises from 8.84 it/s to 11.11 it/s.

This model [Lucy-128k-dwq6-mlx](https://huggingface.co/Lucy-128k-dwq6-mlx) was converted to MLX format from [Menlo/Lucy-128k](https://huggingface.co/Menlo/Lucy-128k) using mlx-lm version **0.26.0**.

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from a local path or Hub repo id.
model, tokenizer = load("Lucy-128k-dwq6-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
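The model can also be run from the command line with the `mlx_lm.generate` entry point installed by mlx-lm. A minimal sketch, assuming the same model path as in the Python example above:

```bash
# Generate from the command line; --model accepts a local path or a Hub repo id.
mlx_lm.generate --model Lucy-128k-dwq6-mlx --prompt "hello" --max-tokens 256
```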
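For reference, a plain conversion of the base model can be reproduced with `mlx_lm.convert`. This is only a sketch assuming straightforward 6-bit quantization; the exact DWQ (distilled weight quantization) recipe used to produce this checkpoint is not documented in this card.

```bash
# Sketch: convert the base model to MLX format with 6-bit quantization.
# The published repo was produced with a DWQ recipe, which this command does not reproduce.
mlx_lm.convert --hf-path Menlo/Lucy-128k --mlx-path Lucy-128k-dwq6-mlx -q --q-bits 6
```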