---
license: apache-2.0
language:
- en
base_model: Menlo/Lucy-128k
pipeline_tag: text-generation
library_name: mlx
tags:
- mlx
---

# Lucy-128k-dwq6-mlx

175 tok/sec on an M4 Mac.

## Performance evaluation

```bash
21194/21194 [31:47<00:00, 11.11it/s]
arc_challenge  acc 0.34, norm 0.35, stderr 0.014
arc_easy       acc 0.46, norm 0.39, stderr 0.010
boolq          acc 0.62, norm 0.62, stderr 0.008
hellaswag      acc 0.44, norm 0.53, stderr 0.004
openbookqa     acc 0.23, norm 0.38, stderr 0.021
piqa           acc 0.70, norm 0.69, stderr 0.010
winogrande     acc 0.57, norm 0.57, stderr 0.013
```

## Performance evaluation of the base model

```bash
21194/21194 [39:58<00:00, 8.84it/s]
arc_challenge  acc 0.34, norm 0.35, stderr 0.013
arc_easy       acc 0.46, norm 0.39, stderr 0.010
boolq          acc 0.62, norm 0.62, stderr 0.008
hellaswag      acc 0.44, norm 0.53, stderr 0.004
openbookqa     acc 0.23, norm 0.39, stderr 0.021
piqa           acc 0.70, norm 0.69, stderr 0.010
winogrande     acc 0.56, norm 0.55, stderr 0.013
```

The quantized model's scores track the base model's to within the reported standard errors, while evaluation throughput rises from 8.84 it/s to 11.11 it/s.

This model [Lucy-128k-dwq6-mlx](https://huggingface.co/Lucy-128k-dwq6-mlx) was converted to MLX format from [Menlo/Lucy-128k](https://huggingface.co/Menlo/Lucy-128k) using mlx-lm version **0.26.0**.

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from a local path or Hub repo id.
model, tokenizer = load("Lucy-128k-dwq6-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
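The model can also be run from the command line with the `mlx_lm.generate` entry point installed by mlx-lm. A minimal sketch, assuming the same model path as in the Python example above:

```bash
# Generate from the command line; --model accepts a local path or a Hub repo id.
mlx_lm.generate --model Lucy-128k-dwq6-mlx --prompt "hello" --max-tokens 256
```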
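For reference, a plain conversion of the base model can be reproduced with `mlx_lm.convert`. This is only a sketch assuming straightforward 6-bit quantization; the exact DWQ (distilled weight quantization) recipe used to produce this checkpoint is not documented in this card.

```bash
# Sketch: convert the base model to MLX format with 6-bit quantization.
# The published repo was produced with a DWQ recipe, which this command does not reproduce.
mlx_lm.convert --hf-path Menlo/Lucy-128k --mlx-path Lucy-128k-dwq6-mlx -q --q-bits 6
```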