Improve model card: Add metadata, links, and correct formatting
This PR enhances the model card for `MetaStone-S1-32B` by:
- Adding `pipeline_tag: text-generation` to the metadata, improving discoverability on the Hugging Face Hub for text generation models.
- Including `library_name: transformers` to ensure proper integration with Hugging Face Hub features, such as the "how to use" widget.
- Adding relevant `tags` (`qwen2`, `math`, `code`, `reasoning`) to better categorize the model's capabilities.
- Adding a clear top-level title for readability.
- Prominently featuring links to the paper (Hugging Face Papers page), the official project website, and the GitHub repository at the top of the model card.
- Correcting the redundant `## Model` heading.
README.md (changed):

```diff
@@ -1,6 +1,18 @@
 ---
 license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- qwen2
+- math
+- code
+- reasoning
 ---
+
+# MetaStone-S1: Test-Time Scaling with Reflective Generative Model
+
+📚 [Paper](https://huggingface.co/papers/2507.01951) | 🌐 [Project Page](https://www.wenxiaobai.com/) | 💻 [GitHub Repository](https://github.com/MetaStone-AI/MetaStone-S1)
+
 ## Introduction
 We release our first reflective generative model: MetaStone-S1.
 With only 32B parameters, MetaStone-S1 performs comparably to the OpenAI-o3 series on mathematics, coding, and Chinese reasoning tasks.
@@ -12,7 +24,7 @@ By sharing the backbone network between the PRMs and policy models, MetaStone‑
 
 <img src="./figures/intro.jpg" alt="Introduction" width="800">
 
-This repo contains the training and evaluation code of MetaStone-S1. For full details please refer to our [paper](https://
+This repo contains the training and evaluation code of MetaStone-S1. For full details please refer to our [paper](https://huggingface.co/papers/2507.01951) and [our official website](https://www.wenxiaobai.com/).
 
 
 ## Performance
@@ -36,8 +48,6 @@ Since the base model used for this repo is QwQ-32B, we chose the contemporary De
 | **MetaStone-S1-32B-high** | **85.2** | <ins>73.6</ins> | 64.2 | <ins>89.7</ins> |
 
 
-## Model
-
 ## Model
 
 We save the parameters of the policy model and the SPRM head into two files:
```