Update README.md
README.md CHANGED
@@ -114,7 +114,17 @@ model-index:
 ---
 # orca_mini_3b
 
-
+<img src="https://huggingface.co/pankajmathur/orca_mini_v5_8b/resolve/main/orca_minis_small.jpeg" width="auto" />
+
+<strong>
+Passionate about Generative AI? I help companies to privately train and deploy custom LLM/MLLM affordably. For startups, I can even assist with securing GPU grants to get you started. Let's chat!
+
+<a href="https://www.linkedin.com/in/pankajam" target="_blank">https://www.linkedin.com/in/pankajam</a> Looking forward to connecting!
+</strong>
+
+<br>
+
+**Use orca-mini-3b for Free on Google Colab with T4 GPU :)**
 
 <a target="_blank" href="https://colab.research.google.com/#fileId=https://huggingface.co/psmathur/orca_mini_3b/blob/main/orca_mini_3b_T4_GPU.ipynb">
 <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
@@ -123,7 +133,7 @@ Use orca-mini-3b on Free Google Colab with T4 GPU :)
 An [OpenLLaMa-3B model](https://github.com/openlm-research/open_llama) model trained on explain tuned datasets, created using Instructions and Input from WizardLM, Alpaca & Dolly-V2 datasets and applying Orca Research Paper dataset construction approaches.
 
 
-
+### Dataset
 
 We build explain tuned [WizardLM dataset ~70K](https://github.com/nlpxucan/WizardLM), [Alpaca dataset ~52K](https://crfm.stanford.edu/2023/03/13/alpaca.html) & [Dolly-V2 dataset ~15K](https://github.com/databrickslabs/dolly) created using approaches from [Orca Research Paper](https://arxiv.org/abs/2306.02707).
 
@@ -134,7 +144,7 @@ This helps student model aka this model to learn ***thought*** process from teac
 Please see below example usage how the **System** prompt is added before each **instruction**.
 
 
-
+### Training
 
 The training configurations are provided in the table below.
 
@@ -156,7 +166,7 @@ Here are some of params used during training:
 
 
 
-
+### Example Usage
 
 Below shows an example on how to use this model
 
@@ -230,8 +240,6 @@ Sincerely,
 ```
 
 
-**P.S. I am #opentowork and #collaboration, if you can help, please reach out to me at www.linkedin.com/in/pankajam**
-
 
 Next Goals:
 1) Try more data like actually using FLAN-v2, just like Orka Research Paper (I am open for suggestions)
@@ -304,7 +312,7 @@ If you found wizardlm_alpaca_dolly_orca_open_llama_3b useful in your research or
 howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
 }
 ```
-
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_3b)
 
 | Metric | Value |
@@ -318,7 +326,7 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 | GSM8K (5-shot) | 0.08 |
 | DROP (3-shot) | 14.33 |
 
-
+### [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psmathur__orca_mini_3b)
 
 | Metric |Value|