Update README.md
README.md (CHANGED)

@@ -4,16 +4,16 @@ emoji: π
 colorFrom: blue
 colorTo: purple
 sdk: gradio
-sdk_version: 5.
+sdk_version: 5.9.1
 app_file: app.py
-pinned:
+pinned: true
 license: mit
 short_description: 'Compact LLM Battle Arena: Frugal AI Face-Off!'
 ---

 # π GPU-Poor LLM Gladiator Arena π

-Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at
+Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at 14B parameters) against each other in a battle of wits and words.

 ## π€ Starting from "Why?"

@@ -31,10 +31,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 ## π Features

 - **Battle Arena**: Pit two mystery models against each other and decide which pint-sized powerhouse reigns supreme.
+- **Dynamic Model Management**: Models list is managed remotely, allowing for easy updates without code changes.
 - **Leaderboard**: Track the performance of different models over time using an improved scoring system.
 - **Performance Chart**: Visualize model performance with interactive charts.
 - **Privacy-Focused**: Uses local Ollama API, avoiding pricey commercial APIs and keeping data close to home.
-- **
+- **Model Suggestions**: Users can suggest new models to be added to the arena.

 ## π Getting Started

@@ -43,7 +44,9 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 - Python 3.7+
 - Gradio
 - Plotly
--
+- OpenAI Python library (for API compatibility)
+- Nextcloud Python API
+- Ollama (running via OpenAI compatible API wrapper)

 ### Installation

@@ -55,7 +58,7 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some

 2. Install the required packages:
 ```
-pip install gradio plotly
+pip install gradio plotly openai nc_py_api
 ```

 3. Ensure Ollama is running locally or via a remote server.
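Step 3 above, together with the "OpenAI Python library (for API compatibility)" and "Ollama (running via OpenAI compatible API wrapper)" prerequisites, implies the app talks to Ollama through its OpenAI-compatible endpoint. A minimal sketch of that pattern follows; the base URL, placeholder API key, and model tag are illustrative assumptions, not values taken from app.py:

```python
# Minimal sketch: querying a local Ollama server through its OpenAI-compatible
# endpoint. base_url, api_key, and the model tag are assumptions for
# illustration; the arena's app.py may configure these differently.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's default OpenAI-compatible endpoint
    api_key="ollama",                      # Ollama ignores the key; any non-empty string works
)

reply = client.chat.completions.create(
    model="smollm2:1.7b",  # hypothetical contender tag
    messages=[{"role": "user", "content": "Summarize the rules of chess in one sentence."}],
)
print(reply.choices[0].message.content)
```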
@@ -71,9 +74,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 2. In the "Battle Arena" tab:
    - Enter a prompt or use the random prompt generator (π² button).
    - Click "Generate Responses" to see outputs from two random models.
-   - Vote for the better response.
+   - Vote for the better response or choose "Tie" to continue the battle.
 3. Check the "Leaderboard" tab to see overall model performance.
 4. View the "Performance Chart" tab for a visual representation of model wins and losses.
+5. Check the "ELO Leaderboard" for an alternative ranking system.
+6. Use the "Suggest Models" tab to propose new models for the arena.

 ## π Configuration

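The battle flow in steps 2-4 above maps naturally onto a Gradio Blocks layout. A rough, self-contained sketch of the idea (the model names and the response function are placeholders; the real app.py wires these to the Ollama client and the leaderboard storage):

```python
# Rough sketch of a battle-style tab in Gradio (illustrative only, not the
# arena's actual app.py). Two anonymous models answer the same prompt and the
# user then votes for the better response.
import random
import gradio as gr

MODELS = ["model-a", "model-b", "model-c"]  # placeholder model tags

def generate(prompt: str):
    """Pick two random models and return their (placeholder) responses."""
    a, b = random.sample(MODELS, 2)
    # The real app would call the Ollama chat endpoint once per model here.
    return f"[{a}] answer to: {prompt}", f"[{b}] answer to: {prompt}"

with gr.Blocks() as demo:
    with gr.Tab("Battle Arena"):
        prompt = gr.Textbox(label="Prompt")
        generate_btn = gr.Button("Generate Responses")
        out_a = gr.Textbox(label="Model A")
        out_b = gr.Textbox(label="Model B")
        generate_btn.click(generate, inputs=prompt, outputs=[out_a, out_b])

if __name__ == "__main__":
    demo.launch()
```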
@@ -137,52 +142,24 @@ In addition to the main leaderboard, we also maintain an ELO-based leaderboard:

 ## π€ Models

-The arena
-
-- LLaMA 3.
--
--
--
--
--
-
-
-
-
-
-
--
--
--
--
-- Granite 3 MoE (3B)
-- Ministral (8B)
-- Dolphin 2.9.4 (8B)
-- Yi v1.5 (6B)
-- Yi v1.5 (9B)
-- Mistral Nemo (12B)
-- GLM4 (9B)
-- InternLM2 v2.5 (7B)
-- Falcon2 (11B)
-- StableLM2 (1.6B)
-- StableLM2 (12B)
-- Solar (10.7B)
-- Rombos Qwen (7B)
-- Rombos Qwen (1.5B)
-- Aya Expanse (8B)
-- SmolLM2 (1.7B)
-- TinyLLama (1.1B)
-- Pints (1.57B)
-- OLMoE (7B)
-- Llama 3.2 Uncensored (3B)
-- Llama 3.1 Hawkish (8B)
-- Humanish Llama 3 (8B)
-- Nemotron Mini (4B)
-- Teuken (7B)
-- Llama 3.1 Sauerkraut (8B)
-- Llama 3.1 SuperNova Lite (8B)
-- EuroLLM (9B)
-- Intellect-1 (10B)
+The arena supports a dynamic list of models that is updated regularly. The current list includes models from various families such as:
+
+- LLaMA 3.x series (1B to 8B)
+- Gemma 2 (2B and 9B)
+- Qwen 2.5 (0.5B to 7B)
+- Mistral and variants
+- Yi models
+- And many more!
+
+For the complete and current list of models, check the arena's leaderboard.
+
+## π Technical Details
+
+The project uses:
+- Nextcloud for remote storage of models list and leaderboard data
+- OpenAI-compatible API interface for model interactions
+- Background thread for periodic model list updates
+- ELO rating system with size-based adjustments

 ## π€ Contributing

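The "Nextcloud for remote storage of models list" and "background thread for periodic model list updates" bullets in the new Technical Details section combine roughly as sketched below. This is a hedged illustration: the Nextcloud URL, credentials, remote path, and refresh interval are made-up placeholders, and the real app may structure this differently.

```python
# Hedged sketch of a remotely managed models list: download models.json from
# Nextcloud via nc_py_api and refresh it periodically in a daemon thread.
# The URL, credentials, remote path, and interval are placeholders.
import json
import threading
import time

from nc_py_api import Nextcloud

nc = Nextcloud(
    nextcloud_url="https://cloud.example.com",  # placeholder instance
    nc_auth_user="arena-bot",                   # placeholder credentials
    nc_auth_pass="app-password",
)

models = []  # list of dicts describing the current contenders

def refresh_models() -> None:
    """Fetch the current models list from Nextcloud (download returns bytes)."""
    global models
    models = json.loads(nc.files.download("arena/models.json"))

def refresh_loop(interval_s: int = 3600) -> None:
    """Background loop: re-fetch the models list once per interval."""
    while True:
        time.sleep(interval_s)
        refresh_models()

refresh_models()  # initial load at startup
threading.Thread(target=refresh_loop, daemon=True).start()
```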
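The "ELO rating system with size-based adjustments" bullet leaves the exact formula open. One plausible reading, shown purely as an illustration (the arena's actual weighting may differ), is a standard ELO update whose K-factor is scaled by the parameter-count ratio, so a small model gains more for beating a large one:

```python
# Illustrative ELO update with a size-based K-factor adjustment. This is an
# assumption about what "size-based adjustments" could mean, not the arena's
# confirmed formula.
def expected_score(r_a: float, r_b: float) -> float:
    """Standard ELO expected score for player A."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(r_a, r_b, score_a, size_a_bn, size_b_bn, k=32.0):
    """Return new ratings; score_a is 1.0 (A wins), 0.5 (tie) or 0.0 (A loses).

    size_a_bn / size_b_bn are parameter counts in billions; the smaller model
    gets a larger effective K, so upsets against bigger models pay more.
    """
    e_a = expected_score(r_a, r_b)
    k_a = k * max(1.0, size_b_bn / size_a_bn)
    k_b = k * max(1.0, size_a_bn / size_b_bn)
    new_a = r_a + k_a * (score_a - e_a)
    new_b = r_b + k_b * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# Example: a 1.7B model rated 1200 beats an 8B model rated 1250.
print(update_elo(1200.0, 1250.0, 1.0, 1.7, 8.0))
```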