Instructions for using davidkim205/Rhea-72b-v0.5 with libraries and local apps.

Libraries

How to use davidkim205/Rhea-72b-v0.5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="davidkim205/Rhea-72b-v0.5")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("davidkim205/Rhea-72b-v0.5")
model = AutoModelForCausalLM.from_pretrained("davidkim205/Rhea-72b-v0.5")

Local Apps

vLLM

How to use davidkim205/Rhea-72b-v0.5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "davidkim205/Rhea-72b-v0.5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "davidkim205/Rhea-72b-v0.5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
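
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`. The helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM; the response is read using the OpenAI completions layout (`choices[0].text`).

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="davidkim205/Rhea-72b-v0.5",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first returned choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call (needs a running server, so it is left commented out):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should work the same way against a server that exposes the same API on port 30000.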

Use Docker

docker model run hf.co/davidkim205/Rhea-72b-v0.5

SGLang

How to use davidkim205/Rhea-72b-v0.5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "davidkim205/Rhea-72b-v0.5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "davidkim205/Rhea-72b-v0.5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "davidkim205/Rhea-72b-v0.5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "davidkim205/Rhea-72b-v0.5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use davidkim205/Rhea-72b-v0.5 with Docker Model Runner:
```
docker model run hf.co/davidkim205/Rhea-72b-v0.5
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Rhea-72b-v0.5

The Rhea project researches learning methods for improving LLM performance. We fine-tuned an existing model using the nox framework. We built the SFT dataset from currently available open datasets, and created the DPO dataset using SGD (Self-Generated Dataset creation method for DPO learning).

Our model ranked first on HuggingFace's Open LLM leaderboard.

SGD : A Study on Self-Generated Dataset creation method for DPO Learning

SGD is a novel method for generating datasets for DPO (Direct Preference Optimization) training. Sentences generated by the model are compared with the correct answers from an existing dataset, and the cases where the model's output does not match the correct answer are added to the DPO dataset. This lets the model create its own training data, which in turn improves the performance of DPO-trained models.
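
The comparison step can be sketched roughly as follows. This is an illustration, not the nox implementation: the function names, the `prompt`/`answer` field names, the whitespace and case normalization, and the choice to store the reference answer as `chosen` and the mismatching model output as `rejected` are all assumptions.

```python
from typing import Callable, Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def build_self_generated_pairs(
    examples: Iterable[Dict[str, str]],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Collect preference pairs from prompts the model answers incorrectly.

    Assumption: the reference answer becomes "chosen" and the model's
    mismatching output becomes "rejected".
    """
    pairs = []
    for example in examples:
        prediction = generate(example["prompt"])
        # Keep only the cases where the model disagrees with the reference.
        if normalize(prediction) != normalize(example["answer"]):
            pairs.append({
                "prompt": example["prompt"],
                "chosen": example["answer"],
                "rejected": prediction,
            })
    return pairs
```

Here `generate` stands for any callable that maps a prompt to the model's output, so the sketch can be tried with a stub before wiring in a real model.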

Model Details

Model Developers : davidkim (changyeon kim)
Repository : https://github.com/davidkim205/nox
base model : abacusai/Smaug-72B-v0.1
sft dataset : datasets_enconv_4m
dpo dataset : datasets_encomp_151k

sft dataset info : datasets_enconv_4m

100k random shuffle datasets

stack-exchange-preferences
SlimOrca
alpaca-gpt4
SHP
HC3
databricks-dolly-15k
orca-dpo-pairs
us-stockname
OpenHermes2.5-dpo-binarized-alpha
distilabel-math-preference-dpo
Neural-DPO
truthy-dpo-v0.1
distilabel-capybara-dpo-7k-binarized
us-sentiment
contextual-dpo-v0.1

1k random shuffle datasets

bigbench
glue_mnli
glue_qqp
xnli
codexglue_code2text_go
trivia_qa
medmcqa
hendrycks_ethics
super_glue_record
glue_qnli
anli_r3
swag
squad_v2
nq_open
drop
glue_sst2
blimp
paws-x
unscramble
anli_r2
babi
math_qa
social_i_qa
piqa
arithmetic
anli_r1
prost
sciq
mc_taco
medqa
super_glue_boolq
hendrycks_math
lambada
toxigen-data
glue_cola
pubmed_qa
logiqa
mutual
headqa
bbh
super_glue_wic
openbookqa
glue_mrpc
web_questions
qasper
super_glue_multirc
story_cloze
super_glue_rte
glue_rte
race
xwinograd
asdiv
xstory_cloze
crows_pairs_multilingual
belebele
glue_wnli
super_glue_wsc
coqa
super_glue_copa
super_glue_cb
winograd_wsc
mgsm
scrolls_contract_nli

Any dataset listed above that cannot be found publicly is internal company data and cannot be released.
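
One possible reading of the "100k" and "1k" labels above is that each source is shuffled and capped at that many examples before the sources are mixed. The sketch below follows that reading only; the function name `sample_per_source`, the cap semantics, and the fixed seed are assumptions, not details confirmed by the authors.

```python
import random
from typing import Dict, List, Sequence


def sample_per_source(sources: Dict[str, Sequence], cap: int, seed: int = 0) -> List:
    """Shuffle each source, keep at most `cap` rows from it, then mix everything."""
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = []
    for name in sorted(sources):
        rows = list(sources[name])
        rng.shuffle(rows)
        mixed.extend(rows[:cap])
    rng.shuffle(mixed)
    return mixed
```

Under this reading, the first group would be processed with `cap=100_000` and the second with `cap=1_000`.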

dpo dataset info : datasets_encomp_151k

We randomly selected data from each category of the training dataset, then constructed the DPO (Direct Preference Optimization) dataset from the model-generated sentences whose logits were lower than the mean.

We are sorry, but this dataset cannot be released.
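
The below-the-mean selection could look something like the following sketch. It assumes each generated sentence has already been reduced to a single scalar score (for example an average token logit); how the authors actually compute that score is not stated, and the `score` field and the function name are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def select_below_mean(samples: List[Dict], score_key: str = "score") -> List[Dict]:
    """Keep the generated samples whose score is strictly below the mean score."""
    if not samples:
        return []
    threshold = mean([sample[score_key] for sample in samples])
    return [sample for sample in samples if sample[score_key] < threshold]
```

Samples that fall exactly on the mean are excluded, which matches "lower than the mean" read strictly.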

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	81.22
AI2 Reasoning Challenge (25-Shot)	79.78
HellaSwag (10-Shot)	91.15
MMLU (5-Shot)	77.95
TruthfulQA (0-shot)	74.50
Winogrande (5-shot)	87.85
GSM8k (5-shot)	76.12
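
As a quick consistency check, the Avg. row appears to be the plain arithmetic mean of the six benchmark scores. The snippet below recomputes it from the values in the table; the unweighted-mean formula is an assumption about how the leaderboard aggregates.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 79.78,
    "HellaSwag (10-Shot)": 91.15,
    "MMLU (5-Shot)": 77.95,
    "TruthfulQA (0-shot)": 74.50,
    "Winogrande (5-shot)": 87.85,
    "GSM8k (5-shot)": 76.12,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Recomputed average: {average:.3f}")
```

The six scores sum to 487.35, so the mean is about 81.225, which agrees with the reported 81.22 to within rounding.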


Safetensors

Model size : 72B params
Tensor type : BF16
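
These two figures give a rough lower bound on the memory needed to load the model. The sketch below is back-of-the-envelope arithmetic only: it treats "72B" as exactly 72e9 parameters (a rounded figure), uses 2 bytes per BF16 value, and ignores the KV cache, activations, and framework overhead. The function name is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


BF16_BYTES = 2  # bfloat16 stores each value in 16 bits
rhea_gib = weight_memory_gib(72e9, BF16_BYTES)
print(f"Approximate BF16 weight footprint: {rhea_gib:.1f} GiB")
```

At roughly 134 GiB for the weights alone, the model will generally not fit on a single GPU, so serving it usually means sharding it across several devices or using a quantized variant.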


Evaluation results

Source: Open LLM Leaderboard

Benchmark	Split	Metric	Value
AI2 Reasoning Challenge (25-Shot)	test	normalized accuracy	79.780
HellaSwag (10-Shot)	validation	normalized accuracy	91.150
MMLU (5-Shot)	test	accuracy	77.950
TruthfulQA (0-shot)	validation	mc2	74.500
Winogrande (5-shot)	validation	accuracy	87.850
GSM8k (5-shot)	test	accuracy	76.120