update model
- README.md +7 -5
- adapter_model.bin +2 -2
README.md
CHANGED
@@ -21,25 +21,27 @@ and [databricks-dolly-15k](https://github.com/databrickslabs/dolly/tree/master/d
  The code for training the model is provided in our [github](https://github.com/mbzuai-nlp/Bactrian-X), which is adapted from [Alpaca-LoRA](https://github.com/tloen/alpaca-lora).
  This version of the weights was trained with the following hyperparameters:

-
-
+
+ - Epochs: 10
+ - Batch size: 128
  - Cutoff length: 512
  - Learning rate: 3e-4
  - Lora _r_: 64
  - Lora target modules: q_proj, k_proj, v_proj, o_proj

+ #### Current Training Steps: 21000

  That is:

  ```
  python finetune.py \
  --base_model='decapoda-research/llama-7b-hf' \
- --num_epochs=
- --batch_size=
+ --num_epochs=10 \
+ --batch_size=128 \
  --cutoff_len=512 \
  --group_by_length \
  --output_dir='./bactrian-x-7b-lora' \
- --lora_target_modules='
+ --lora_target_modules='q_proj,k_proj,v_proj,o_proj' \
  --lora_r=64 \
  --micro_batch_size=32
  ```
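The flags in the updated command imply a few derived quantities that are easy to check by hand. The sketch below is illustrative only and rests on stated assumptions: that the training script follows the Alpaca-LoRA convention of deriving gradient accumulation as `batch_size // micro_batch_size` on a single process, that the "training steps" figure counts optimizer steps, and that the base model has the published LLaMA-7B shape (hidden size 4096, 32 decoder layers). All variable names are made up for this example.

```python
# Values copied from the README hunk in this commit.
batch_size = 128
micro_batch_size = 32
lora_r = 64
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]
training_steps = 21_000

# Assumption: single process, so the effective batch is reached by
# accumulating gradients over several micro-batches.
grad_accum_steps = batch_size // micro_batch_size

# Assumption: each counted step is one optimizer step over a full batch.
examples_seen = training_steps * batch_size

# Assumption: LLaMA-7B shape (square 4096 x 4096 attention projections).
hidden_size = 4096
num_layers = 32

# LoRA adds two low-rank matrices per adapted projection:
# A with shape (r, d_in) and B with shape (d_out, r).
params_per_module = lora_r * (hidden_size + hidden_size)
lora_params = params_per_module * len(target_modules) * num_layers

# Assumption: adapter weights stored as 4-byte floats.
approx_bytes_fp32 = lora_params * 4

print(grad_accum_steps)   # 4
print(examples_seen)      # 2688000
print(lora_params)        # 67108864
print(approx_bytes_fp32)  # 268435456
```

Under these assumptions the float32 estimate (268,435,456 bytes) lands close to the 268,527,949-byte `adapter_model.bin` recorded in this commit, which is consistent with, though not proof of, a float32 adapter plus a small serialization overhead.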
adapter_model.bin
CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:
- size
+ oid sha256:a79d74d6cfed583c0a176438158a983133a2016ed923d25347e4335d95c7aab8
+ size 268527949
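The `adapter_model.bin` entry is a Git LFS pointer, not the weights themselves: it records the hash algorithm, digest, and byte size of the real file. A minimal sketch of how a downloaded file could be checked against the new pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value.strip()
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large files are never fully loaded.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents as added by this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:a79d74d6cfed583c0a176438158a983133a2016ed923d25347e4335d95c7aab8
size 268527949
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
```

With the weights downloaded to a local path of your choosing, `matches_pointer("adapter_model.bin", pointer)` would report whether the file corresponds to this revision.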