AvaLovelace committed (verified) · Commit aaddae8 · 1 Parent(s): b19550b

Update README.md

Files changed (1):
  1. README.md +12 -16
README.md CHANGED
@@ -26,15 +26,15 @@ pipeline_tag: text-generation
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
 - **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
+- **Language(s) (NLP):** English
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Finetuned from model:** [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)

 ### Model Sources [optional]

 <!-- Provide the basic links for the model. -->

-- **Repository:** [More Information Needed]
+- **Repository:** [AvaLovelace1/LegoGPT](https://github.com/AvaLovelace1/LegoGPT)
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]

@@ -88,22 +88,18 @@ Use the code below to get started with the model.

 ### Training Procedure

-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
-#### Preprocessing [optional]
-
-[More Information Needed]
-
+The model was fine-tuned using LoRA applied to the `q_proj` and `v_proj` matrices, with the AdamW optimizer and a learning rate following cosine decay with warmup.

 #### Training Hyperparameters

-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
-#### Speeds, Sizes, Times [optional]
-
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
-[More Information Needed]
+- **Training regime:** bf16 mixed precision
+- **Epochs:** 3
+- **Global batch size:** 64
+- **Max learning rate:** 0.002
+- **Learning rate warmup steps:** 100
+- **LoRA rank:** 32
+- **LoRA alpha:** 16
+- **LoRA dropout:** 0.05

 ## Evaluation
105