Update README.md
README.md CHANGED
@@ -19,7 +19,7 @@ datasets:

 <!-- Provide a quick summary of what the model is/does. -->

-This model is part of [this](https://
+This model is part of [this](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) publication. It is used for translating chemical synthesis procedures given in natural language (en) to "action graphs", i.e., a simple markup language listing synthesis actions from a pre-defined controlled vocabulary along with the process parameters.

 ## Model Details

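For orientation only: the action-graph format mentioned in the new summary line is not reproduced in this card. The snippet below is a purely hypothetical illustration of the kind of mapping described; the actual controlled vocabulary and markup are defined in the linked publication and repository, so the tags shown here are invented for illustration and are not the model's real output format.

```text
Input (natural language):           Add 50 mg of FeCl3 to 10 mL of water and stir at 60 °C for 30 min.
Output (hypothetical action graph): ADD(FeCl3, 50 mg) -> ADD(water, 10 mL) -> STIR(60 °C, 30 min)
```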
@@ -29,7 +29,7 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f


 - **Developed by:** Bastian Ruehle
-- **Funded by:** [Federal Institute fo Materials Research and Testing (BAM)](www.bam.de)
+- **Funded by:** [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de)
 - **Model type:** LED (Longformer Encoder-Decoder)
 - **Language(s) (NLP):** en
 - **License:** [MIT](https://opensource.org/license/mit)

@@ -40,11 +40,11 @@ The model was fine-tuned on a dataset containing chemical synthesis procedures f
 <!-- Provide the basic links for the model. -->

 - **Repository:** The repository accompanying this model can be found [here](https://github.com/BAMresearch/MAPz_at_BAM/tree/main/Minerva-Workflow-Generator)
-- **Paper:** The papers accompanying this model can be found [here](https://
+- **Paper:** The papers accompanying this model can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) and [here](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504)

 ## Uses

-The model is integrated into a [node editor app](https://
+The model is integrated into a [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) that generates workflows for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504) from synthesis procedures given in natural language.

 ### Direct Use

@@ -56,7 +56,7 @@ Even though it is not the intended way of using the model, it can be used "stand

 <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

-The model was intended to be used with the [node editor app](https://
+The model is intended to be used with the [node editor app](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G) for the Self-Driving Lab platform [Minerva](https://pubs.acs.org/doi/full/10.1021/acsnano.4c17504).

 ### Out-of-Scope Use

@@ -98,9 +98,9 @@ if __name__ == '__main__':
     rawtext = """<Insert your Synthesis Procedure here>"""

     # model_id = 'bruehle/BigBirdPegasus_Llama'
-
+    model_id = 'bruehle/LED-Base-16384_Llama'  # or use any of the other models
     # model_id = 'bruehle/BigBirdPegasus_Chemtagger'
-    model_id = 'bruehle/LED-Base-16384_Chemtagger'
+    # model_id = 'bruehle/LED-Base-16384_Chemtagger'

     if 'BigBirdPegasus' in model_id:
         max_length = 512
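For readers who want to try the checkpoints outside the node editor app, here is a minimal, self-contained sketch of how the snippet above might be completed. The model ids and the 512-token limit for the BigBirdPegasus checkpoints come from the card itself; the use of `AutoTokenizer`/`AutoModelForSeq2SeqLM`, the 16384-token input limit for the LED checkpoints, and the generation settings are assumptions, and the authors' own pre-/postprocessing code lives in the linked repository.

```python
# Hedged sketch (not the authors' reference implementation): load one of the
# fine-tuned checkpoints with Hugging Face transformers and translate a raw
# procedure into an action graph. 512 tokens is stated for the BigBirdPegasus
# models; 16384 for the LED models is inferred from their names.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

if __name__ == '__main__':
    rawtext = """<Insert your Synthesis Procedure here>"""

    model_id = 'bruehle/LED-Base-16384_Llama'  # or use any of the other models

    max_length = 512 if 'BigBirdPegasus' in model_id else 16384

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the procedure, truncating to the model's input limit.
    inputs = tokenizer(rawtext, return_tensors='pt', truncation=True, max_length=max_length)

    # Generate the action graph; the output cap and beam size are illustrative choices.
    output_ids = model.generate(**inputs, max_new_tokens=1024, num_beams=4)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Generation settings may need tuning per checkpoint; the node editor app remains the intended interface.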
@@ -128,7 +128,7 @@ Models were trained on A100-80GB GPUs for 885’225 steps (5 epochs) on the trai

 #### Preprocessing

-More information on data pre- and postprocessing can be found [here](https://
+More information on data pre- and postprocessing can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).


 #### Training Hyperparameters

@@ -145,7 +145,7 @@ More information on data pre- and postprocessing can be found [here](https://che

 <!-- This should link to a Dataset Card if possible. -->

-Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://
+Example outputs for experimental procedures from the domains of materials science, organic chemistry, inorganic chemistry, and a patent that were not part of the training or evaluation dataset can be found [here](https://pubs.rsc.org/en/Content/ArticleLanding/2025/DD/D5DD00063G).

 ## Technical Specifications

@@ -155,7 +155,7 @@ Longformer Encoder-Decoder Model for Text2Text/Seq2Seq Generation.

 ### Compute Infrastructure

-Trained on HPC GPU nodes of the [Federal Institute fo Materials Research and Testing (BAM)](www.bam.de).
+Trained on HPC GPU nodes of the [Federal Institute for Materials Research and Testing (BAM)](https://www.bam.de).

 #### Hardware

@@ -171,13 +171,13 @@ Python 3.12

 **BibTeX:**

-@article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.
+@article{Ruehle_2025, title={Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs}, DOI={10.1039/D5DD00063G}, journal={Digital Discovery}, author={Ruehle, Bastian}, year={2025}}

 @article{doi:10.1021/acsnano.4c17504, author = {Zaki, Mohammad and Prinz, Carsten and Ruehle, Bastian}, title = {A Self-Driving Lab for Nano- and Advanced Materials Synthesis}, journal = {ACS Nano}, volume = {19}, number = {9}, pages = {9029-9041}, year = {2025}, doi = {10.1021/acsnano.4c17504}, note ={PMID: 39995288}, URL = {https://doi.org/10.1021/acsnano.4c17504}, eprint = {https://doi.org/10.1021/acsnano.4c17504}}

 **APA:**

-Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs.
+Ruehle, B. (2025). Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs. Digital Discovery. doi:10.1039/D5DD00063G

 Zaki, M., Prinz, C. & Ruehle, B. (2025). A Self-Driving Lab for Nano- and Advanced Materials Synthesis. ACS Nano, 19(9), 9029-9041. doi:10.1021/acsnano.4c17504