Update README.md
README.md
@@ -35,7 +35,7 @@ This is a <b>DeBERTa</b> <b>[1]</b> uncased model for the <b>Italian</b> languag
 The model is trained to perform entity recognition over 4 classes: <b>PER</b> (persons), <b>LOC</b> (locations), <b>ORG</b> (organizations), <b>MISC</b> (miscellanea, mainly events, products and services). It has been fine-tuned for Named Entity Recognition, using the WikiNER Italian dataset plus an additional custom dataset of manually annotated Wikipedia paragraphs.
 The WikiNER dataset has been split into 102,352 training instances and 25,588 test instances, and the model has been trained for 1 epoch with a constant learning rate of 1e-5.
 
-The model has been first fine-tuned on WikiNER, then focused on the Italian language and converted to uncased by modifying the embedding layer (as in [3], computing document-level frequencies over the Wikipedia dataset), and lastly fine-tuned on an additional
+The model has been first fine-tuned on WikiNER, then focused on the Italian language and converted to uncased by modifying the embedding layer (as in [3], computing document-level frequencies over the Wikipedia dataset), and lastly fine-tuned on an additional dataset of ~3,500 manually annotated lowercase paragraphs.
 
 <h3>Quick usage</h3>
 