Update README.md
README.md
CHANGED
@@ -31,6 +31,12 @@ The TweeTaal-en-nl model has been fine-tuned on Dutch-English & English-Dutch tr
 - Cross-lingual communication tools
 - Educational language learning applications
 
+## Performance
+
+### Benchmark Results
+
+<img src="https://github.com/OpenOranje/content/raw/main/images/translation-benchmarks.png" alt="Benchmarks" width="700">
+
 ## Training Details
 
 ### Training Procedure
@@ -91,25 +97,6 @@ Recommended generation parameters:
 - **Max tokens**: Set based on expected translation length
 - **Top-p**: 0.9 (nucleus sampling)
 
-## Performance
-
-### Benchmark Results
-
-##### WMT-2024 Translations (Dutch-English)
-
-| Metric | WMT-24 (Finetuned)| WMT-24 (Base)|
-|--------|-------------------| -------------|
-| BLEU | 46.3 | 32.1 |
-| Rouge | 73.1 | 58.3 |
-
-##### Long Context SQuAD Translations (Dutch-English)
-
-| Metric | TweeTaal | Base |
-|--------|-------- | -----|
-| BLEU | 58.3 | |
-| Rouge | 77.9 | |
-
-
 ## Limitations
 