---
model-index:
- name: layoutlmv3-base-finetuned-rvlcdip
  results:
  - task:
      type: document-image-classification
      name: document-image-classification
    dataset:
      name: rvl-cdip
      type: amazon-ocr
    metrics:
    - type: evaluation_loss
      value: 0.1856316477060318
      name: Evaluation Loss
    - type: accuracy
      value: 0.9519237980949524
      name: Evaluation Accuracy
    - type: weighted_f1
      value: 0.9518911690649716
      name: Evaluation Weighted F1
    - type: micro_f1
      value: 0.9519237980949524
      name: Evaluation Micro F1
    - type: macro_f1
      value: 0.9518042570370386
      name: Evaluation Macro F1
    - type: weighted_recall
      value: 0.9519237980949524
      name: Evaluation Weighted Recall
    - type: micro_recall
      value: 0.9519237980949524
      name: Evaluation Micro Recall
    - type: macro_recall
      value: 0.9518171728908463
      name: Evaluation Macro Recall
    - type: weighted_precision
      value: 0.9519094862975979
      name: Evaluation Weighted Precision
    - type: micro_precision
      value: 0.9519237980949524
      name: Evaluation Micro Precision
    - type: macro_precision
      value: 0.9518423447239385
      name: Evaluation Macro Precision
    - type: runtime
      value: 514.7031
      name: Evaluation Runtime (seconds)
    - type: samples_per_second
      value: 77.713
      name: Evaluation Samples per Second
    - type: steps_per_second
      value: 1.214
      name: Evaluation Steps per Second
---

# layoutlmv3-base-finetuned-rvlcdip

This model is a fine-tuned version of [microsoft/layoutlmv3-base](https://huggingface.co/microsoft/layoutlmv3-base) on the [RVL-CDIP dataset](https://adamharley.com/rvl-cdip/), processed with Amazon OCR. The following metrics were computed on the evaluation set after the final optimization step:

* Evaluation Loss: 0.1856316477060318
* Evaluation Accuracy: 0.9519237980949524
* Evaluation Weighted F1: 0.9518911690649716
* Evaluation Micro F1: 0.9519237980949524
* Evaluation Macro F1: 0.9518042570370386
* Evaluation Weighted Recall: 0.9519237980949524
* Evaluation Micro Recall: 0.9519237980949524
* Evaluation Macro Recall: 0.9518171728908463
* Evaluation Weighted Precision: 0.9519094862975979
* Evaluation Micro Precision: 0.9519237980949524
* Evaluation Macro Precision: 0.9518423447239385
* Evaluation Runtime (seconds): 514.7031
* Evaluation Samples per Second: 77.713
* Evaluation Steps per Second: 1.214
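
## How to use

A minimal inference sketch. The Hub id below is an assumption inferred from the model name and wandb account above; replace it with the actual repo path. Note that the processor's built-in OCR (`apply_ocr=True`) runs Tesseract, while this checkpoint was trained on Amazon OCR output, so passing your own words and boxes may match the training distribution better.

```python
# Sketch: classify a single page image with the fine-tuned checkpoint.
from PIL import Image
import torch
from transformers import AutoProcessor, AutoModelForSequenceClassification

# Hypothetical Hub id, inferred from the model name; adjust as needed.
HUB_ID = "gordon-lim/layoutlmv3-base-finetuned-rvlcdip"

# The 16 RVL-CDIP classes in the dataset's canonical label order, for reference.
RVL_CDIP_LABELS = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]

def classify_page(image_path: str) -> str:
    """Return the predicted RVL-CDIP class name for one page image."""
    # apply_ocr=True extracts words and boxes with Tesseract; training used
    # Amazon OCR, so supplying your own words/boxes may give better results.
    processor = AutoProcessor.from_pretrained(
        "microsoft/layoutlmv3-base", apply_ocr=True
    )
    model = AutoModelForSequenceClassification.from_pretrained(HUB_ID)
    image = Image.open(image_path).convert("RGB")
    encoding = processor(image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits
    # The checkpoint's own label mapping is authoritative.
    return model.config.id2label[logits.argmax(-1).item()]
```

For example, `classify_page("page.png")` returns one of the 16 class names above.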

## Training logs

See wandb report: https://api.wandb.ai/links/gordon-lim/lokqu7ok

### Training arguments

The following arguments were provided to the `Trainer`:

- Output Directory: ./results
- Maximum Steps: 20000
- Per Device Train Batch Size: 32 (paper uses 64; limited to 32 by CUDA memory, but training on 2 GPUs gives an effective batch size of 64)
- Per Device Evaluation Batch Size: 32 (due to CUDA memory constraints)
- Warmup Steps: 0 (not specified in the paper for RVL-CDIP; a warmup ratio is used for DocVQA, so the default is assumed here)
- Weight Decay: 0 (not specified in the paper for RVL-CDIP; 0.05 is used for PubLayNet, so the default is assumed here)
- Evaluation Strategy: steps
- Evaluation Steps: 1000
- Evaluate on Start: True
- Save Strategy: steps
- Save Steps: 1000
- Save Total Limit: 5
- Learning Rate: 2e-5
- Load Best Model at End: True
- Metric for Best Model: accuracy
- Greater is Better: True
- Report to: wandb (log to Weights & Biases)
- Logging Steps: 1000
- Logging First Step: True
- Learning Rate Scheduler Type: cosine (not mentioned in the paper, but the PubLayNet GitHub example uses 'cosine')
- FP16: True (due to CUDA memory constraints)
- Dataloader Number of Workers: 4 (number of subprocesses used for data loading)
- DDP Find Unused Parameters: True
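
The list above corresponds roughly to the following `transformers.TrainingArguments` setup. This is a sketch: keyword names follow Transformers 4.42 (the version listed below), and values are copied from this card.

```python
# Sketch of the Trainer configuration described above; values are from this
# card, keyword names follow transformers.TrainingArguments (v4.42).
training_kwargs = dict(
    output_dir="./results",
    max_steps=20000,
    per_device_train_batch_size=32,  # paper uses 64; 32 x 2 GPUs = 64 effective
    per_device_eval_batch_size=32,
    warmup_steps=0,
    weight_decay=0.0,
    eval_strategy="steps",
    eval_steps=1000,
    eval_on_start=True,
    save_strategy="steps",
    save_steps=1000,
    save_total_limit=5,
    learning_rate=2e-5,
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
    greater_is_better=True,
    report_to="wandb",
    logging_steps=1000,
    logging_first_step=True,
    lr_scheduler_type="cosine",
    fp16=True,
    dataloader_num_workers=4,
    ddp_find_unused_parameters=True,
)
# args = TrainingArguments(**training_kwargs)
# trainer = Trainer(model=model, args=args, ...)
```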

### Framework versions

- Transformers 4.42.3
- PyTorch 2.2.0+cu121
- Datasets 2.14.0
- Tokenizers 0.19.1