calculator_model_test

This model is a fine-tuned model; the base model and training dataset are not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.1029

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
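The hyperparameters above can be collected into a plain configuration dict, and the linear schedule made concrete. This is a sketch only: the key names mirror common Transformers `TrainingArguments` fields but are assumptions, and the no-warmup decay uses the 240 total steps shown in the results table below.

```python
# Hyperparameters as listed in the card (key names are assumptions,
# loosely following Hugging Face TrainingArguments conventions).
hyperparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 512,
    "eval_batch_size": 512,
    "seed": 42,
    "optimizer": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 40,
}

# With a linear scheduler and no warmup (warmup is an assumption; the card
# does not mention it), the learning rate decays from its initial value to 0
# over the total number of training steps (240, per the results table).
TOTAL_STEPS = 240

def lr_at_step(step, base_lr=hyperparams["learning_rate"], total=TOTAL_STEPS):
    """Linearly decayed learning rate at a given optimizer step."""
    return base_lr * max(0.0, 1.0 - step / total)

print(lr_at_step(0))    # → 0.001 (initial learning rate)
print(lr_at_step(120))  # → 0.0005 (halfway through training)
```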

Training results

Training Loss   Epoch   Step   Validation Loss
3.0054          1.0     6      2.2737
2.0384          2.0     12     1.7426
1.5809          3.0     18     1.3404
1.2444          4.0     24     1.0828
1.0174          5.0     30     0.9111
0.8686          6.0     36     0.7906
0.7634          7.0     42     0.7097
0.7201          8.0     48     0.6570
0.6563          9.0     54     0.6026
0.6174          10.0    60     0.5597
0.5637          11.0    66     0.5282
0.5339          12.0    72     0.5110
0.5056          13.0    78     0.4819
0.4833          14.0    84     0.4364
0.4595          15.0    90     0.4270
0.4414          16.0    96     0.4014
0.4358          17.0    102    0.3982
0.3998          18.0    108    0.3680
0.3790          19.0    114    0.3431
0.3564          20.0    120    0.3411
0.3567          21.0    126    0.3193
0.3297          22.0    132    0.2895
0.3156          23.0    138    0.2718
0.2846          24.0    144    0.2467
0.2590          25.0    150    0.2285
0.2476          26.0    156    0.2163
0.2361          27.0    162    0.2084
0.2255          28.0    168    0.1900
0.2156          29.0    174    0.1825
0.2100          30.0    180    0.1861
0.2060          31.0    186    0.1727
0.1867          32.0    192    0.1626
0.1811          33.0    198    0.1422
0.1652          34.0    204    0.1297
0.1540          35.0    210    0.1264
0.1483          36.0    216    0.1190
0.1493          37.0    222    0.1116
0.1373          38.0    228    0.1065
0.1366          39.0    234    0.1040
0.1319          40.0    240    0.1029
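The results table implies a few things about the training run that are worth making explicit. With 240 total steps over 40 epochs, there are 6 optimizer steps per epoch, which together with the batch size of 512 bounds the training-set size. The arithmetic below is a sketch derived only from values in the table; the exact dataset size is not stated in the card.

```python
# 240 total steps over 40 epochs -> 6 optimizer steps per epoch.
steps_per_epoch = 240 // 40
batch_size = 512

# 6 full-or-partial batches per epoch bounds the number of training examples:
max_train_examples = steps_per_epoch * batch_size        # at most 3072
min_train_examples = (steps_per_epoch - 1) * batch_size + 1  # at least 2561
print(steps_per_epoch, min_train_examples, max_train_examples)

# Sanity check on the loss trajectory, using the first and last table rows:
val_loss_epoch_1, val_loss_epoch_40 = 2.2737, 0.1029
assert val_loss_epoch_40 < val_loss_epoch_1  # validation loss improved overall
```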

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M parameters (Safetensors, F32 tensors)