calculator_model_test

This model is a fine-tuned model; the base model and training dataset are not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.1029

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
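The hyperparameters above can be collected into a plain configuration dict, and the linear schedule made concrete. This is a sketch only: the key names mirror common Transformers `TrainingArguments` fields but are assumptions, and the no-warmup decay uses the 240 total steps shown in the results table below.

```python
# Hyperparameters as listed in the card (key names are assumptions,
# loosely following Hugging Face TrainingArguments conventions).
hyperparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 512,
    "eval_batch_size": 512,
    "seed": 42,
    "optimizer": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 40,
}

# With a linear scheduler and no warmup (warmup is an assumption; the card
# does not mention it), the learning rate decays from its initial value to 0
# over the total number of training steps (240, per the results table).
TOTAL_STEPS = 240

def lr_at_step(step, base_lr=hyperparams["learning_rate"], total=TOTAL_STEPS):
    """Linearly decayed learning rate at a given optimizer step."""
    return base_lr * max(0.0, 1.0 - step / total)

print(lr_at_step(0))    # → 0.001 (initial learning rate)
print(lr_at_step(120))  # → 0.0005 (halfway through training)
```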

Training results

Training Loss   Epoch   Step   Validation Loss
3.0054          1.0     6      2.2737
2.0384          2.0     12     1.7426
1.5809          3.0     18     1.3404
1.2444          4.0     24     1.0828
1.0174          5.0     30     0.9111
0.8686          6.0     36     0.7906
0.7634          7.0     42     0.7097
0.7201          8.0     48     0.6570
0.6563          9.0     54     0.6026
0.6174          10.0    60     0.5597
0.5637          11.0    66     0.5282
0.5339          12.0    72     0.5110
0.5056          13.0    78     0.4819
0.4833          14.0    84     0.4364
0.4595          15.0    90     0.4270
0.4414          16.0    96     0.4014
0.4358          17.0    102    0.3982
0.3998          18.0    108    0.3680
0.3790          19.0    114    0.3431
0.3564          20.0    120    0.3411
0.3567          21.0    126    0.3193
0.3297          22.0    132    0.2895
0.3156          23.0    138    0.2718
0.2846          24.0    144    0.2467
0.2590          25.0    150    0.2285
0.2476          26.0    156    0.2163
0.2361          27.0    162    0.2084
0.2255          28.0    168    0.1900
0.2156          29.0    174    0.1825
0.2100          30.0    180    0.1861
0.2060          31.0    186    0.1727
0.1867          32.0    192    0.1626
0.1811          33.0    198    0.1422
0.1652          34.0    204    0.1297
0.1540          35.0    210    0.1264
0.1483          36.0    216    0.1190
0.1493          37.0    222    0.1116
0.1373          38.0    228    0.1065
0.1366          39.0    234    0.1040
0.1319          40.0    240    0.1029
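The results table implies a few things about the training run that are worth making explicit. With 240 total steps over 40 epochs, there are 6 optimizer steps per epoch, which together with the batch size of 512 bounds the training-set size. The arithmetic below is a sketch derived only from values in the table; the exact dataset size is not stated in the card.

```python
# 240 total steps over 40 epochs -> 6 optimizer steps per epoch.
steps_per_epoch = 240 // 40
batch_size = 512

# 6 full-or-partial batches per epoch bounds the number of training examples:
max_train_examples = steps_per_epoch * batch_size        # at most 3072
min_train_examples = (steps_per_epoch - 1) * batch_size + 1  # at least 2561
print(steps_per_epoch, min_train_examples, max_train_examples)

# Sanity check on the loss trajectory, using the first and last table rows:
val_loss_epoch_1, val_loss_epoch_40 = 2.2737, 0.1029
assert val_loss_epoch_40 < val_loss_epoch_1  # validation loss improved overall
```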

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M parameters (Safetensors, F32 tensors)