Applying chat template to train dataset (num_proc=160): 100%|██████████| 105170/105170 [00:12<00:00, 8452.21 examples/s]
Tokenizing train dataset (num_proc=160): 100%|██████████| 105170/105170 [00:29<00:00, 3577.76 examples/s]
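For reference, a minimal sketch (not the original training script) of the kind of preprocessing that emits the two progress bars above: a chat template is rendered over every example and the result is tokenized, both via datasets' multiprocess map with num_proc=160 as in the log. The model and dataset identifiers, the "messages" column name, and the helper function names below are placeholders, not taken from this run.

from datasets import load_dataset
from transformers import AutoTokenizer

MODEL_ID = "your-org/your-model"      # placeholder, not from the log
DATASET_ID = "your-org/your-dataset"  # placeholder, not from the log

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
train_dataset = load_dataset(DATASET_ID, split="train")  # ~105,170 examples in this run

def apply_chat_template(example):
    # Render the list of chat messages into a single training string.
    example["text"] = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return example

def tokenize(example):
    # Convert the rendered text into input_ids / attention_mask columns.
    return tokenizer(example["text"])

# num_proc=160 spawns 160 worker processes, matching the log above.
train_dataset = train_dataset.map(apply_chat_template, num_proc=160,
                                  desc="Applying chat template to train dataset")
train_dataset = train_dataset.map(tokenize, num_proc=160,
                                  desc="Tokenizing train dataset")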