[2025-11-06 08:51:25,486] [WARNING] [py.warnings._showwarnmsg:112] [PID:3076758] /root/miniforge3/envs/axolotl/lib/python3.12/site-packages/deepspeed/runtime/zero/partition_parameters.py:240: UserWarning: expandable_segments not supported on this platform (Triggered internally at /pytorch/c10/hip/HIPAllocatorConfig.h:36.)
tensor: Tensor = fn(*args, **kwargs)
[2025-11-06 08:51:32,980] [WARNING] [py.warnings._showwarnmsg:112] [PID:3076758] /root/miniforge3/envs/axolotl/lib/python3.12/site-packages/torch/distributed/distributed_c10d.py:4807: UserWarning: No device id is provided via `init_process_group` or `barrier `. Using the current device set by the user.
warnings.warn( # warn only once
Extracting prompt in train dataset (num_proc=160): 100%|██████████| 105170/105170 [00:06<00:00, 15164.58 examples/s]
Applying chat template to train dataset (num_proc=160): 100%|██████████| 105170/105170 [00:12<00:00, 8452.21 examples/s]
Tokenizing train dataset (num_proc=160): 100%|██████████| 105170/105170 [00:29<00:00, 3577.76 examples/s]