Hugging Face
Deqing Fu
PRO
deqing
13 followers · 18 following
https://deqingfu.github.io
DeqingFu
Recent Activity
updated a model about 20 hours ago: deqing/fone-llama-3.2-1B-dclm-100BT-fone3d-hybrid-tile-v1
published a model 8 days ago: deqing/fone-llama-3.2-1B-dclm-100BT-fone3d-hybrid-tile-v1
updated a model 8 days ago: deqing/vanilla-llama-3.2-1B-dclm-100BT-v1
deqing's models (131)
Sort: Recently updated
deqing/convergent-llama-300M-adamw-window_8 • Text Generation • 0.3B • Updated Mar 29 • 38
deqing/convergent-llama-300M-adamw-window_4 • Text Generation • 0.3B • Updated Mar 29 • 39
deqing/convergent-llama-300M-adamw-window_2 • Text Generation • 0.3B • Updated Mar 29 • 37
deqing/convergent-llama-300M-adamw-swap_numbers • Text Generation • 0.3B • Updated Mar 29 • 37
deqing/convergent-llama-300M-adamw-isolate • Text Generation • 0.3B • Updated Mar 29 • 40
deqing/convergent-llama-300M-adamw-unigram • Text Generation • 0.3B • Updated Mar 29 • 42
deqing/convergent-mamba2-300M-muon-original • Text Generation • 0.3B • Updated Mar 29 • 32
deqing/llama-window-4-old • Text Generation • 0.3B • Updated Mar 29 • 41
deqing/llama-window-2-old • Text Generation • 0.3B • Updated Mar 29 • 40
deqing/convergent-llama-300M-muon-unk_number • Text Generation • 0.3B • Updated Mar 29 • 39
deqing/convergent-llama-300M-muon-swap_numbers • Text Generation • 0.3B • Updated Mar 29 • 36
deqing/llama-isolate-old • Text Generation • 0.3B • Updated Mar 29 • 39
deqing/convergent-llama-300M-muon-fivegram • Text Generation • 0.3B • Updated Mar 29 • 37
deqing/convergent-llama-300M-muon-permute • Text Generation • 0.3B • Updated Mar 29 • 38
deqing/convergent-llama-300M-muon-bigram • Text Generation • 0.3B • Updated Mar 29 • 39
deqing/convergent-llama-300M-muon-unigram • Text Generation • 0.3B • Updated Mar 29 • 41
deqing/mamba2-300M-v5-mamba2 • Text Generation • 0.3B • Updated Mar 29 • 120
deqing/lstm-12layer-v5 • 0.2B • Updated Mar 29 • 6
deqing/llama-300M-v5-original • Text Generation • 0.3B • Updated Mar 27 • 78
deqing/llama-300M-v5-unk_number • Text Generation • 0.3B • Updated Mar 26 • 81
deqing/llama-300M-v5-addition_3digit_adamw • 0.3B • Updated Mar 25 • 41
deqing/llama-300M-v5-addition_3digit • 0.3B • Updated Mar 25 • 56
deqing/llama-300M-v5-addition • Text Generation • 0.3B • Updated Mar 25 • 88
deqing/llama-300M-v5-addition_adamw • Text Generation • 0.3B • Updated Mar 24 • 84
deqing/llama-300M-v5-addition_adamw-old • 0.3B • Updated Mar 22 • 5
deqing/llama-300M-v5-addition_3digit-old • 0.3B • Updated Mar 22 • 2
deqing/llama-300M-v5-adamw-addition_3digit_adamw-old • 0.3B • Updated Mar 22 • 6
deqing/llama-300M-v5-original-random_init_sft • Updated Mar 21
deqing/llama-300M-v5-isolate_sft • Updated Mar 21
deqing/llama-300M-v5-swap_numbers_sft • Updated Mar 21