Zhen Yang

andyyang

AI & ML interests

None yet

Organizations

Tencent

authored a paper 8 months ago

TransMamba: Flexibly Switching between Transformer and Mamba

Paper • 2503.24067 • Published Mar 31 • 21
authored 4 papers 11 months ago

Lossless KV Cache Compression to 2%

Paper • 2410.15252 • Published Oct 20, 2024 • 1

HMoE: Heterogeneous Mixture of Experts for Language Modeling

Paper • 2408.10681 • Published Aug 20, 2024 • 10

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published Nov 4, 2024 • 25

Scaling Laws for Floating Point Quantization Training

Paper • 2501.02423 • Published Jan 5 • 26