- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 10
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 172
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 20
- Attention Is All You Need
  Paper • 1706.03762 • Published • 104
Collections including paper arxiv:1706.03762
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 18
- Large Language Models Are Human-Level Prompt Engineers
  Paper • 2211.01910 • Published • 1
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 43
- Large Language Models are Zero-Shot Reasoners
  Paper • 2205.11916 • Published • 3
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
  Paper • 2003.08934 • Published • 2
- Learning Transferable Visual Models From Natural Language Supervision
  Paper • 2103.00020 • Published • 19
- Emerging Properties in Self-Supervised Vision Transformers
  Paper • 2104.14294 • Published • 4
- Segment Anything
  Paper • 2304.02643 • Published • 5
-
The Impact of Depth and Width on Transformer Language Model Generalization
Paper • 2310.19956 • Published • 10 -
Retentive Network: A Successor to Transformer for Large Language Models
Paper • 2307.08621 • Published • 172 -
RWKV: Reinventing RNNs for the Transformer Era
Paper • 2305.13048 • Published • 20 -
Attention Is All You Need
Paper • 1706.03762 • Published • 104
-
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Paper • 2003.08934 • Published • 2 -
Learning Transferable Visual Models From Natural Language Supervision
Paper • 2103.00020 • Published • 19 -
Emerging Properties in Self-Supervised Vision Transformers
Paper • 2104.14294 • Published • 4 -
Segment Anything
Paper • 2304.02643 • Published • 5
-
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 18 -
Large Language Models Are Human-Level Prompt Engineers
Paper • 2211.01910 • Published • 1 -
Lost in the Middle: How Language Models Use Long Contexts
Paper • 2307.03172 • Published • 43 -
Large Language Models are Zero-Shot Reasoners
Paper • 2205.11916 • Published • 3