- Attention Is All You Need
  Paper • 1706.03762 • Published • 115
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 19
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 20
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 250
Collections including paper arxiv:2306.08543
- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 11
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 19
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 173
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
  Paper • 2303.12712 • Published • 5
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 7
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 15
- Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training
  Paper • 2502.03460 • Published
- Pruning as a Domain-specific LLM Extractor
  Paper • 2405.06275 • Published • 1
- LLM-Pruner: On the Structural Pruning of Large Language Models
  Paper • 2305.11627 • Published • 3
- KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
  Paper • 2402.11176 • Published • 2
- Instruction Tuning for Large Language Models: A Survey
  Paper • 2308.10792 • Published • 1
- Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
  Paper • 2403.14608 • Published
- Efficient Large Language Models: A Survey
  Paper • 2312.03863 • Published • 4
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 31