FLAN-T5
Overview
FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models - it is an enhanced version of T5 that has been fine-tuned on a mixture of tasks.
One can use the FLAN-T5 weights directly without fine-tuning the model:
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
>>> inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Pour a cup of bolognese into a large bowl and add the pasta']

FLAN-T5 includes the same improvements as T5 version 1.1 (see here for the full details of the model’s improvements).
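Because the model is instruction-tuned on a mixture of tasks, the same checkpoint can be prompted for a different task simply by changing the instruction. A minimal sketch reusing the model and tokenizer loaded above (the prompt and the max_new_tokens value are illustrative, not part of the original example):

>>> # same checkpoint, different instruction: a translation prompt
>>> inputs = tokenizer("translate English to German: How old are you?", return_tensors="pt")
>>> outputs = model.generate(**inputs, max_new_tokens=20)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))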
Google has released the following variants:
google/flan-t5-small
google/flan-t5-base
google/flan-t5-large
google/flan-t5-xl
google/flan-t5-xxl
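The larger checkpoints may not fit on a single GPU in full precision. A minimal sketch of loading a bigger variant in half precision with automatic device placement (assumptions for illustration: a CUDA device is available, the accelerate package is installed, and the xl checkpoint is only an example):

>>> import torch
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> # load the weights in float16 and let accelerate place them on the available device(s)
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "google/flan-t5-xl", torch_dtype=torch.float16, device_map="auto"
... )
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")

>>> # move the inputs to the same device as the model's first parameters
>>> inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt").to(model.device)
>>> outputs = model.generate(**inputs, max_new_tokens=50)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))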
One can refer to T5’s documentation page for all tips, code examples and notebooks, as well as to the FLAN-T5 model card for more details on the training and evaluation of the model.
The original checkpoints can be found here.