---
language:
- en
license: mit
tags:
- summarization
- t5-large-summarization
- pipeline:summarization
thumbnail: https://huggingface.co/front/thumbnails/facebook.png
model-index:
- name: sysresearch101/t5-large-finetuned-xsum-cnn
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: xsum
      type: xsum
      config: default
      split: test
    metrics:
    - type: rouge
      value: 36.7656
      name: ROUGE-1
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2QzMDg4NTM0ZTc5MjAzNTY4MmY1YTRiMWI3M2I2NDdjMTM4ZGNhYzZhOWQzMWI0MjJlYmU3MTg0ZjVjMTEyZSIsInZlcnNpb24iOjF9.AuKHql0LQs0zDQNn7zvySnX50GAC8jEWyYz-LtBgWj0dcad86J8yfHbIDswmgx2ur0S3yttw72qNExag_Fw7Dw
    - type: rouge
      value: 14.6898
      name: ROUGE-2
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTE3ZTExY2M3MTIwMWY0ODRkZDI1YjU2ZjRkOGJjOGQyYjcxMTMxOWExN2Q0OGNkZmNiYzYzYzVhODY4YzEwOSIsInZlcnNpb24iOjF9.F1Q17sa8IAsW8ouQ2VDLq_VvHDxjuMjVU3rMfvkbmKxAjTDKVTiaG6Eg9uSKIYzgJoDSsxhsZcjH-J0gGQv3Dg
    - type: rouge
      value: 30.0646
      name: ROUGE-L
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzI1NjE0NmI5Nzc3ODFiNDI5YzVhNjUzNzU1NzA0ZDMwMjFjZDE1YzUxNjZmZTAwZTM0MmVmN2ZkYWUwMjBiZSIsInZlcnNpb24iOjF9.xehN8zOV6050WvoLZIJ-l2zB93jWY_ugcydDDqV06XwdKwZ7l0TI8BoLDOO7Mw7dRmHOWLNruDJZnOnW3_3pCQ
    - type: rouge
      value: 30.0563
      name: ROUGE-LSUM
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmU0OTVhYTY0ZDJmOTU3OWE5MzgxYzdhNmQ3MjM3YzM2MGIzOGViY2ZkMTI1ZWI4NDMwOTlkODBjOGE4NTE4ZCIsInZlcnNpb24iOjF9.FtNN06HKSgEB1tiWpToEVnNfzhQs9ZR59386YynOY6T6oKWxbIiRyItzYXobNw96lg5c2sE4vdJSfdtbBpkyDA
    - type: loss
      value: 1.6373405456542969
      name: loss
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTVjYzI0MmMyY2IzYTE0NDUxY2FiMDM4Mjk2NTI1NTk0NjFiYTY2OWMxODRjNWJhYjU4ZWU5OTk4Y2E5N2RkOSIsInZlcnNpb24iOjF9.Cz5AQ-B8IAXmf1Xc_7UJ0pI9XKYHxDEwmoP3ZFsS2Wmbk1pUB8o_Y8AErBR8-Q60qR_ndw8eSwrI0EnPohYHCw
    - type: gen_len
      value: 18.6054
      name: gen_len
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWRlMjM5MzAyMjEzYzdkODFmNDk4NDg5NWM4NWIxMTU4YWMxNzZjMGFjOWJiMDdkMjQyMTY0ZGFmYzA2OTA0YiIsInZlcnNpb24iOjF9.IFiGJEsyD7Uhj8bo9SsAgibk9qCXZH6IWaLKULLxBz5N8WXF2vc2Mfg5OThEzdrydPhJInRgp0jd8m-kF5nNCA
datasets:
- abisee/cnn_dailymail
- EdinburghNLP/xsum
base_model:
- google-t5/t5-large
---

# T5-Large Fine-tuned on the Combined XSum + CNN/DailyMail Datasets

**Task:** Abstractive Summarization (English)
**Base Model:** google-t5/t5-large
**License:** MIT

## Overview

This model is a T5-Large checkpoint fine-tuned jointly on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) and [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) datasets. It produces concise, abstractive summaries and has been used as a baseline in summarization research (see "Papers Using This Model" below).

## Performance

Evaluated on the XSum test set:

| Metric | Score |
|--------|-------|
| ROUGE-1 | 36.77 |
| ROUGE-2 | 14.69 |
| ROUGE-L | 30.06 |
| ROUGE-LSum | 30.06 |
| Loss | 1.64 |
| Avg. summary length | 18.6 tokens |
## Usage

### Quick Start

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum-cnn")

article = "Your article text here..."
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```

### Advanced Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sysresearch101/t5-large-finetuned-xsum-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("sysresearch101/t5-large-finetuned-xsum-cnn")

article = "Your article text here..."

# T5 expects the task prefix "summarize: " before the input text.
inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)

outputs = model.generate(
    **inputs,
    max_length=80,
    min_length=20,
    num_beams=4,
    no_repeat_ngram_size=2,   # block repeated bigrams
    length_penalty=1.0,
    repetition_penalty=2.5,   # discourage repeated tokens
    use_cache=True,
    early_stopping=True,
    do_sample=True,           # with num_beams > 1, this uses beam-search
    temperature=0.8,          # multinomial sampling; set do_sample=False
    top_k=50,                 # for deterministic beam search
    top_p=0.95,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```

## Training Data

- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum): BBC articles paired with single-sentence summaries
- [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail): news articles paired with multi-sentence summaries

## Intended Use

- **Primary:** abstractive summarization of English news text
- **Secondary:** educational demonstrations, reproducible baselines, research benchmarking, and academic studies on summarization

## Limitations

- Optimized for English news text; performance may degrade on other domains
- Tends to produce very concise summaries (18-20 tokens on average)
- No built-in fact-checking or content filtering

## Citation

```bibtex
@misc{stept2023_t5_large_xsum_cnn_summarization,
  author    = {Shlomo Stept (sysresearch101)},
  title     = {T5-Large Fine-tuned on XSum + CNN/DailyMail for Abstractive Summarization},
  year      = {2023},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum-cnn}
}
```

## Papers Using This Model

* [Zhu et al. (2023). *Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization.* ACL 2023 (Long).](https://aclanthology.org/2023.acl-long.377.pdf)
* European Food Safety Authority (2023). *Implementing AI Vertical Use Cases – Scenario 1.* EFSA Journal, Special Publication EN-8223. https://doi.org/10.2903/sp.efsa.2023.EN-8223
* *(Forthcoming)* Budget-Constrained Learning to Defer for Autoregressive Generation (under review, ICLR 2025)

## Contact

Created by [Shlomo Stept](https://shlomostept.com) ([ORCID: 0009-0009-3185-589X](https://orcid.org/0009-0009-3185-589X)), DARMIS AI

- Website: [shlomostept.com](https://shlomostept.com)
- LinkedIn: [linkedin.com/in/shlomo-stept](https://linkedin.com/in/shlomo-stept)