---
language:
- en
license: mit
tags:
- summarization
- t5-large-summarization
- pipeline:summarization
thumbnail: https://huggingface.co/front/thumbnails/facebook.png
model-index:
- name: sysresearch101/t5-large-finetuned-xsum-cnn
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: xsum
      type: xsum
      config: default
      split: test
    metrics:
    - type: rouge
      value: 36.7656
      name: ROUGE-1
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2QzMDg4NTM0ZTc5MjAzNTY4MmY1YTRiMWI3M2I2NDdjMTM4ZGNhYzZhOWQzMWI0MjJlYmU3MTg0ZjVjMTEyZSIsInZlcnNpb24iOjF9.AuKHql0LQs0zDQNn7zvySnX50GAC8jEWyYz-LtBgWj0dcad86J8yfHbIDswmgx2ur0S3yttw72qNExag_Fw7Dw
    - type: rouge
      value: 14.6898
      name: ROUGE-2
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTE3ZTExY2M3MTIwMWY0ODRkZDI1YjU2ZjRkOGJjOGQyYjcxMTMxOWExN2Q0OGNkZmNiYzYzYzVhODY4YzEwOSIsInZlcnNpb24iOjF9.F1Q17sa8IAsW8ouQ2VDLq_VvHDxjuMjVU3rMfvkbmKxAjTDKVTiaG6Eg9uSKIYzgJoDSsxhsZcjH-J0gGQv3Dg
    - type: rouge
      value: 30.0646
      name: ROUGE-L
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzI1NjE0NmI5Nzc3ODFiNDI5YzVhNjUzNzU1NzA0ZDMwMjFjZDE1YzUxNjZmZTAwZTM0MmVmN2ZkYWUwMjBiZSIsInZlcnNpb24iOjF9.xehN8zOV6050WvoLZIJ-l2zB93jWY_ugcydDDqV06XwdKwZ7l0TI8BoLDOO7Mw7dRmHOWLNruDJZnOnW3_3pCQ
    - type: rouge
      value: 30.0563
      name: ROUGE-LSUM
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmU0OTVhYTY0ZDJmOTU3OWE5MzgxYzdhNmQ3MjM3YzM2MGIzOGViY2ZkMTI1ZWI4NDMwOTlkODBjOGE4NTE4ZCIsInZlcnNpb24iOjF9.FtNN06HKSgEB1tiWpToEVnNfzhQs9ZR59386YynOY6T6oKWxbIiRyItzYXobNw96lg5c2sE4vdJSfdtbBpkyDA
    - type: loss
      value: 1.6373405456542969
      name: loss
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTVjYzI0MmMyY2IzYTE0NDUxY2FiMDM4Mjk2NTI1NTk0NjFiYTY2OWMxODRjNWJhYjU4ZWU5OTk4Y2E5N2RkOSIsInZlcnNpb24iOjF9.Cz5AQ-B8IAXmf1Xc_7UJ0pI9XKYHxDEwmoP3ZFsS2Wmbk1pUB8o_Y8AErBR8-Q60qR_ndw8eSwrI0EnPohYHCw
    - type: gen_len
      value: 18.6054
      name: gen_len
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWRlMjM5MzAyMjEzYzdkODFmNDk4NDg5NWM4NWIxMTU4YWMxNzZjMGFjOWJiMDdkMjQyMTY0ZGFmYzA2OTA0YiIsInZlcnNpb24iOjF9.IFiGJEsyD7Uhj8bo9SsAgibk9qCXZH6IWaLKULLxBz5N8WXF2vc2Mfg5OThEzdrydPhJInRgp0jd8m-kF5nNCA
datasets:
- abisee/cnn_dailymail
- EdinburghNLP/xsum
base_model:
- google-t5/t5-large
---

# T5-Large Fine-tuned on the Combined XSum + CNN/DailyMail Datasets

**Task:** Abstractive Summarization (English)
**Base Model:** google-t5/t5-large
**License:** MIT

## Overview

This model is a T5-Large checkpoint fine-tuned jointly on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) and [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) datasets. It produces concise, abstractive summaries and has been used as a baseline in summarization research (see "Papers Using This Model" below).

## Performance

Evaluated on the XSum test set:

| Metric | Score |
|--------|-------|
| ROUGE-1 | 36.77 |
| ROUGE-2 | 14.69 |
| ROUGE-L | 30.06 |
| ROUGE-LSum | 30.06 |
| Loss | 1.64 |
| Avg. summary length | 18.6 tokens |
## Usage

### Quick Start

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum-cnn")

article = "Your article text here..."
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```

### Advanced Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("sysresearch101/t5-large-finetuned-xsum-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("sysresearch101/t5-large-finetuned-xsum-cnn")

article = "Your article text here..."

# T5 expects the task prefix "summarize: " before the input text.
inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)

outputs = model.generate(
    **inputs,
    max_length=80,
    min_length=20,
    num_beams=4,
    no_repeat_ngram_size=2,   # block repeated bigrams
    length_penalty=1.0,
    repetition_penalty=2.5,   # discourage repeated tokens
    use_cache=True,
    early_stopping=True,
    do_sample=True,           # with num_beams > 1, this uses beam-search
    temperature=0.8,          # multinomial sampling; set do_sample=False
    top_k=50,                 # for deterministic beam search
    top_p=0.95,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```

## Training Data

- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum): BBC articles paired with single-sentence summaries
- [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail): news articles paired with multi-sentence summaries

## Intended Use

- **Primary:** abstractive summarization of English news text
- **Secondary:** educational demonstrations, reproducible baselines, research benchmarking, and academic studies on summarization

## Limitations

- Optimized for English news text; performance may degrade on other domains
- Tends to produce very concise summaries (18-20 tokens on average)
- No built-in fact-checking or content filtering

## Citation

```bibtex
@misc{stept2023_t5_large_xsum_cnn_summarization,
  author    = {Shlomo Stept (sysresearch101)},
  title     = {T5-Large Fine-tuned on XSum + CNN/DailyMail for Abstractive Summarization},
  year      = {2023},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum-cnn}
}
```

## Papers Using This Model

* [Zhu et al. (2023). *Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization.* ACL 2023 (Long).](https://aclanthology.org/2023.acl-long.377.pdf)
* European Food Safety Authority (2023). *Implementing AI Vertical Use Cases – Scenario 1.* EFSA Journal, Special Publication EN-8223. https://doi.org/10.2903/sp.efsa.2023.EN-8223
* *(Forthcoming)* Budget-Constrained Learning to Defer for Autoregressive Generation (under review, ICLR 2025)

## Contact

Created by [Shlomo Stept](https://shlomostept.com) ([ORCID: 0009-0009-3185-589X](https://orcid.org/0009-0009-3185-589X)), DARMIS AI

- Website: [shlomostept.com](https://shlomostept.com)
- LinkedIn: [linkedin.com/in/shlomo-stept](https://linkedin.com/in/shlomo-stept)