EchoCheck: Political Stance Classification
A RoBERTa-based model fine-tuned to classify the political stance of text into three categories: left, center, and right.
Model Details
Model Description
EchoCheck is a fine-tuned RoBERTa-base model designed to classify the political leaning of news articles and other political text. It was fine-tuned on the BIGNEWSBLN dataset of roughly 2.33 million news articles (about 1.87 million of which form the training split) and achieves 95.50% accuracy on the held-out test set.
- Developed by: Alexandru-Gabriel Morariu
- Model type: RoBERTa-base with a sequence classification head
- Language(s): English
- License: MIT
- Fine-tuned from: roberta-base
Model Sources
- Repository: https://github.com/Alex-GHP/echocheck
Uses
Direct Use
This model can be used directly for:
- Classifying the political stance of news articles
- Analyzing political bias in text content
- Research on media bias and political polarization
- Building applications that need to understand the political leaning of a text
Downstream Use
The model can be integrated into:
- News aggregation platforms for bias labeling
- Browser extensions for political bias detection
- Research tools for political science studies
- Content moderation systems
- Educational tools about media literacy
Out-of-Scope Use
This model should NOT be used for:
- Making decisions about individuals based on their political views
- Censorship or suppression of political speech
- Automated content removal without human review
- Non-English text (model is English-only)
- Classification of non-political content
- As the sole basis for important decisions
Bias, Risks, and Limitations
Known Limitations
- Language: English only - not suitable for other languages
- Domain: Trained on news articles - may perform differently on social media, academic papers, or casual conversation
- Time Period: Training data reflects political discourse up to the dataset collection date
- US-centric: The left/center/right classification is based on US political spectrum and may not translate well to other countries' political systems
Risks
- Evolving Language: Political terminology and framing evolve over time; the model may become less accurate as discourse shifts
- Context Sensitivity: Short texts or ambiguous statements may be misclassified
- Confirmation Bias: Users may over-trust predictions that match their expectations; predictions should not be relied on in isolation
- Misuse Potential: Could be misused to target individuals based on perceived political views
Recommendations
- Always use human review alongside model predictions
- Consider the model's confidence scores when making decisions (see the sketch after this list)
- Be aware that political classification is inherently subjective
- Update or retrain periodically to account for shifting political discourse
- Do not use for high-stakes decisions without additional verification
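For the confidence-score recommendation above, here is a minimal sketch of gating predictions on the model's top softmax score and routing low-confidence cases to human review. The 0.80 threshold is an illustrative value, not a calibrated one; tune it on held-out data for your application.

```python
import torch
from transformers import AutoTokenizer, RobertaForSequenceClassification

CONFIDENCE_THRESHOLD = 0.80  # illustrative cutoff, not calibrated

model = RobertaForSequenceClassification.from_pretrained("alxdev/echocheck-political-stance")
tokenizer = AutoTokenizer.from_pretrained("alxdev/echocheck-political-stance")
model.eval()

labels = {0: "center", 1: "left", 2: "right"}

def classify_with_abstention(text: str) -> str:
    """Return a stance label, or flag the text for human review
    when the top probability falls below the threshold."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    confidence, prediction = probs.max(dim=-1)
    if confidence.item() < CONFIDENCE_THRESHOLD:
        return f"needs human review (top score {confidence.item():.2%})"
    return labels[prediction.item()]

print(classify_with_abstention("The city council approved the budget on Tuesday."))
```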
How to Get Started with the Model
Quick Start
```python
from transformers import RobertaForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = RobertaForSequenceClassification.from_pretrained("alxdev/echocheck-political-stance")
tokenizer = AutoTokenizer.from_pretrained("alxdev/echocheck-political-stance")

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Classify text
text = "The government should increase social spending to support working families."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512, padding=True)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)

prediction = probs.argmax().item()
labels = {0: "center", 1: "left", 2: "right"}
print(f"Prediction: {labels[prediction]}")
print(f"Confidence: {probs[0][prediction]:.2%}")
```
Using Pipeline
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="alxdev/echocheck-political-stance",
    device=0,  # use GPU, or -1 for CPU
)

result = classifier("Lower taxes will stimulate economic growth and job creation.")
print(result)
# [{'label': 'LABEL_2', 'score': 0.85}]  # LABEL_2 = right
```
Label Mapping
| Label ID | Label Name | Description |
|---|---|---|
| 0 | center | Moderate/neutral political stance |
| 1 | left | Progressive political stance |
| 2 | right | Conservative political stance |
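If the published checkpoint's config does not already carry this mapping (the pipeline example above prints raw LABEL_* names), it can be attached after loading. A small sketch, assuming the label IDs match the table above:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="alxdev/echocheck-political-stance")

# Attach the mapping from the table above in case the checkpoint
# config only carries generic LABEL_0/1/2 names.
classifier.model.config.id2label = {0: "center", 1: "left", 2: "right"}
classifier.model.config.label2id = {"center": 0, "left": 1, "right": 2}

result = classifier("New tariffs were announced on imported steel.", truncation=True)
print(result)  # e.g. [{'label': 'right', 'score': ...}]
```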
Training Details
Training Data
The model was trained on the BIGNEWSBLN dataset:
- Source: https://github.com/launchnlp/POLITICS
- Original Paper: POLITICS: Pretraining with Same-story Article Comparison for Ideology Prediction and Stance Detection
- Total Articles: ~2.33 million
- Split:
- Training: 1,865,241 articles (80%)
- Validation: 233,153 articles (10%)
- Test: 233,158 articles (10%)
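For reference, a split of this shape (80/10/10, stratified by label) can be produced as in the sketch below; the placeholder data and random seed are illustrative, and the actual split procedure used for BIGNEWSBLN may differ.

```python
from sklearn.model_selection import train_test_split

# Placeholder stand-ins for the real BIGNEWSBLN articles and labels.
articles = [f"article {i}" for i in range(1000)]
labels = [i % 3 for i in range(1000)]  # 0=center, 1=left, 2=right

# 80/10/10: carve off 20%, then split that remainder half-and-half.
train_x, rest_x, train_y, rest_y = train_test_split(
    articles, labels, test_size=0.20, stratify=labels, random_state=42)
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.50, stratify=rest_y, random_state=42)

print(len(train_x), len(val_x), len(test_x))  # 800 100 100
```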
Training Procedure
Preprocessing
- Tokenization using RoBERTa tokenizer
- Maximum sequence length: 512 tokens
- Padding to max length
- Truncation of longer sequences
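As a concrete illustration, the sketch below applies these exact settings to a small batch; the article strings are placeholders.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

articles = [
    "First news article text ...",
    "Second, much longer news article text ...",
]

# Matches the settings above: pad every example to 512 tokens
# and truncate anything longer.
batch = tokenizer(
    articles,
    max_length=512,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # torch.Size([2, 512])
```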
Training Hyperparameters
| Parameter | Value |
|---|---|
| Base Model | roberta-base |
| Optimizer | AdamW |
| Learning Rate | 2e-5 |
| Batch Size | 24 |
| Epochs | 3 |
| Warmup | 10% of total steps |
| Weight Decay | 0.01 |
| LR Schedule | Linear with warmup |
| Training Regime | FP32 |
| Loss Function | CrossEntropyLoss |
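A minimal sketch of how these hyperparameters map onto a standard PyTorch/Transformers setup; the authoritative training script lives in the repository, and details there may differ.

```python
import torch
from torch.optim import AdamW
from transformers import RobertaForSequenceClassification, get_linear_schedule_with_warmup

model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=3)

# Values from the table above.
epochs, batch_size = 3, 24
num_training_examples = 1_865_241
total_steps = (num_training_examples // batch_size) * epochs

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.10 * total_steps),  # 10% warmup
    num_training_steps=total_steps,
)

# Inside the training loop, the model applies CrossEntropyLoss itself
# when labels are passed:
#   outputs = model(input_ids=..., attention_mask=..., labels=...)
#   outputs.loss.backward()
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```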
Speeds, Sizes, Times
- Training Time: ~30-40 hours
- Hardware: NVIDIA RTX 4070 (12GB VRAM), 64GB DDR5 RAM
- Model Size: ~500MB
- Parameters: 124,647,939 total (all trainable)
Evaluation
Testing Data
- Dataset: BIGNEWSBLN
- Size: 233,158 articles
- Distribution: Balanced across all three classes (~77,000 each)
Metrics
- Accuracy: Overall correctness of predictions
- Precision: The fraction of texts predicted as a class that truly belong to it
- Recall: The fraction of texts in a class that are correctly identified
- F1-Score: Harmonic mean of precision and recall
- Confusion Matrix: Detailed breakdown of predictions vs. actual labels
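All of these can be computed from model predictions with scikit-learn; a sketch with dummy label arrays standing in for the real test-set outputs:

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Dummy stand-ins: integer labels (0=center, 1=left, 2=right).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 0, 1, 0]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2%}")
print(classification_report(y_true, y_pred, target_names=["center", "left", "right"]))
print(confusion_matrix(y_true, y_pred))
```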
Results
Overall Performance
| Metric | Score |
|---|---|
| Accuracy | 95.50% |
| Macro F1 | 95.49% |
| Weighted F1 | 95.49% |
Per-Class Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Center | 0.949 | 0.955 | 0.952 | 77,220 |
| Left | 0.953 | 0.964 | 0.959 | 77,951 |
| Right | 0.963 | 0.945 | 0.954 | 77,987 |
Confusion Matrix
| | Predicted Center | Predicted Left | Predicted Right |
|---|---|---|---|
| Actual Center | 73,756 | 1,890 | 1,574 |
| Actual Left | 1,543 | 75,164 | 1,244 |
| Actual Right | 2,426 | 1,826 | 73,735 |
Summary
The model achieves strong performance across all three political stance categories with balanced precision and recall. The slight confusion between "center" and "right" categories is expected given the nuanced nature of political language.
Technical Specifications
Model Architecture and Objective
- Architecture: RoBERTa-base with a linear classification head
- Hidden Size: 768
- Attention Heads: 12
- Hidden Layers: 12
- Vocabulary Size: 50,265
- Max Position Embeddings: 514
- Classification Head: Linear(768 → 3)
- Objective: Multi-class classification (CrossEntropyLoss)
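These figures can be verified against the published checkpoint's config; a quick sketch:

```python
from transformers import AutoConfig, RobertaForSequenceClassification

config = AutoConfig.from_pretrained("alxdev/echocheck-political-stance")
print(config.hidden_size)              # 768
print(config.num_attention_heads)      # 12
print(config.num_hidden_layers)        # 12
print(config.vocab_size)               # 50265
print(config.max_position_embeddings)  # 514

model = RobertaForSequenceClassification.from_pretrained("alxdev/echocheck-political-stance")
print(sum(p.numel() for p in model.parameters()))  # should match 124,647,939
```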
Compute Infrastructure
Hardware
- GPU: NVIDIA GeForce RTX 4070 (12GB VRAM)
- RAM: 64GB DDR5
- Storage: NVMe SSD
Software
- Framework: PyTorch 2.10+
- Transformers: 4.57+
- Python: 3.14+
- OS: Linux
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Citation
If you use this model in your research, please cite:
BibTeX:
```bibtex
@misc{morariu2026echocheck,
  author       = {Morariu, Alexandru-Gabriel},
  title        = {EchoCheck: Political Stance and Ideology Classification using NLP Techniques},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/alxdev/echocheck-political-stance}},
  note         = {Bachelor's Thesis, "Titu Maiorescu" University}
}
```
APA:
Morariu, A.-G. (2026). EchoCheck: Political stance and ideology classification using NLP techniques. HuggingFace. https://huggingface.co/alxdev/echocheck-political-stance
Glossary
- Political Stance: The ideological leaning of a text (left, center, or right)
- RoBERTa: Robustly Optimized BERT Pretraining Approach, a transformer-based language model
- Fine-tuning: The process of training a pre-trained model on a specific downstream task
- Tokenization: Converting text into numerical tokens that the model can process
More Information
- Training Repository: https://github.com/Alex-GHP/echocheck
- Dataset: BIGNEWSBLN (https://github.com/launchnlp/POLITICS)
- Base Model: RoBERTa-base on HuggingFace
Model Card Authors
Alexandru-Gabriel Morariu
Model Card Contact
- Email: alex.morariu.dev@gmail.com
- GitHub: @Alex-GHP