# LaBSE Persuasion Detection (Domain-Agnostic)

**Model ID:** `jjprietotorres/labse-persuasion-detection-agnostic`
**Part of:** the *Time to Trust AI* initiative
## 🧠 Model Overview
This is a SetFit model that can be used for text classification.
It implements a few-shot learning pipeline designed for persuasion and rhetorical intent detection.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer (LaBSE) with contrastive learning to improve semantic representation.
- Training a classification head on features from the fine-tuned Sentence Transformer to classify persuasive content (both components can be inspected after loading, as sketched below).
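For orientation, both pieces can be inspected after loading the published checkpoint. A minimal sketch (`model_body` and `model_head` are standard SetFit attributes):

```python
from setfit import SetFitModel

# Load the published checkpoint; a SetFit model bundles the two pieces above.
model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")

print(model.model_body)  # the contrastively fine-tuned LaBSE sentence transformer
print(model.model_head)  # the classification head fitted on its embeddings
```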
The model identifies five rhetorical or persuasion-related cohorts:
- Supply Scarcity – framing scarcity or urgency
- Authority Endorsement – appealing to authority or credibility
- Misrepresentation – misleading or manipulative language
- Logical Appeal – reasoning-based persuasion
- Neutral – factual or non-persuasive statements
## ⚙️ Intended Use
This model detects persuasive or manipulative rhetoric across content types and domains.
It is domain-agnostic and suitable for use in:
- Chatbot or conversational AI safety layers
- AI content moderation pipelines
- Adversarial persuasion or prompt-injection detection
- Rhetorical bias auditing in generated or user content
The goal is to enhance trust, transparency, and explainability within AI systems.
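As a concrete example of the safety-layer use case, a minimal, hypothetical gate could flag any non-Neutral cohort scoring above a confidence threshold. The helper name and threshold are illustrative, not part of the model; the label order follows the Usage section below:

```python
from setfit import SetFitModel

LABELS = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]
model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")

def flag_persuasion(text, threshold=0.5):
    """Illustrative helper: return non-Neutral cohorts scoring above `threshold`."""
    probas = model.predict_proba([text])[0]  # one probability per cohort
    return [
        {"label": label, "score": float(score)}
        for label, score in zip(LABELS, probas)
        if label != "Neutral" and float(score) >= threshold
    ]

print(flag_persuasion("Hurry, offer ends tonight!"))
```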
## 🚀 Usage
To use this model for inference, first install the SetFit library:
```bash
python -m pip install setfit
```
You can then run inference as follows:
```python
from setfit import SetFitModel

# Download the checkpoint from the Hub
model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")

# Cohort names in the order used by the classification head
labels = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]

# Run inference
texts = [
    "Experts agree this is the only viable solution.",
    "Hurry, offer ends tonight!",
    "Data clearly supports our argument.",
]
probas = model.predict_proba(texts)  # shape: (num_texts, num_labels)

# For each cohort, keep its highest-scoring text and that probability
pooled = probas.max(axis=0)
pooled_probas = pooled.values.tolist()
idxs = pooled.indices.tolist()

result = zip(labels, pooled_probas, [texts[i] for i in idxs])
report = [
    {"text": text, "label": label, "score": score}
    for label, score, text in result
    if label != "Neutral"
]
print(report)
```
Expected output:
```python
[{'text': 'Hurry, offer ends tonight!', 'label': 'Supply Scarcity', 'score': 0.5646279805087063},
 {'text': 'Data clearly supports our argument.', 'label': 'Authority Endorsement', 'score': 0.25684773715338116},
 {'text': 'Experts agree this is the only viable solution.', 'label': 'Misrepresentation', 'score': 0.5366529547271727},
 {'text': 'Data clearly supports our argument.', 'label': 'Logical Appeal', 'score': 0.23385324245461217}]
```
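The pooling above reports, for each cohort, the single highest-scoring input. If you instead want the most likely cohort per text, the same probabilities can be read row-wise (a short sketch reusing `probas`, `labels`, and `texts` from the snippet above):

```python
# Highest-probability cohort for each input text
per_text = probas.max(axis=1)
for text, i, score in zip(texts, per_text.indices.tolist(), per_text.values.tolist()):
    print(f"{text!r} -> {labels[i]} ({score:.2f})")
```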
## 🧩 Training Setup
- Base model: LaBSE (Language-Agnostic BERT Sentence Embedding)
- Architecture: Sentence Transformer fine-tuned with contrastive learning
- Classifier: SetFit head trained on limited labeled samples
- Frameworks: SetFit, Sentence Transformers, PyTorch
- Training data: custom persuasion dataset covering the five rhetorical cohorts above
- Few-shot strategy: data-efficient fine-tuning from a handful of labeled examples per cohort (see the sketch below)
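A minimal, hypothetical reproduction of this recipe with the SetFit v1.x API might look as follows. The example texts, label ids, and hyperparameters are illustrative stand-ins; the actual custom persuasion dataset is not distributed with the model:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Tiny illustrative few-shot set; label ids follow the cohort order above
# (0 = Supply Scarcity, 1 = Authority Endorsement, 3 = Neutral).
train_dataset = Dataset.from_dict({
    "text": [
        "Only three seats left at this price!",
        "Leading economists endorse this plan.",
        "The report was published in March.",
    ],
    "label": [0, 1, 3],
})

# Start from the LaBSE sentence-transformer checkpoint
model = SetFitModel.from_pretrained("sentence-transformers/LaBSE")

args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # contrastive fine-tuning of the body, then head fitting
```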
## 📊 Evaluation (example)
| Metric | Score |
|---|---|
| Accuracy | 0.86 |
| F1 Score | 0.84 |
| Precision | 0.83 |
| Recall | 0.85 |
*(Replace with your actual metrics once validated.)*
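To recompute such metrics on your own held-out split, a minimal sketch with scikit-learn could look like this (the evaluation texts and gold labels below are placeholders, not the actual validation data):

```python
from setfit import SetFitModel
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")
labels = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]

# Placeholder split; substitute a real labeled evaluation set
eval_texts = ["Stocks are limited, act now!", "The train departs at noon."]
gold = ["Supply Scarcity", "Neutral"]

pred_ids = model.predict_proba(eval_texts).max(axis=1).indices.tolist()
preds = [labels[i] for i in pred_ids]

acc = accuracy_score(gold, preds)
prec, rec, f1, _ = precision_recall_fscore_support(gold, preds, average="macro", zero_division=0)
print({"accuracy": acc, "precision": prec, "recall": rec, "f1": f1})
```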
## ⚖️ Ethical Considerations
- This model identifies persuasive tactics, not factual accuracy or moral intent.
- Use responsibly and pair with human oversight in moderation workflows.
- Model outputs may reflect training data biases.
- Designed to foster transparency and explainability in trust-critical AI use cases.
## 🌐 Integration within Time to Trust AI
This model is a trust-building component of the Time to Trust AI framework.
By publishing it openly, we aim to make trust operational – demonstrating transparency in model design, intent, and interpretability.
## 📚 Citation
If you use this model or methodology, please cite both the SetFit paper and this model:
```bibtex
@article{tunstall2022efficient,
  doi       = {10.48550/arXiv.2209.11055},
  url       = {https://arxiv.org/abs/2209.11055},
  author    = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  title     = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

@misc{prietotorres2025labsepersuasion,
  title        = {LaBSE Persuasion Detection (Domain-Agnostic)},
  author       = {Prieto-Torres, J.J.},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/jjprietotorres/labse-persuasion-detection-agnostic}},
}
```
## 🪪 License
Apache License 2.0 – you are free to use, modify, and distribute with attribution.
## 📬 Contact
- Author: J.J. Prieto
- Hugging Face: [jjprietotorres](https://huggingface.co/jjprietotorres)