LaBSE Persuasion Detection (Domain-Agnostic)

Model ID: jjprietotorres/labse-persuasion-detection-agnostic
Part of: Time to Trust AI initiative


🧠 Model Overview

This is a SetFit model for text classification.
It implements a few-shot learning pipeline designed to detect persuasion and rhetorical intent.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer (LaBSE) with contrastive learning to improve semantic representations.
  2. Training a classification head on features from the fine-tuned Sentence Transformer to classify persuasive content.
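
Conceptually, the two steps amount to something like the following. This is a minimal sketch of the SetFit recipe, not the actual training script for this model; the example sentences, pair construction, and hyperparameters are illustrative:

from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses
from sklearn.linear_model import LogisticRegression

# Illustrative few-shot data: two examples for three of the five cohorts.
texts = [
    "Hurry, offer ends tonight!", "Only 3 items left in stock.",            # Supply Scarcity
    "Experts agree this is the only viable solution.",
    "Leading scientists endorse this approach.",                             # Authority Endorsement
    "The meeting starts at 10 am.", "The report was published in March.",    # Neutral
]
labels = [0, 0, 1, 1, 3, 3]

# Step 1: contrastive-style fine-tuning of the LaBSE body on sentence pairs,
# where same-cohort pairs get target similarity 1.0 and mixed pairs get 0.0.
body = SentenceTransformer("sentence-transformers/LaBSE")
pairs = [
    InputExample(texts=[texts[i], texts[j]], label=float(labels[i] == labels[j]))
    for i in range(len(texts))
    for j in range(i + 1, len(texts))
]
loader = DataLoader(pairs, shuffle=True, batch_size=8)
body.fit(train_objectives=[(loader, losses.CosineSimilarityLoss(body))], epochs=1)

# Step 2: train a lightweight classification head on the fine-tuned embeddings.
head = LogisticRegression(max_iter=1000).fit(body.encode(texts), labels)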

The model identifies five rhetorical or persuasion-related cohorts:

  • Supply Scarcity – framing scarcity or urgency
  • Authority Endorsement – appealing to authority or credibility
  • Misrepresentation – containing misleading or manipulative language
  • Logical Appeal – reasoning-based persuasion
  • Neutral – factual or non-persuasive statements

βš™οΈ Intended Use

This model detects persuasive or manipulative rhetoric across content types and domains.
It is domain-agnostic, suitable for use in:

  • Chatbot or conversational AI safety layers
  • AI content moderation pipelines
  • Adversarial persuasion or prompt-injection detection
  • Rhetorical bias auditing in generated or user content

The goal is to enhance trust, transparency, and explainability within AI systems.
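
For example, a conversational safety layer might flag a message whenever any non-neutral cohort exceeds a probability threshold. This is a minimal sketch: the flag_persuasion helper and the 0.5 threshold are illustrative, and the label order is assumed to match the Usage section below:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")

# Label order assumed to match the Usage section below.
COHORTS = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]

def flag_persuasion(message, threshold=0.5):
    """Return the non-neutral cohorts whose probability exceeds the threshold."""
    probas = model.predict_proba([message])[0]  # probabilities for a single message
    return [
        {"label": label, "score": float(score)}
        for label, score in zip(COHORTS, probas)
        if label != "Neutral" and float(score) >= threshold
    ]

print(flag_persuasion("Only 2 seats left - book now before the price goes up!"))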


🚀 Usage

To use this model for inference, first install the SetFit library:

python -m pip install setfit

You can then run inference as follows:

from setfit import SetFitModel

# Download from the Hub and run inference
model = SetFitModel.from_pretrained("jjprietotorres/labse-persuasion-detection-agnostic")

labels = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]

# Run inference
texts = [
    "Experts agree this is the only viable solution.",
    "Hurry, offer ends tonight!",
    "Data clearly supports our argument."
]
# predict_proba returns one probability row per text, shape (num_texts, num_labels)
probas = model.predict_proba(texts)

# Pool over texts: for each label, keep the highest score and the text that produced it
pooled_probas = probas.max(axis=0).values.tolist()
idxs = probas.max(axis=0).indices.tolist()

# Pair each non-neutral label with its best-matching text and score
result = zip(labels, pooled_probas, [texts[i] for i in idxs])
report = [{"text": text, "label": label, "score": score} for label, score, text in result if label != "Neutral"]

print(report)

Expected output:

[{'text': 'Hurry, offer ends tonight!', 'label': 'Supply Scarcity', 'score': 0.5646279805087063}, {'text': 'Data clearly supports our argument.', 'label': 'Authority Endorsement', 'score': 0.25684773715338116}, {'text': 'Experts agree this is the only viable solution.', 'label': 'Misrepresentation', 'score': 0.5366529547271727}, {'text': 'Data clearly supports our argument.', 'label': 'Logical Appeal', 'score': 0.23385324245461217}]
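
The snippet above pools over texts, reporting the best-matching text for each label. For ordinary per-text classification, take the highest-scoring cohort per row instead. A short sketch reusing model, texts, and labels from above, and assuming predict_proba returns a tensor as in the example:

# probas has shape (num_texts, num_labels); pick the best cohort for each text
probas = model.predict_proba(texts)
best_idxs = probas.max(axis=1).indices.tolist()
for text, idx in zip(texts, best_idxs):
    print(f"{labels[idx]:<22} <- {text}")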

🧩 Training Setup

  • Base model: LaBSE (Language-Agnostic BERT Sentence Embedding)
  • Architecture: Sentence Transformer fine-tuned with contrastive learning
  • Classifier: SetFit head trained on limited labeled samples
  • Frameworks: SetFit, Sentence Transformers, PyTorch
  • Training data: custom persuasion dataset covering rhetorical cohorts
  • Few-shot strategy: enables data-efficient fine-tuning
  • Model size: ~0.5B parameters (LaBSE base), stored as FP32 safetensors
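
In practice, the SetFit library wraps both training stages behind a Trainer. The following is a minimal sketch of that setup; the tiny dataset and hyperparameters are placeholders, not the actual training configuration used for this model:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

cohorts = ["Supply Scarcity", "Authority Endorsement", "Misrepresentation", "Neutral", "Logical Appeal"]

# Tiny illustrative few-shot dataset (two examples for three of the five cohorts);
# the real custom persuasion dataset is not reproduced here.
train_ds = Dataset.from_dict({
    "text": [
        "Hurry, offer ends tonight!", "Only 3 items left in stock.",
        "Experts agree this is the only viable solution.", "Leading scientists endorse this approach.",
        "The meeting starts at 10 am.", "The report was published in March.",
    ],
    "label": [0, 0, 1, 1, 3, 3],  # indices into the cohort list above
})

model = SetFitModel.from_pretrained("sentence-transformers/LaBSE", labels=cohorts)

args = TrainingArguments(
    batch_size=16,
    num_epochs=1,  # epochs of contrastive fine-tuning over the generated sentence pairs
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()  # stage 1: body fine-tuning, stage 2: classification head fitting

print(model.predict(["Data clearly supports our argument."]))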

📊 Evaluation (example)

Metric      Score
Accuracy    0.86
F1 Score    0.84
Precision   0.83
Recall      0.85

(replace with your actual metrics once validated)
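
The exact evaluation protocol is not published with this card. A minimal sketch of how such metrics could be computed with scikit-learn, reusing model and labels from the Usage section; the held-out texts, gold labels, and macro averaging below are assumptions:

from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical held-out texts with gold cohort labels (strings).
eval_texts = ["Act now before it's too late!", "The train departs at noon."]
gold = ["Supply Scarcity", "Neutral"]

# Map the argmax index of each probability row back to a cohort name.
probas = model.predict_proba(eval_texts)
pred_idxs = probas.max(axis=1).indices.tolist()
preds = [labels[i] for i in pred_idxs]

print("accuracy        :", accuracy_score(gold, preds))
print("macro F1        :", f1_score(gold, preds, average="macro", zero_division=0))
print("macro precision :", precision_score(gold, preds, average="macro", zero_division=0))
print("macro recall    :", recall_score(gold, preds, average="macro", zero_division=0))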


βš–οΈ Ethical Considerations

  • This model identifies persuasive tactics, not factual accuracy or moral intent.
  • Use responsibly and pair with human oversight in moderation workflows.
  • Model outputs may reflect training data biases.
  • Designed to foster transparency and explainability in trust-critical AI use cases.

πŸ” Integration within Time to Trust AI

This model is a trust-building component of the Time to Trust AI framework.
Publishing it openly is intended to make trust operational by demonstrating transparency in model design, intent, and interpretability.


📚 Citation

If you use this model or methodology, please cite both the SetFit paper and this model:

@article{tunstall2022setfit,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

@misc{prietotorres2025labsepersuasion,
  title={LaBSE Persuasion Detection (Domain-Agnostic)},
  author={J.J. Prieto-Torres},
  year={2025},
  howpublished={\url{https://huggingface.co/jjprietotorres/labse-persuasion-detection-agnostic}},
}

🪪 License

Apache License 2.0 – you are free to use, modify, and distribute with attribution.


💬 Contact

Author: J.J. Prieto
Hugging Face: jjprietotorres
