all-minilm-l6-v2-civic
This is a sentence-transformers model fine-tuned with TSDAE (Transformer-based Sequential Denoising Auto-Encoder) on government and civic meeting transcripts.
Model Details
- Base Model: all-minilm-l6-v2
- Training Method: TSDAE (unsupervised domain adaptation)
- Original Source: ar9av/all-minilm-l6-v2-civic
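TSDAE trains the encoder by corrupting each input sentence (typically by deleting tokens) and asking a decoder to reconstruct the original from the sentence embedding alone. The sketch below illustrates that idea and is not the author's training script. `delete_noise` is a hypothetical pure-Python stand-in for the deletion noise. `train_tsdae` follows the TSDAE recipe documented by sentence-transformers. The base checkpoint name, batch size, learning rate, and output directory are assumptions, and `DenoisingAutoEncoderDataset` additionally needs `nltk` installed.

```python
import random


def delete_noise(tokens, del_ratio=0.6, rng=None):
    """Randomly drop roughly `del_ratio` of the tokens, keeping at least one.

    Hypothetical illustration of TSDAE-style deletion noise; the library
    applies its own noise function inside DenoisingAutoEncoderDataset.
    """
    rng = rng or random.Random()
    if not tokens:
        return []
    kept = [tok for tok in tokens if rng.random() > del_ratio]
    if not kept:
        # Never return an empty sentence: keep one random token.
        kept = [rng.choice(tokens)]
    return kept


def train_tsdae(sentences,
                base_model="sentence-transformers/all-MiniLM-L6-v2",  # assumed checkpoint
                output_dir="tsdae-civic-sketch"):                     # hypothetical path
    """Sketch of TSDAE domain adaptation; hyperparameters are assumptions."""
    # Imported lazily so the helper above can be used without these packages.
    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, datasets, losses

    model = SentenceTransformer(base_model)

    # Each item pairs a noised sentence with its clean original.
    train_dataset = datasets.DenoisingAutoEncoderDataset(sentences)
    train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True)

    # The decoder reconstructs the clean sentence from the sentence embedding.
    train_loss = losses.DenoisingAutoEncoderLoss(
        model, decoder_name_or_path=base_model, tie_encoder_decoder=True
    )

    model.fit(
        train_objectives=[(train_dataloader, train_loss)],
        epochs=1,
        weight_decay=0,
        scheduler="constantlr",
        optimizer_params={"lr": 3e-5},
        show_progress_bar=True,
    )
    model.save(output_dir)
    return model
```

Calling `delete_noise("the board of supervisors approved the budget".split())` returns a shortened token list in the original order. `train_tsdae` expects a list of plain-text sentences or chunks.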
Description
The model was fine-tuned with TSDAE on 10,000 civic and government meeting transcripts. The goal is to learn domain-specific representations while keeping the embeddings usable for general-purpose downstream tasks.
Key Features
- Learned domain-specific abbreviations (BOS = Board of Supervisors, etc.)
- Enhanced understanding of government/civic terminology
- General-purpose embeddings suitable for various downstream tasks
Usage
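from sentence_transformers import SentenceTransformer
# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer('TSG/all-minilm-l6-v2-civic')
# encode() takes a list of strings and returns one embedding vector per string
embeddings = model.encode(['Your text here'])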
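A common next step after encoding is to rank passages against a query by cosine similarity. The following is a minimal sketch, not part of the original model card. The helper names `cosine`, `rank_passages`, and `search` are hypothetical, and the similarity is written in plain Python so that it works on any sequence of numbers, including the arrays returned by `model.encode`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def search(query, passages, model_name="TSG/all-minilm-l6-v2-civic"):
    """Embed the query and passages with the model, then rank the passages."""
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    vectors = model.encode([query] + list(passages))
    ranked = rank_passages(vectors[0], vectors[1:])
    return [(passages[i], score) for i, score in ranked]
```

For example, `search("BOS budget vote", ["The Board of Supervisors approved the budget.", "The library extended its hours."])` returns both passages with their similarity scores, highest first.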
Training Data
- 10,000 government/civic meeting transcripts
- Various meeting types: City Council, Planning Commission, Board of Education, etc.
- Text cleaned and chunked for training
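The exact cleaning and chunking procedure is not documented here. As a rough illustration only, the sketch below shows one plausible way to prepare transcripts. The function names, the removal of bracketed annotations such as `[inaudible]`, and the 128-word chunk size are all assumptions.

```python
import re


def clean_text(text):
    """Drop bracketed annotations and collapse whitespace (assumed cleaning rules)."""
    text = re.sub(r"\[[^\]]*\]", " ", text)  # e.g. "[inaudible]", "[applause]"
    text = re.sub(r"\s+", " ", text)         # newlines and repeated spaces become one space
    return text.strip()


def chunk_words(text, max_words=128):
    """Split cleaned text into consecutive chunks of at most `max_words` words."""
    words = clean_text(text).split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]
```

Each returned chunk can then serve as one training sentence for an unsupervised objective such as TSDAE.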
Evaluation
Relative to the base all-minilm-l6-v2 model, improvement is reported in:
- Domain-specific semantic similarity
- Abbreviation understanding (BOS, CC, BOE, etc.)
- Clustering quality for civic domain
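One way to check the abbreviation claim yourself is to compare how close each abbreviation sits to its expansion under the base model and under the fine-tuned model. The sketch below is an assumption about how such a check could look, not the evaluation behind this card. The expansions for CC and BOE are inferred from the meeting types listed under Training Data, and the function names and base checkpoint name are hypothetical.

```python
import math

# BOS is taken from the model card; the CC and BOE expansions are assumed.
ABBREVIATION_PAIRS = [
    ("BOS", "Board of Supervisors"),
    ("CC", "City Council"),
    ("BOE", "Board of Education"),
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pair_similarity(encode, pairs):
    """Average cosine similarity between each abbreviation and its expansion.

    `encode` is any callable mapping a list of strings to a list of vectors,
    for example the `encode` method of a SentenceTransformer.
    """
    short_vecs = encode([short for short, _ in pairs])
    long_vecs = encode([full for _, full in pairs])
    sims = [cosine(a, b) for a, b in zip(short_vecs, long_vecs)]
    return sum(sims) / len(sims)


def compare_models(base_name="sentence-transformers/all-MiniLM-L6-v2",  # assumed checkpoint
                   tuned_name="TSG/all-minilm-l6-v2-civic",
                   pairs=ABBREVIATION_PAIRS):
    """Return the mean abbreviation/expansion similarity for each model name."""
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    results = {}
    for name in (base_name, tuned_name):
        model = SentenceTransformer(name)
        results[name] = mean_pair_similarity(model.encode, pairs)
    return results
```

A higher mean for the fine-tuned model would be consistent with the abbreviation claim. Three pairs are far too few to draw conclusions from, so a larger held-out list is advisable.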
Intended Use
This model is designed for general-purpose semantic similarity and embedding tasks, with enhanced understanding of government/civic domain language.
Original Model
This model was originally trained and hosted at: https://huggingface.co/ar9av/all-minilm-l6-v2-civic