---
language:
- ru
- en
- zh
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
widget: []
license: mit
---

# SentenceTransformer

This is a [sentence-transformers](https://www.SBERT.net) model. It maps sentences and paragraphs to a 312-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Maximum Sequence Length:** 1024 tokens
- **Output Dimensionality:** 312 dimensions
- **Similarity Function:** Cosine Similarity

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 312, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mlsa-iai-msu-lab/sci-rus-tiny3.1")

# Run inference
sentences = [
    'The weather is lovely today.',
    "It's so sunny outside!",
    'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 312)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

### Metrics

RuSciBench few-shot

| model name                           |    ru |    en | total avg |
|:-------------------------------------|------:|------:|----------:|
| mlsa-iai-msu-lab/sci-rus-tiny        | 28.37 | 27.87 |     60.79 |
| mlsa-iai-msu-lab/sci-rus-small-cite  | 38.4  | 38.68 |     67.34 |
| mlsa-iai-msu-lab/sci-rus-tiny3-cite  | 39.36 | 39.5  |     67.58 |
| **mlsa-iai-msu-lab/sci-rus-tiny3.5** | 38.93 | 39.27 |     69.17 |
| mlsa-iai-msu-lab/sci-rus-tiny3.1     | 39    | 39.92 |     69.36 |

RuSciBench

| model name                           |    ru |    en | total avg |
|:-------------------------------------|------:|------:|----------:|
| mlsa-iai-msu-lab/sci-rus-tiny        | 35.77 | 35.21 |     64.48 |
| mlsa-iai-msu-lab/sci-rus-tiny3-cite  | 44.55 | 44.79 |     70.2  |
| mlsa-iai-msu-lab/sci-rus-small-cite  | 44.53 | 44.81 |     70.4  |
| mlsa-iai-msu-lab/sci-rus-tiny3.1     | 44.52 | 45.15 |     72.05 |
| **mlsa-iai-msu-lab/sci-rus-tiny3.5** | 44.48 | 45.4  |     72.09 |
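
The `Normalize()` layer listed under Full Model Architecture means the embeddings have unit length, so cosine similarity reduces to a plain dot product. A minimal sketch of how that could be used to rank texts for semantic search, using small hand-written stand-in vectors instead of real model output. The vectors and the helper names `normalize`, `dot`, `cosine`, and `rank` are illustrative assumptions, not part of the sentence-transformers API:

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 length (what a normalization layer does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank(query, corpus):
    """Return corpus indices sorted by descending dot product with the query."""
    scores = [dot(query, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Stand-in 3-dimensional vectors (hypothetical data);
# real embeddings from this model have 312 dimensions.
raw_corpus = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 2.0]]
raw_query = [1.0, 1.0, 0.0]

corpus = [normalize(v) for v in raw_corpus]
query = normalize(raw_query)

# After normalization, the dot product equals the cosine similarity.
order = rank(query, corpus)
print(order)
```

With real output from `model.encode`, calling `model.similarity(query_embedding, corpus_embeddings)` as in the usage example gives the scores to sort by.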