rvo committed (verified)
Commit 6ef5dfe · Parent(s): 6614f0d

Upload README.md

Files changed (1)
  1. README.md +11 -14
README.md CHANGED
@@ -47,7 +47,7 @@ A technical report detailing our proposed `LEAF` training procedure is [availabl
 
 * **State-of-the-Art Performance**: `mdbr-leaf-mt` achieves new state-of-the-art results for compact embedding models, **ranking #1** on the [public MTEB v2 (Eng) benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for models with ≤30M parameters.
 * **Flexible Architecture Support**: `mdbr-leaf-mt` supports asymmetric retrieval architectures enabling even greater retrieval results. [See below](#asymmetric-retrieval-setup) for more information.
-* **MRL and Quantization Support**: embedding vectors generated by `mdbr-leaf-mt` compress well when truncated (MRL) and can be stored using more efficient types like `int8` and `binary`. [See below](#mrl) for more information.
+* **MRL and Quantization Support**: embedding vectors generated by `mdbr-leaf-mt` compress well when truncated (MRL) and can be stored using more efficient types like `int8` and `binary`. [See below](#mrl-truncation) for more information.
 
 ## Benchmark Comparison
 
@@ -114,8 +114,11 @@ for i, query in enumerate(queries):
 
 See [here](https://huggingface.co/MongoDB/mdbr-leaf-mt/blob/main/transformers_example_mt.ipynb).
 
 ## Asymmetric Retrieval Setup
 
+> [!Note]
+> **Note**: a version of this asymmetric setup, conveniently packaged into a single model, is [available here](https://huggingface.co/MongoDB/mdbr-leaf-mt-asym).
+
 `mdbr-leaf-mt` is *aligned* to [`mxbai-embed-large-v1`](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1), the model it has been distilled from, making the asymmetric system below possible:
 
 ```python
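For context: this hunk's setup encodes queries with the compact `mdbr-leaf-mt` model while documents are encoded with the larger `mxbai-embed-large-v1` teacher it was distilled from. The README's own example is cut off in this diff, so here is a minimal sketch of the pattern (example texts are placeholders, not from the README):

```python
from sentence_transformers import SentenceTransformer

# Compact student model: encodes queries at low latency.
query_model = SentenceTransformer("MongoDB/mdbr-leaf-mt")

# Larger teacher model: encodes documents, typically offline at index time.
doc_model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

queries = ["What is machine learning?"]
documents = ["Machine learning is a field of AI that enables systems to learn from data."]

# The student is aligned to the teacher's embedding space, so
# cross-model query/document similarities remain meaningful.
query_embeds = query_model.encode(queries, prompt_name="query")
doc_embeds = doc_model.encode(documents)

similarities = query_model.similarity(query_embeds, doc_embeds)
print(similarities)
```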
@@ -136,25 +139,19 @@ Retrieval results from asymmetric mode are usually superior to the [standard mod
 
 Embeddings have been trained via [MRL](https://arxiv.org/abs/2205.13147) and can be truncated for more efficient storage:
 ```python
-from torch.nn import functional as F
-
-query_embeds = model.encode(queries, prompt_name="query", convert_to_tensor=True)
-doc_embeds = model.encode(documents, convert_to_tensor=True)
-
-# Truncate and normalize according to MRL
-query_embeds = F.normalize(query_embeds[:, :256], dim=-1)
-doc_embeds = F.normalize(doc_embeds[:, :256], dim=-1)
+query_embeds = model.encode(queries, prompt_name="query", truncate_dim=256)
+doc_embeds = model.encode(documents, truncate_dim=256)
 
 similarities = model.similarity(query_embeds, doc_embeds)
 
 print('After MRL:')
 print(f"* Embeddings dimension: {query_embeds.shape[1]}")
-print(f"* Similarities:\n\t{similarities}")
+print(f"* Similarities: \n\t{similarities}")
 
 # After MRL:
 # * Embeddings dimension: 256
 # * Similarities:
-# tensor([[0.9164, 0.7219],
+# tensor([[0.9164, 0.7219],
 #         [0.6682, 0.8393]], device='cuda:0')
 ```
 
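The change in this hunk replaces the manual truncate-and-renormalize steps with the `truncate_dim` argument of `encode`, which performs the truncation internally. For comparison, a hypothetical helper reproducing the old manual recipe (note the explicit re-normalization; with `truncate_dim` alone, scale differences are instead absorbed by cosine-style scoring such as `model.similarity`):

```python
import torch
from torch.nn import functional as F

def encode_truncated(model, texts, dim: int = 256, **kwargs) -> torch.Tensor:
    """Hypothetical helper mirroring the README's previous MRL recipe:
    encode at full width, keep the first `dim` coordinates, re-normalize."""
    full = model.encode(texts, convert_to_tensor=True, **kwargs)
    return F.normalize(full[:, :dim], dim=-1)
```

Because MRL training front-loads information into the leading coordinates, truncating to 256 dimensions loses little retrieval quality while cutting storage per vector.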
@@ -180,7 +177,7 @@ similarities = query_embeds.astype(int) @ doc_embeds.astype(int).T
 
 print('After quantization:')
 print(f"* Embeddings type: {query_embeds.dtype}")
-print(f"* Similarities:\n{similarities}")
+print(f"* Similarities: \n{similarities}")
 
 # After quantization:
 # * Embeddings type: int8
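For context on this last hunk: in the README's quantization example (largely outside this diff), embeddings are stored as `int8` and scored with an integer dot product; the `.astype(int)` upcast prevents overflow when int8 products are accumulated. A minimal sketch of that flow using the `quantize_embeddings` utility from Sentence Transformers, with random arrays standing in for real embeddings:

```python
import numpy as np
from sentence_transformers.quantization import quantize_embeddings

# Random stand-ins for float32 outputs of model.encode(...).
rng = np.random.default_rng(0)
query_embeds_f32 = rng.standard_normal((2, 256)).astype(np.float32)
doc_embeds_f32 = rng.standard_normal((10, 256)).astype(np.float32)

# int8 quantization maps each dimension's calibration range onto [-128, 127];
# here the document embeddings double as the calibration set.
query_embeds = quantize_embeddings(query_embeds_f32, precision="int8",
                                   calibration_embeddings=doc_embeds_f32)
doc_embeds = quantize_embeddings(doc_embeds_f32, precision="int8",
                                 calibration_embeddings=doc_embeds_f32)

# Upcast to int64 before the matmul so int8 products cannot overflow.
similarities = query_embeds.astype(int) @ doc_embeds.astype(int).T
print(f"* Embeddings type: {query_embeds.dtype}")  # int8
```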
 