A technical report detailing our proposed `LEAF` training procedure is available.

* **State-of-the-Art Performance**: `mdbr-leaf-mt` achieves new state-of-the-art results for compact embedding models, **ranking #1** on the [public MTEB v2 (Eng) benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for models with ≤30M parameters.
* **Flexible Architecture Support**: `mdbr-leaf-mt` supports asymmetric retrieval architectures, enabling even stronger retrieval results. [See below](#asymmetric-retrieval-setup) for more information.
* **MRL and Quantization Support**: embedding vectors generated by `mdbr-leaf-mt` compress well when truncated (MRL) and can be stored using more efficient types such as `int8` and `binary`. [See below](#mrl-truncation) for more information.

## Benchmark Comparison
For a usage example based on the `transformers` library, see [this notebook](https://huggingface.co/MongoDB/mdbr-leaf-mt/blob/main/transformers_example_mt.ipynb).
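In outline, encoding with plain `transformers` typically follows the pattern sketched below. The mean-pooling strategy and the absence of a query prompt are assumptions made here for illustration; treat the linked notebook as the authoritative version:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MongoDB/mdbr-leaf-mt")
model = AutoModel.from_pretrained("MongoDB/mdbr-leaf-mt")

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    # Mean pooling over non-padding tokens (assumed pooling strategy)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return F.normalize(emb, dim=-1)
```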
## Asymmetric Retrieval Setup

> [!NOTE]
> A version of this asymmetric setup, conveniently packaged into a single model, is [available here](https://huggingface.co/MongoDB/mdbr-leaf-mt-asym).

`mdbr-leaf-mt` is *aligned* to [`mxbai-embed-large-v1`](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1), the model it was distilled from. This alignment makes an asymmetric setup possible, in which documents are encoded with the larger model while queries are encoded with the compact one.
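A minimal sketch of this pattern with `sentence-transformers` (the sample `queries` and `documents` below are illustrative, not from the original card):

```python
from sentence_transformers import SentenceTransformer

# Compact model for queries, larger aligned model for documents
query_model = SentenceTransformer("MongoDB/mdbr-leaf-mt")
doc_model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

queries = ["What is machine learning?"]
documents = ["Machine learning is a field of AI that lets systems learn from data."]

query_embeds = query_model.encode(queries, prompt_name="query")
doc_embeds = doc_model.encode(documents)

# The embedding spaces are aligned, so cross-model similarity is meaningful
similarities = query_model.similarity(query_embeds, doc_embeds)
print(similarities)
```

Because only the compact model runs at query time, query latency stays low, while documents can be indexed offline with the larger encoder.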
Retrieval results from asymmetric mode are usually superior to the standard mode described above.

## MRL Truncation

Embeddings have been trained via [MRL](https://arxiv.org/abs/2205.13147) and can be truncated for more efficient storage:
```python
query_embeds = model.encode(queries, prompt_name="query", truncate_dim=256)
doc_embeds = model.encode(documents, truncate_dim=256)

similarities = model.similarity(query_embeds, doc_embeds)

print('After MRL:')
print(f"* Embeddings dimension: {query_embeds.shape[1]}")
print(f"* Similarities: \n\t{similarities}")

# After MRL:
# * Embeddings dimension: 256
# * Similarities:
# tensor([[0.9164, 0.7219],
#         [0.6682, 0.8393]], device='cuda:0')
```
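Alternatively, you can encode at full dimension and truncate the vectors yourself; in that case, re-normalize after slicing, since truncated vectors are no longer unit norm:

```python
import torch.nn.functional as F

# Encode at full dimension, then truncate and normalize according to MRL
query_embeds = model.encode(queries, prompt_name="query", convert_to_tensor=True)
doc_embeds = model.encode(documents, convert_to_tensor=True)

query_embeds = F.normalize(query_embeds[:, :256], dim=-1)
doc_embeds = F.normalize(doc_embeds[:, :256], dim=-1)
```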

Similarly, the embeddings can be stored using more compact types such as `int8` or `binary`:

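One way to obtain `int8` vectors is the `quantize_embeddings` helper from `sentence-transformers`. The sketch below is an assumption for illustration: it reuses the document embeddings as the calibration set, whereas a larger held-out corpus is preferable in practice, and it assumes `query_embeds` / `doc_embeds` are the float32 arrays produced by `model.encode(...)` above:

```python
import numpy as np
from sentence_transformers.quantization import quantize_embeddings

# Derive int8 value ranges from a calibration set, then quantize
calibration = np.asarray(doc_embeds, dtype=np.float32)
query_embeds = quantize_embeddings(query_embeds, precision="int8", calibration_embeddings=calibration)
doc_embeds = quantize_embeddings(calibration, precision="int8", calibration_embeddings=calibration)
```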
```python
# Cast to int to avoid int8 overflow in the dot products
similarities = query_embeds.astype(int) @ doc_embeds.astype(int).T

print('After quantization:')
print(f"* Embeddings type: {query_embeds.dtype}")
print(f"* Similarities: \n{similarities}")

# After quantization:
# * Embeddings type: int8
```