Is this model fine-tuned with MS MARCO or mMARCO?

by rnyak - opened Aug 13, 2024

Aug 13, 2024 (edited Aug 13, 2024)

Hello, thank you for releasing this multilingual reranker model under the Apache 2.0 license. I'd like to ask whether you used any datasets that are not licensed for commercial use (e.g. MS MARCO, mMARCO) for fine-tuning or training this model. In the paper (https://arxiv.org/pdf/2407.19669), section B.2 states that MS MARCO and mMARCO-zh were used. These datasets are for research purposes only.

Could you please clarify?

Thanks.

thenlper

Alibaba-NLP org Aug 13, 2024

Thank you for your inquiry regarding the multilingual reranker model. We appreciate your interest in our work.

To clarify, the model does leverage the MS MARCO and mMARCO-zh datasets, which are indeed intended for research purposes only. We acknowledge the restrictions associated with these datasets and ensure that all usage complies with the terms provided by the dataset creators.

The findings presented in the paper reflect our commitment to using high-quality, publicly available data while adhering to the specified licensing agreements.

Best regards!

rnyak

Aug 13, 2024

Thanks for your quick response!
