Spaces: microsoft/paza-bench

Some languages are not supported by the models

by TrueTuner - opened Mar 21

TrueTuner

Mar 21

First of all, the benchmark is excellent work and very useful for the community.

But I have a question:

You calculated the WER scores for:

granite-speech
nvidia-nemo (parakeet)
kyutai

However, it is clearly stated that these models do not support African languages.
Have you fine-tuned each of these models on your datasets?
Or do these models actually perform well on African languages natively?

Thank you for clarifying this for us.

muchai-mercy

Microsoft org Apr 16

edited Apr 30

Thanks @TrueTuner. Yes, some of the models do not support African languages. The only models we fine-tuned are the Paza models; all three reported metrics for the other models come from the base models. One intention of this benchmark was to highlight the zero-shot trade-off between accuracy and efficiency across SOTA ASR models, as a useful reference for fine-tuning considerations. For example, Parakeet achieves the best RTFx on most languages, which makes it a strong efficiency baseline relative to the other models. That makes it a practical candidate for fine-tuning on unsupported African languages; it is not a claim of native language support.
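For readers unfamiliar with the two metrics named in this thread, here is a minimal sketch of how WER and RTFx are commonly computed. It is not the benchmark's evaluation code: the function names are hypothetical, the text is split on whitespace with no normalization, and RTFx is assumed to follow the usual definition of audio duration divided by processing time (higher is faster).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # 60 s of audio transcribed in 2 s of compute.
    print(rtfx(60.0, 2.0))
```

Under these definitions, a model can score a high RTFx while its zero-shot WER on an unsupported language stays poor, which is the trade-off the reply describes.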

muchai-mercy changed discussion status to closed Apr 16
