Please check listed GPQA scores, they don't match what's listed on nvidia/NVIDIA-Nemotron-Nano-9B-v2 ...

#1
by SkyMind - opened

...or verify that those listed here are correct (& provide any clarification).

(https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2 says 64.0%, not 83.84%.)

Thanks!

Red Hat AI org

Hi @SkyMind. Our results are computed with a different harness than NVIDIA's. We did our best to reproduce their evaluation conditions within that harness, but the results still deviate. That said, we did post the wrong metric for GPQA. We have corrected that, and the score is now around 55%. Thanks for pointing out the discrepancy.

alexmarques changed discussion status to closed
