Microsoft's Phi 3 Medium 128K Instruct, Meta's Llama 3 70B, Google's Gemma 2 and Mistral AI's Mistral 7B Instruct were able to output benchmark test sets verbatim. Hunt's data even provided the ...