Alibaba launches maths-specific AI models said to outperform LLMs from OpenAI, Google
- The new Qwen2-Math large language models are expected to help solve complex maths problems
Ann Cao in Shanghai
Alibaba Group Holding is aiming to raise the bar in artificial intelligence (AI) development by launching a group of maths-specific large language models (LLMs) called Qwen2-Math, which the e-commerce giant claims can outperform OpenAI’s GPT-4o in that field.
“Over the past year, we have dedicated significant efforts to researching and enhancing the reasoning capabilities of large language models, with a particular focus on their ability to solve arithmetic and mathematical problems,” the Qwen team, part of Alibaba’s cloud computing unit, said in a post published on developer platform GitHub on Thursday. Alibaba owns the South China Morning Post.
The latest LLMs – the technology underpinning generative AI services like ChatGPT – were built on the Qwen2 LLMs released by Alibaba in June and cover three models that differ in their scale of parameters – a machine-learning term for variables present in an AI system during training, which help establish how data prompts yield the desired output.
The model with the largest parameter count, Qwen2-Math-72B-Instruct, outperformed proprietary US-developed LLMs in maths benchmarks, according to the Qwen team’s post. Those included GPT-4o, Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro and Meta Platforms’ Llama-3.1-405B.
“We hope that Qwen2-Math can contribute to the community for solving complex mathematical problems,” the post said.
The Qwen2-Math AI models were tested on both English and Chinese maths benchmarks, according to the post. These included GSM8K, a data set of 8,500 high-quality linguistically diverse grade school maths problems; OlympiadBench, a high-level bilingual multimodal scientific benchmark; and the gaokao, the mainland’s daunting university entrance examination.