Maths test stumps AI models: which number is bigger, 9.90 or 9.11?

Name: How does China’s AI stack up against ChatGPT?
Uploaded: 2024-07-18T01:00:14.000Z
Duration: 5 min 3 s

Large language models, the technology underpinning generative AI services like ChatGPT, struggle with basic maths knowledge


Generative artificial intelligence technology does not inherently possess mathematical capabilities. Illustration: Shutterstock

Wency Chen in Shanghai

Published: 9:00am, 18 Jul 2024

The wave of artificial intelligence (AI) chatbots allowed for public use in mainland China enables many users to create new content – including audio, code, images, simulations, videos and grammatically correct text – to entertain and help with everyday tasks.

That demand has led to the local development of more than 200 large language models (LLMs), the technology underpinning generative AI (GenAI) services like ChatGPT. LLMs are deep-learning AI algorithms that can recognise, summarise, translate, predict and generate content using very large data sets.

Despite the resources behind these chatbots, AI models were shown to struggle with basic maths this past weekend, in a test prompted by the Chinese reality show Singer 2024, a singing competition produced by Hunan Television.

Mainland artist Sun Nan received 13.8 per cent of online votes to edge out US singer Chanté Moore, who received 13.11 per cent of votes. Some local netizens poked fun at the ranking, claiming that the latter number was larger. "Ask AI," one commenter suggested. The answers users got from the chatbots were mixed.
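The disputed ranking comes down to comparing two decimals. The following is a minimal Python sketch using the standard `decimal` module. The helper `version_style_greater` is hypothetical: it illustrates one possible misreading, in which the digits after the point are treated as a whole number, and that reading is an assumption made here for illustration, not something the article attributes to the commenters or the chatbots.

```python
from decimal import Decimal


def decimal_greater(a: str, b: str) -> bool:
    """Return True if a > b when both strings are read as decimal numbers."""
    return Decimal(a) > Decimal(b)


def version_style_greater(a: str, b: str) -> bool:
    """Hypothetical misreading (an assumption for illustration): compare the
    parts on either side of the point as separate integers, the way software
    version numbers are compared."""
    a_parts = tuple(int(p) for p in a.split("."))
    b_parts = tuple(int(p) for p in b.split("."))
    return a_parts > b_parts


print(decimal_greater("13.8", "13.11"))        # True: 13.80 > 13.11
print(version_style_greater("13.8", "13.11"))  # False: 8 < 11 as integers
```

Under the decimal reading, 13.8 per cent is the larger share, which matches the show's ranking.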


Both Moonshot AI’s chatbot Kimi and Baichuan’s own Baixiaoying initially gave the wrong answer. They corrected themselves, as well as apologised, after the user who made the query adopted a so-called chain-of-thought approach – a reasoning method in which an AI application is guided step-by-step through a problem.
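The step-by-step idea can be illustrated on the comparison itself. The sketch below is not the prompt the user wrote (the article does not quote it) and not a description of how any chatbot reasons internally. It simply spells out, under the assumption of well-formed non-negative decimal strings, the intermediate steps a guided comparison might walk through. The function name `compare_step_by_step` is hypothetical.

```python
def compare_step_by_step(a: str, b: str) -> list[str]:
    """Return written-out steps for comparing two non-negative decimal strings.

    Assumes inputs such as "13.8" or "13.11" (digits, at most one point).
    """
    a_int, _, a_frac = a.partition(".")
    b_int, _, b_frac = b.partition(".")
    steps = [f"Step 1: whole parts are {a_int} and {b_int}."]

    if int(a_int) != int(b_int):
        larger = a if int(a_int) > int(b_int) else b
        steps.append(f"Step 2: the whole parts differ, so {larger} is larger.")
        return steps

    # Pad with trailing zeros so both fractions have the same number of digits.
    width = max(len(a_frac), len(b_frac))
    a_pad, b_pad = a_frac.ljust(width, "0"), b_frac.ljust(width, "0")
    steps.append(
        f"Step 2: pad the fractional parts to {width} digits: {a_pad} and {b_pad}."
    )

    if a_pad == b_pad:
        steps.append("Step 3: the padded fractions match, so the numbers are equal.")
    else:
        # Equal-length digit strings compare the same way as the numbers they spell.
        larger = a if a_pad > b_pad else b
        steps.append(
            f"Step 3: {max(a_pad, b_pad)} is the larger fraction, so {larger} is larger."
        )
    return steps


for line in compare_step_by_step("13.8", "13.11"):
    print(line)
```

Padding 13.8 to 13.80 makes the place values line up, which is the step the naive comparison skips.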
