News

Phi-4 and an rStar-Math paper suggest that compact, specialized models can provide powerful alternatives to the industry’s largest systems.
Telling AI model to “take a deep breath” causes math scores to soar in study. DeepMind used AI models to optimize their own prompts, with surprising results.
Driven by new technology called OpenAI o1, the chatbot can test various strategies and try to identify mistakes as it tackles complex tasks.
On the MATH benchmark of competition-level math word problems, for example, Meta's model posted a score of 73.8, compared to GPT-4o's 76.6 and Claude 3.5 Sonnet's 71.1.