#reasoning-tasks

Artificial intelligence
from InfoQ
2 months ago

GLM-4.5 Launches with Strong Reasoning, Coding, and Agentic Capabilities

Zhipu AI launched GLM-4.5 and GLM-4.5-Air, AI models built for reasoning, coding, and agentic tasks that use a dual-mode system.
Artificial intelligence
from InfoQ
6 months ago

Google DeepMind Introduces QuestBench to Evaluate LLMs in Solving Logic and Math Problems

DeepMind's QuestBench benchmark evaluates whether LLMs can ask the clarifying questions needed to solve logic and math problems.