Abstract Reasoning HD

Databricks' OfficeQA uncovers disconnect: AI agents ace abstract tests but stall at 45% on enterprise docs

There is no shortage of AI benchmarks in the market today, with popular options like Humanity's Last Exam (HLE), ARC-AGI-2 and GDPval, among numerous others. AI agents excel at solving abstract math ...

NBC News

AI's capabilities may be exaggerated by flawed tests, according to new study

Researchers behind a new study say that the methods used to evaluate AI systems’ capabilities routinely oversell AI performance and lack scientific rigor. The study, led by researchers at the Oxford ...

SiliconANGLE

Samsung researchers create tiny AI model that shames the biggest LLMs in reasoning puzzles

Researchers from Samsung Electronics Co. Ltd. have created a tiny artificial intelligence model that punches far above its weight on certain kinds of “reasoning” tasks, challenging the industry’s ...

TechNode

Ant Group Open-Sources Ring-1T-Preview, a Trillion-Parameter Reasoning Model Scoring Higher Than GPT-5

Ant Group has released Ring-1T-Preview, a trillion-parameter natural language reasoning model and the first open-source system of its scale. On the CodeForces coding benchmark, the preview model ...

TechRepublic

OpenAI and Google DeepMind Outshine Students at World’s Top Coding Contest

GPT-5 leads the way with first-try correct solutions. Gemini showcases Google DeepMind’s leap in ...

SiliconANGLE

OpenAI, Google reasoning models achieve gold-level scores in ICPC coding contest

OpenAI and Google LLC today disclosed that their latest reasoning models achieved gold-level performance in a recent coding competition. The ICPC, as the event is called, is the world’s most ...

9to5Google

Gemini 2.5 Deep Think scores competitive coding gold in ‘profound leap’ for abstract problem-solving

After a mathematics win in July, Gemini 2.5 Deep Think has now earned a gold-medal level performance in competitive coding. The International Collegiate Programming Contest (ICPC) is the “oldest, ...

New York Post

Charlie Kirk lived out the mission that our universities betrayed

Charlie Kirk was a one-man answer to campus illiberalism. He rose to prominence at a time when university administrators and ...

The New York Times

The Fever Dream of Imminent Superintelligence Is Finally Breaking

Mr. Marcus is a founder of two A.I. companies and the author of six books on natural and artificial intelligence. GPT-5, OpenAI’s latest artificial intelligence system, was supposed to be a game ...

San Francisco Examiner

New study calls AI reasoning a ‘brittle mirage’

The assertion that generative artificial-intelligence models like OpenAI’s new GPT-5 can reason like people do appears to have taken another blow. A new study published earlier this month by ...

Ars Technica

LLMs’ “simulated reasoning” abilities are a “brittle mirage,” researchers find

In recent months, the AI industry has started moving toward so-called simulated reasoning models that use a “chain of thought” process to work through tricky problems in multiple logical steps. At the ...

IEEE

A Concise Survey on Lane Topology Reasoning for HD Mapping

Abstract: Lane topology reasoning techniques play a crucial role in high-definition (HD) mapping and autonomous driving applications. While recent years have witnessed significant advances in this ...
