
    Reasoning Models Know When They’re Right: NYU Researchers Introduce a Hidden-State Probe That Enables Efficient Self-Verification and Reduces Token Usage by 24%

    April 13, 2025

Artificial intelligence systems have made significant strides in simulating human-style reasoning, particularly in mathematics and logic. These models do not just generate answers; they walk through a series of logical steps to reach conclusions, offering insight into how and why those answers are produced. This step-by-step reasoning, often called Chain-of-Thought (CoT), has become vital to how machines handle complex problem-solving tasks.

    A common problem researchers encounter with these models is inefficiency during inference. Reasoning models often continue processing even after reaching a correct conclusion. This overthinking results in the unnecessary generation of tokens, increasing computational cost. Whether these models have an internal sense of correctness remains unclear—do they realize when an intermediate answer is right? If they could identify this internally, the models could halt processing earlier, becoming more efficient without losing accuracy.

    Many current approaches measure a model’s confidence through verbal prompts or by analyzing multiple outputs. These black-box strategies ask the model to report how sure it is of its answer. However, they are often imprecise and computationally expensive. On the other hand, white-box methods investigate models’ internal hidden states to extract signals that may correlate with answer correctness. Prior work shows that a model’s internal states can indicate the validity of final answers, but applying this to intermediate steps in long reasoning chains is still an underexplored direction.

A team from New York University and NYU Shanghai tackled this gap by designing a lightweight probe, a simple two-layer neural network, to inspect a model's hidden states at intermediate reasoning steps. The models used for experimentation included the DeepSeek-R1-Distill series and QwQ-32B, known for their step-by-step reasoning capabilities, and were tested across various datasets of mathematical and logical tasks. The researchers trained their probe to read the internal state associated with each chunk of reasoning and predict whether the current intermediate answer was correct.
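
To make the setup concrete, here is a minimal sketch of such a probe, assuming PyTorch. The paper describes it only as a simple two-layer network over hidden states; the input dimension (typical of a ~32B model) and the probe width below are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class CorrectnessProbe(nn.Module):
    """Two-layer MLP mapping a hidden state to a correctness probability."""

    def __init__(self, hidden_size: int = 5120, probe_width: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, probe_width),  # hidden_size is an assumed model dimension
            nn.ReLU(),
            nn.Linear(probe_width, 1),
        )

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        # hidden_state: (batch, hidden_size) -> (batch,) probability in [0, 1]
        return torch.sigmoid(self.net(hidden_state)).squeeze(-1)
```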

    To construct their approach, the researchers first segmented each long CoT output into smaller parts or chunks, using markers like “wait” or “verify” to identify breaks in reasoning. They used the last token’s hidden state in each chunk as a representation and matched this to a correctness label, which was judged using another model. These representations were then used to train the probe on binary classification tasks. The probe was fine-tuned using grid search across hyperparameters like learning rate and hidden layer size, with most models converging to linear probes—indicating that correctness information is often linearly embedded in the hidden states. The probe worked for fully formed answers and showed the ability to predict correctness before an answer was even completed, hinting at look-ahead capabilities.
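
The pipeline that paragraph describes might look roughly like the following sketch, reusing the `CorrectnessProbe` above. The marker list, chunking rules, and learning rate are assumptions for illustration, and `loader` stands in for a hypothetical DataLoader yielding (last-token hidden state, correctness label) pairs.

```python
import re
import torch
import torch.nn as nn

# Split a CoT transcript into chunks at self-reflection markers such as
# "wait" or "verify" (the exact marker set here is an assumption).
MARKERS = re.compile(r"\b(wait|verify)\b", re.IGNORECASE)

def segment_cot(cot_text: str) -> list[str]:
    """Split a chain-of-thought transcript into chunks at reasoning markers."""
    chunks, start = [], 0
    for m in MARKERS.finditer(cot_text):
        piece = cot_text[start:m.start()].strip()
        if piece:
            chunks.append(piece)
        start = m.start()
    tail = cot_text[start:].strip()
    if tail:
        chunks.append(tail)
    return chunks

# Binary classification over (hidden state, label) pairs: each state is the
# last token's representation for a chunk, and each label marks whether that
# chunk's intermediate answer was judged correct.
probe = CorrectnessProbe()
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-4)  # lr would come from grid search
loss_fn = nn.BCELoss()

def train_epoch(loader):
    for states, labels in loader:  # states: (B, hidden_size), labels: (B,) in {0, 1}
        preds = probe(states)
        loss = loss_fn(preds, labels.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```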

Performance results were clear and quantifiable. The probes achieved ROC-AUC scores exceeding 0.9 on some datasets, such as AIME, with models like R1-Distill-Qwen-32B. Expected Calibration Error (ECE) remained under 0.1, showing high reliability; R1-Distill-Qwen-32B, for example, had an ECE of just 0.01 on GSM8K and 0.06 on MATH. In application, the probe was used to implement a confidence-based early-exit strategy during inference: the reasoning process was stopped once the probe's confidence in an answer exceeded a threshold. At a confidence threshold of 0.85, accuracy remained at 88.2% while the inference token count was reduced by 24%. Even at a threshold of 0.9, accuracy stayed at 88.6% with a 19% token reduction. Compared to static exit methods, this dynamic strategy achieved up to 5% higher accuracy using the same or fewer tokens.
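
That early-exit strategy reduces to a simple loop: generate one reasoning chunk at a time, score it with the probe, and stop once confidence clears the threshold. In the sketch below, `generate_next_chunk` is a hypothetical helper standing in for model-specific decoding and hidden-state extraction, which the article does not detail.

```python
THRESHOLD = 0.85  # the paper's setting that kept 88.2% accuracy with 24% fewer tokens

def reason_with_early_exit(model, prompt: str, max_chunks: int = 64) -> str:
    """Generate reasoning chunk by chunk, stopping once the probe is confident."""
    text = prompt
    for _ in range(max_chunks):
        # Hypothetical helper: returns the next reasoning chunk and the
        # hidden state of its final token.
        chunk, last_hidden = generate_next_chunk(model, text)
        text += chunk
        confidence = probe(last_hidden.unsqueeze(0)).item()
        if confidence >= THRESHOLD:
            break  # the probe believes the current intermediate answer is correct
    return text
```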

This study offers an efficient, integrated way for reasoning models to self-verify during inference. The approach pinpoints a gap: models inherently know when they are right, but they do not act on it. By leveraging internal representations through probing, the research reveals a path toward smarter, more efficient reasoning systems, showing that tapping into what a model already "knows" can yield meaningful improvements in both performance and resource use.


Check out the paper. All credit for this research goes to the researchers of this project. This post appeared first on MarkTechPost.
