Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      CodeSOD: A Unique Way to Primary Key

      July 22, 2025

      BrowserStack launches Figma plugin for detecting accessibility issues in design phase

      July 22, 2025

      Parasoft brings agentic AI to service virtualization in latest release

      July 22, 2025

      Node.js vs. Python for Backend: 7 Reasons C-Level Leaders Choose Node.js Talent

      July 21, 2025

      The best CRM software with email marketing in 2025: Expert tested and reviewed

      July 22, 2025

      This multi-port car charger can power 4 gadgets at once – and it’s surprisingly cheap

      July 22, 2025

      I’m a wearables editor and here are the 7 Pixel Watch 4 rumors I’m most curious about

      July 22, 2025

      8 ways I quickly leveled up my Linux skills – and you can too

      July 22, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The Intersection of Agile and Accessibility – A Series on Designing for Everyone

      July 22, 2025
      Recent

      The Intersection of Agile and Accessibility – A Series on Designing for Everyone

      July 22, 2025

      Zero Trust & Cybersecurity Mesh: Your Org’s Survival Guide

      July 22, 2025

      Execute Ping Commands and Get Back Structured Data in PHP

      July 22, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      A Tomb Raider composer has been jailed — His legacy overshadowed by $75k+ in loan fraud

      July 22, 2025
      Recent

      A Tomb Raider composer has been jailed — His legacy overshadowed by $75k+ in loan fraud

      July 22, 2025

      “I don’t think I changed his mind” — NVIDIA CEO comments on H20 AI GPU sales resuming in China following a meeting with President Trump

      July 22, 2025

      Galaxy Z Fold 7 review: Six years later — Samsung finally cracks the foldable code

      July 22, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»MIT Researchers Introduce DISCIPL: A Self-Steering Framework Using Planner and Follower Language Models for Efficient Constrained Generation and Reasoning

    MIT Researchers Introduce DISCIPL: A Self-Steering Framework Using Planner and Follower Language Models for Efficient Constrained Generation and Reasoning

    April 16, 2025

    Language models predict sequences of words based on vast datasets and are increasingly expected to reason and perform complex linguistic manipulations. Yet, despite their growing sophistication, even powerful models often falter when assigned problems that require step-by-step logic, especially those bound by explicit constraints or structured problem-solving, highlighting their current limitations in applied reasoning.

    The difficulty arises in generating language that strictly adheres to given conditions. Tasks might specify exact word counts, position of keywords, or thematic constraints, all of which are challenging for models prioritizing probability-based fluency. For example, models often fail to construct a coherent sentence while embedding words at particular locations or composing paragraphs under multiple concurrent requirements. The challenge isn’t just generating relevant content but generating content that rigidly fits a set of formal, predefined rules without compromising fluency.

    Currently, methods like chain-of-thought prompting attempt to guide models through a reasoning path, but these are limited by their serial execution and expensive inference costs. Parallel approaches such as guess-and-check or best-of-N sampling rely on generating and filtering multiple candidates. Yet, they need separate scoring mechanisms and often yield inconsistent results. These tools improve performance slightly but cannot guarantee the satisfaction of all constraints, especially when models lack an inherent understanding of those constraints.

    Researchers from MIT and Yale introduced a novel approach named DISCIPL, designed to enable what they term “self-steering” language models. This method defines two roles: a Planner language model, which generates a tailored inference program, and a population of Follower models that execute this program to solve the task. Unlike previous systems, the Planner creates a logic that structures the reasoning process. By separating the planning from execution, the method allows for dynamic and adaptive computation strategies tailored to each task.

    The inner workings of DISCIPL involve generating inference code using a language called LLAMPPL, which is a Python-based framework for probabilistic programming with language models. The Planner writes code that defines how to explore possible solutions, while Follower models run the code to search for valid outputs. These programs operate by iteratively proposing partial solutions and scoring them based on constraints. The architecture supports multiple inference techniques, including importance sampling, sequential Monte Carlo (SMC), and rejection sampling, which are scalable based on computational budgets. This structured decomposition lets the system reallocate resources to more promising candidates during execution, improving precision and efficiency.

    In performance evaluations, DISCIPL proved remarkably effective. On the COLLIE benchmark for constrained sentence generation, the Follower model Llama-3.2-1B alone achieved only 4% Pass@1 success. When enhanced with DISCIPL and SMC, performance rose to 87%, surpassing GPT-4o-mini in some instances. The same setup scored as high as 88% Pass@1 for paragraph-level tasks. On a set of difficult real-world tasks called PUZZLES, covering grant writing and itinerary planning, DISCIPL consistently outperformed both the Planner and Follower operating alone. The method also demonstrated high coherency, with average scores around 7.45 out of 10 when using SMC, which starkly contrasts the 9+ scores from more fluent but incorrect outputs produced by baseline methods.

    Overall, the work introduces a fresh direction in language modeling where models generate answers and devise how they should be computed. By letting the Planner generate code that structures reasoning and Followers execute this code in parallel, the method achieves precision, adaptability, and fluency without requiring larger models or manual engineering. The research’s results illustrate a clear path for enabling smaller language models to outperform their size through intelligent orchestration and self-guided inference.


    Here is the Paper. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. Don’t Forget to join our 90k+ ML SubReddit.

    🔥 [Register Now] miniCON Virtual Conference on AGENTIC AI: FREE REGISTRATION + Certificate of Attendance + 4 Hour Short Event (May 21, 9 am- 1 pm PST) + Hands on Workshop

    The post MIT Researchers Introduce DISCIPL: A Self-Steering Framework Using Planner and Follower Language Models for Efficient Constrained Generation and Reasoning appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleJetBrains AI Assistant : Revolutionizing Tech Solutions
    Next Article Model Compression Without Compromise: Loop-Residual Neural Networks Show Comparable Results to Larger GPT-2 Variants Using Iterative Refinement

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    July 22, 2025
    Machine Learning

    Boolformer: Symbolic Regression of Logic Functions with Transformers

    July 22, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    CVE-2025-30744 – Oracle Mobile Field Service HTTP Unauthorized Access and Data Manipulation Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-28168 – Outsystems Unrestricted File Upload Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-4273 – Intel UEFI Secure Boot Signature Validation Bypass

    Common Vulnerabilities and Exposures (CVEs)

    Uiua is a general purpose array-oriented programming language

    Linux

    Highlights

    Google Chrome will check Windows 11 eligibility on your PC for Windows 10 EOL

    July 4, 2025

    Microsoft has extended Windows 10’s support with some conditions, and Google Chrome has already started…

    CVE-2025-5383 – Yifang CMS Article Management Module Cross-Site Scripting Vulnerability

    May 31, 2025

    Everwild’s cancellation has me worried for one of my favorite dev teams and Xbox itself — It needs creative new games to thrive and refresh its identity

    July 2, 2025

    Storm-1977 Hits Education Clouds with AzureChecker, Deploys 200+ Crypto Mining Containers

    April 27, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.