    Making AI models more trustworthy for high-stakes settings

    May 1, 2025

The ambiguity in medical imaging can present major challenges for clinicians who are trying to identify disease. For instance, in a chest X-ray, pleural effusion, an abnormal buildup of fluid around the lungs, can look very much like pulmonary infiltrates, which are accumulations of pus or blood in the lung tissue.

    An artificial intelligence model could assist the clinician in X-ray analysis by helping to identify subtle details and boosting the efficiency of the diagnosis process. But because so many possible conditions could be present in one image, the clinician would likely want to consider a set of possibilities, rather than only having one AI prediction to evaluate.

    One promising way to produce a set of possibilities, called conformal classification, is convenient because it can be readily implemented on top of an existing machine-learning model. However, it can produce sets that are impractically large. 

    MIT researchers have now developed a simple and effective improvement that can reduce the size of prediction sets by up to 30 percent while also making predictions more reliable.

    Having a smaller prediction set may help a clinician zero in on the right diagnosis more efficiently, which could improve and streamline treatment for patients. This method could be useful across a range of classification tasks — say, for identifying the species of an animal in an image from a wildlife park — as it provides a smaller but more accurate set of options.

    “With fewer classes to consider, the sets of predictions are naturally more informative in that you are choosing between fewer options. In a sense, you are not really sacrificing anything in terms of accuracy for something that is more informative,” says Divya Shanmugam PhD ’24, a postdoc at Cornell Tech who conducted this research while she was an MIT graduate student.

Shanmugam is joined on the paper by Helen Lu ’24; Swami Sankaranarayanan, a former MIT postdoc who is now a research scientist at Lila Sciences; and senior author John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering at MIT and a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the Conference on Computer Vision and Pattern Recognition in June.

    Prediction guarantees

    AI assistants deployed for high-stakes tasks, like classifying diseases in medical images, are typically designed to produce a probability score along with each prediction so a user can gauge the model’s confidence. For instance, a model might predict that there is a 20 percent chance an image corresponds to a particular diagnosis, like pleurisy.

    But it is difficult to trust a model’s predicted confidence because much prior research has shown that these probabilities can be inaccurate. With conformal classification, the model’s prediction is replaced by a set of the most probable diagnoses along with a guarantee that the correct diagnosis is somewhere in the set.
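
    The paper itself is not reproduced here, but the underlying recipe is standard. The sketch below shows split conformal classification under common assumptions: the model emits softmax probabilities, the nonconformity score is one minus the true-class probability, and alpha is the allowed error rate. All function and variable names are illustrative, not from the paper.

    ```python
    import numpy as np

    def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
        """Calibrate on held-out labeled data so that prediction sets
        contain the true label with probability at least 1 - alpha."""
        n = len(cal_labels)
        # Nonconformity score: 1 minus the softmax probability of the true class.
        scores = 1.0 - cal_probs[np.arange(n), cal_labels]
        # Finite-sample-corrected quantile of the calibration scores.
        level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
        return np.quantile(scores, level, method="higher")

    def prediction_set(probs, q_hat):
        """Include every class whose nonconformity score clears the threshold."""
        return np.where(1.0 - probs <= q_hat)[0]
    ```

    A less confident model yields a larger threshold q_hat and therefore larger prediction sets, which is exactly the problem the MIT work targets.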

    But the inherent uncertainty in AI predictions often causes the model to output sets that are far too large to be useful.

    For instance, if a model is classifying an animal in an image as one of 10,000 potential species, it might output a set of 200 predictions so it can offer a strong guarantee.

    “That is quite a few classes for someone to sift through to figure out what the right class is,” Shanmugam says.

    The technique can also be unreliable because tiny changes to inputs, like slightly rotating an image, can yield entirely different sets of predictions.

To make conformal classification more useful, the researchers applied test-time augmentation (TTA), a technique originally developed to improve the accuracy of computer vision models.

TTA creates multiple augmented versions of a single image, perhaps by cropping it, flipping it, or zooming in. It then applies the computer vision model to each version of the same image and aggregates the resulting predictions.

    “In this way, you get multiple predictions from a single example. Aggregating predictions in this way improves predictions in terms of accuracy and robustness,” Shanmugam explains.
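
    As a concrete illustration, here is a minimal sketch of plain TTA, assuming model maps an image to a vector of softmax probabilities and each augmentation is a callable. Uniform averaging is the simplest aggregation; as described below, the researchers instead learn how to aggregate.

    ```python
    import numpy as np

    def tta_predict(model, image, augmentations):
        """Average the model's softmax outputs over several augmented
        views of the same image (plain test-time augmentation)."""
        views = [augment(image) for augment in augmentations]
        probs = np.stack([model(view) for view in views])  # (n_views, n_classes)
        return probs.mean(axis=0)

    # Illustrative augmentations: identity, horizontal flip, and a crop
    # (center_crop is a hypothetical helper, not a library function).
    # augmentations = [lambda x: x, np.fliplr, center_crop]
    ```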

    Maximizing accuracy

To apply TTA, the researchers hold out some of the labeled image data that would otherwise be used for the conformal classification procedure. On these held-out data, they learn how to aggregate the augmentations, automatically augmenting the images in a way that maximizes the accuracy of the underlying model’s predictions.

    Then they run conformal classification on the model’s new, TTA-transformed predictions. The conformal classifier outputs a smaller set of probable predictions for the same confidence guarantee.

    “Combining test-time augmentation with conformal prediction is simple to implement, effective in practice, and requires no model retraining,” Shanmugam says.
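
    A hypothetical end-to-end sketch of that recipe follows: split off some labeled data to learn convex weights over the augmentations, then calibrate the conformal threshold on the weighted TTA outputs, reusing conformal_threshold and prediction_set from the earlier sketch. Because accuracy is not differentiable, this sketch maximizes log-likelihood by projected gradient ascent as a stand-in for the paper’s learned aggregation.

    ```python
    import numpy as np

    def fit_tta_weights(aug_probs, labels, lr=0.5, steps=200):
        """Learn convex weights over augmentations so the weighted average
        of softmax outputs fits the held-out labels.
        aug_probs: array of shape (n_examples, n_augs, n_classes)."""
        n, n_augs, _ = aug_probs.shape
        w = np.ones(n_augs) / n_augs
        for _ in range(steps):
            p = np.einsum("a,nac->nc", w, aug_probs)       # weighted average
            p_true = p[np.arange(n), labels]               # prob. of true class
            # Gradient of the mean log-likelihood with respect to the weights.
            grad = (aug_probs[np.arange(n), :, labels] / p_true[:, None]).mean(axis=0)
            w = np.clip(w + lr * grad, 0.0, None)
            w /= w.sum()                                   # stay on the simplex
        return w

    # Pipeline sketch: learn weights on one split, calibrate on another,
    # then build smaller prediction sets for new images.
    # w = fit_tta_weights(aug_probs_heldout, labels_heldout)
    # cal_probs = np.einsum("a,nac->nc", w, aug_probs_cal)
    # q_hat = conformal_threshold(cal_probs, labels_cal, alpha=0.1)
    # sets = [prediction_set(np.einsum("a,ac->c", w, p), q_hat) for p in aug_probs_test]
    ```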

Compared with prior work on conformal prediction across several standard image classification benchmarks, their TTA-augmented method reduced prediction set sizes by 10 to 30 percent across experiments.

    Importantly, the technique achieves this reduction in prediction set size while maintaining the probability guarantee.

    The researchers also found that, even though they are sacrificing some labeled data that would normally be used for the conformal classification procedure, TTA boosts accuracy enough to outweigh the cost of losing those data.

“It raises interesting questions about how we use labeled data after model training. The allocation of labeled data among different post-training steps is an important direction for future work,” Shanmugam says.

    In the future, the researchers want to validate the effectiveness of such an approach in the context of models that classify text instead of images. To further improve the work, the researchers are also considering ways to reduce the amount of computation required for TTA.

This research is funded, in part, by the Wistron Corporation.
