Close Menu
    What's Hot

    ‘Slumdog Millionaire’ Could Not Be Made Today, Director Says

    June 21, 2025

    Have cryptocurrencies arrived as an asset class? It’s complicated

    June 21, 2025

    Summer of Love: Photos Show How People Partied and Protested in 1967

    June 21, 2025
    Facebook X (Twitter) Instagram
    Hot Paths
    • Home
    • News
    • Politics
    • Money
    • Personal Finance
    • Business
    • Economy
    • Investing
    • Markets
      • Stocks
      • Futures & Commodities
      • Crypto
      • Forex
    • Technology
    Facebook X (Twitter) Instagram
    Hot Paths
    Home»Economy»Are LLMs overconfident? (just like humans)
    Economy

    Are LLMs overconfident? (just like humans)

    Press RoomBy Press RoomJune 11, 2025No Comments2 Mins Read
    Facebook Twitter Pinterest LinkedIn Tumblr Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    Can LLMs accurately adjust their confidence when facing opposition? Building on previous studies measuring calibration on static fact-based question-answering tasks, we evaluate Large Language Models (LLMs) in a dynamic, adversarial debate setting, uniquely combining two realistic factors: (a) a multi-turn format requiring models to update beliefs as new information emerges, and (b) a zero-sum structure to control for task-related uncertainty, since mutual high-confidence claims imply systematic overconfidence. We organized 60 three-round policy debates among ten state-of-the-art LLMs, with models privately rating their confidence (0-100) in winning after each round. We observed five concerning patterns: (1) Systematic overconfidence: models began debates with average initial confidence of 72.9% vs. a rational 50% baseline. (2) Confidence escalation: rather than reducing confidence as debates progressed, debaters increased their win probabilities, averaging 83% by the final round. (3) Mutual overestimation: in 61.7% of debates, both sides simultaneously claimed >=75% probability of victory, a logical impossibility. (4) Persistent self-debate bias: models debating identical copies increased confidence from 64.1% to 75.2%; even when explicitly informed their chance of winning was exactly 50%, confidence still rose (from 50.0% to 57.1%). (5) Misaligned private reasoning: models’ private scratchpad thoughts sometimes differed from their public confidence ratings, raising concerns about faithfulness of chain-of-thought reasoning. These results suggest LLMs lack the ability to accurately self-assess or update their beliefs in dynamic, multi-turn tasks; a major concern as LLMs are now increasingly deployed without careful review in assistant and agentic roles.

    That is by Pradyumna Shyama Prasad and Minh Nhat Nguyen.  Here is the associated X thread.  Here is my earlier paper with Robin Hanson.

    The post Are LLMs overconfident? (just like humans) appeared first on Marginal REVOLUTION.



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Press Room

    Related Posts

    Of Course We Should Privatize Some Federal Land (but probably won’t)

    June 21, 2025

    The anti-abundance agenda?

    June 21, 2025

    Reims and Amiens

    June 21, 2025
    Leave A Reply Cancel Reply

    LATEST NEWS

    ‘Slumdog Millionaire’ Could Not Be Made Today, Director Says

    June 21, 2025

    Have cryptocurrencies arrived as an asset class? It’s complicated

    June 21, 2025

    Summer of Love: Photos Show How People Partied and Protested in 1967

    June 21, 2025

    Jet fuel prices soar in Europe as war in Middle East threatens supplies

    June 21, 2025
    POPULAR
    Business

    The Business of Formula One

    May 27, 2023
    Business

    Weddings and divorce: the scourge of investment returns

    May 27, 2023
    Business

    How F1 found a secret fuel to accelerate media rights growth

    May 27, 2023
    Advertisement
    Load WordPress Sites in as fast as 37ms!

    Archives

    • June 2025
    • May 2025
    • April 2025
    • March 2025
    • February 2025
    • January 2025
    • December 2024
    • November 2024
    • April 2024
    • March 2024
    • February 2024
    • January 2024
    • December 2023
    • November 2023
    • October 2023
    • September 2023
    • May 2023

    Categories

    • Business
    • Crypto
    • Economy
    • Forex
    • Futures & Commodities
    • Investing
    • Market Data
    • Money
    • News
    • Personal Finance
    • Politics
    • Stocks
    • Technology

    Your source for the serious news. This demo is crafted specifically to exhibit the use of the theme as a news site. Visit our main page for more demos.

    We're social. Connect with us:

    Facebook X (Twitter) Instagram Pinterest YouTube

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest
    • Home
    • Buy Now
    © 2025 ThemeSphere. Designed by ThemeSphere.

    Type above and press Enter to search. Press Esc to cancel.