Are LLMs overconfident? (just like humans)

Can LLMs accurately adjust their confidence when facing opposition? Building on previous studies measuring calibration on static fact-based question-answering tasks, we evaluate Large Language Models (LLMs) in a dynamic, adversarial debate setting, uniquely combining two realistic factors: (a) a multi-turn format requiring models to update beliefs as new information emerges, and (b) a zero-sum structure to control for task-related uncertainty, since mutual high-confidence claims imply systematic overconfidence. We organized 60 three-round policy debates among ten state-of-the-art LLMs, with models privately rating their confidence (0-100) in winning after each round. We observed five concerning patterns: (1) Systematic overconfidence: models began debates with average initial confidence of 72.9% vs. a rational 50% baseline. (2) Confidence escalation: rather than reducing confidence as debates progressed, debaters increased their win probabilities, averaging 83% by the final round. (3) Mutual overestimation: in 61.7% of debates, both sides simultaneously claimed >=75% probability of victory, a logical impossibility. (4) Persistent self-debate bias: models debating identical copies increased confidence from 64.1% to 75.2%; even when explicitly informed their chance of winning was exactly 50%, confidence still rose (from 50.0% to 57.1%). (5) Misaligned private reasoning: models’ private scratchpad thoughts sometimes differed from their public confidence ratings, raising concerns about faithfulness of chain-of-thought reasoning. These results suggest LLMs lack the ability to accurately self-assess or update their beliefs in dynamic, multi-turn tasks; a major concern as LLMs are now increasingly deployed without careful review in assistant and agentic roles.

That is by Pradyumna Shyama Prasad and Minh Nhat Nguyen. Here is the associated X thread. Here is my earlier paper with Robin Hanson.
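
The zero-sum point in the abstract can be made concrete with a little arithmetic: exactly one side wins a debate, so the two debaters' win probabilities should sum to 100. Here is a minimal Python sketch of that check. The function names are hypothetical and not from the paper, and plugging the reported averages in for both sides is an illustrative assumption, not a result from the study.

```python
def excess_confidence(conf_a: float, conf_b: float) -> float:
    """Return how far two opposing win-probability ratings (0-100) exceed 100 in total.

    In a zero-sum debate a coherent pair of ratings sums to 100, so any
    positive value is confidence that at least one side cannot justify.
    """
    return conf_a + conf_b - 100.0


def both_claim_at_least(conf_a: float, conf_b: float, threshold: float = 75.0) -> bool:
    """True when both sides rate their own chance of winning at or above the threshold."""
    return conf_a >= threshold and conf_b >= threshold


# Averages quoted in the abstract, applied to both sides for illustration only.
initial_avg = 72.9
final_avg = 83.0

print("excess at start:", excess_confidence(initial_avg, initial_avg))
print("excess at end:", excess_confidence(final_avg, final_avg))
print("both >= 75 at end:", both_claim_at_least(final_avg, final_avg))
```

A calibrated pair of debaters would show an excess of zero. Any pair where both sides claim 75 or more carries an excess of at least 50 points.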

The post Are LLMs overconfident? (just like humans) appeared first on Marginal REVOLUTION.
