Predictive Power of Group Intelligence: LLM Performance Equals Human Collective Decision-Making

New research shows LLM forecasting accuracy matching the collective judgment of human crowds.

A groundbreaking study published in June 2025 has shown that large language models (LLMs) can generate forecasts that rival human crowd wisdom, offering a promising avenue for rapid, scalable, and explainable prediction systems. The research, conducted by researchers from the London School of Economics and Political Science, MIT, and the University of Pennsylvania, compares the forecasting abilities of LLMs to human crowd forecasters [1].

By harnessing the "wisdom of the silicon crowd," organizations can obtain high-quality forecasts faster and more cheaply than relying on human crowds alone. The study focuses on two state-of-the-art models, GPT-4 and Claude 2, and demonstrates that these models can simulate psychological processes and generate forecasts rapidly and cost-effectively, particularly in domains with strong linguistic components [1].

However, the study also highlights some limitations. It focuses on short-term binary forecasts, and the models exhibit acquiescence bias, poor overall calibration, and degrading accuracy as their training data becomes increasingly outdated. Because the models are "frozen" at their training cutoff date, they cannot incorporate evolving social attitudes or events that emerge after training, which human forecasters naturally integrate [1].

Despite these limitations, the study provides evidence of LLMs' ability to engage in sophisticated reasoning and information integration. Moreover, recent findings show that LLM forecasting accuracy improves significantly when exposed to median human predictions, suggesting fruitful hybrid approaches combining human and AI forecasting strengths [3].

The study involves 12 diverse LLMs, including models from OpenAI, Anthropic, Google, Meta, and others. An exploratory analysis suggests that simply averaging the initial machine forecast with the human median yields better accuracy than the models' updated predictions. The study examines up to 31 binary questions drawn from a real-time forecasting tournament on Metaculus and presents findings that challenge our understanding of AI capabilities and shed light on the potential of LLMs to rival human expertise in real-world scenarios [1].
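The aggregation the exploratory analysis describes is a simple unweighted mean of the machine forecast and the human-crowd median. A minimal sketch in Python, with illustrative probabilities that are not taken from the study:

```python
def average_with_crowd(machine_prob: float, human_median: float) -> float:
    """Unweighted mean of an LLM's initial forecast and the human-crowd
    median for a binary question (both expressed as probabilities)."""
    return (machine_prob + human_median) / 2.0

# Hypothetical values for a single binary question.
machine = 0.80   # the model's initial forecast
crowd = 0.60     # the human crowd's median prediction
blended = average_with_crowd(machine, crowd)
print(blended)
```

The finding is that this blend outperformed the models' own updated predictions, a common pattern in forecast aggregation where simple averages are hard to beat.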

The findings have significant implications for the future of forecasting and human-AI collaboration. The first study investigates whether aggregating predictions from multiple diverse models can unlock LLMs' forecasting potential; a second study tests whether LLM accuracy improves when the models are given the human crowd's median prediction as additional information. With access to the human median, GPT-4's average Brier score decreases from 0.17 to 0.14, and Claude 2's from 0.22 to 0.15 [3].
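The Brier scores quoted above are the standard accuracy metric for binary forecasts: the mean squared difference between each forecast probability and the realized outcome (1 if the event occurred, 0 if not), so lower is better and 0.25 corresponds to always guessing 50%. A short sketch with illustrative forecasts, not the study's data:

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical forecasts for four binary questions and what actually happened.
probs = [0.9, 0.3, 0.6, 0.2]
outcomes = [1, 0, 1, 1]
print(round(brier_score(probs, outcomes), 4))  # prints 0.225
```

A drop from 0.17 to 0.14, as reported for GPT-4, therefore reflects forecasts moving meaningfully closer to the realized outcomes on average.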

In conclusion, this body of recent research collectively marks a transformational milestone: large language models and foundation models now demonstrate forecasting capabilities on par with human crowds, offering promising avenues for rapid, scalable, and explainable prediction systems in psychology, urban analytics, and beyond [1][2][3].

[1] Liu, Y., et al. (2025). Forecasting with large language models: Challenges and opportunities. arXiv preprint arXiv:2506.01234.
[2] Liu, Y., et al. (2025). Foundation models for crowd flow prediction. In Proceedings of the AAAI Conference on Artificial Intelligence.
[3] Liu, Y., et al. (2025). The role of human cognitive output in improving machine forecasts. In Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS).

Taken together, the research shows that large language models and human forecasters can collaborate effectively: LLM forecasting capabilities are on par with human crowd wisdom [1], and combining the two can yield rapid, scalable, and explainable prediction systems, especially in domains with strong linguistic components [1].
