Oct 10, 2025

LLMs and Document-Based Queries in Investment: Overconfidence and Incomplete Retrieval

Chris CH Moon

The use of large language models (LLMs) for document-based queries in investment research is accelerating rapidly. However, this adoption introduces structural risks that go far beyond occasional incorrect answers. This article examines why standard LLM usage can undermine investment-grade analysis, and what is required to address it.

The Growing Risk of LLMs in Investment Research

LLMs are increasingly used to analyze reports, contracts, and disclosures. The core risk is not simple error frequency, but systematic overconfidence and hallucination, which directly threaten the integrity of investment decision-making.

Studies evaluating leading models such as ChatGPT and Gemini across approximately 300 documents show:

  • An average overconfidence / hallucination rate exceeding 30%

  • Errors that often appear plausible, not obviously false

Most mistakes are not fabrications, but cases where:

  • Weak or missing evidence is presented as fact

  • Limited information is overgeneralized

This phenomenon reflects epistemic misalignment, a limitation inherent to current LLM architectures.
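As a rough illustration, the kind of evaluation described above can be scored with a simple harness. The `EvalRecord` schema and the toy figures below are hypothetical, not taken from the cited studies:

```python
from dataclasses import dataclass

@dataclass
class EvalRecord:
    """One model answer judged against the source document (hypothetical schema)."""
    answer_supported: bool    # evidence in the document actually backs the answer
    stated_confidently: bool  # the model presented the answer as established fact

def overconfidence_rate(records: list[EvalRecord]) -> float:
    """Fraction of confidently stated answers that the source does not support."""
    confident = [r for r in records if r.stated_confidently]
    if not confident:
        return 0.0
    unsupported = sum(1 for r in confident if not r.answer_supported)
    return unsupported / len(confident)

# Toy run: 2 of 5 confident answers lack support in the source.
records = [
    EvalRecord(True, True),
    EvalRecord(True, True),
    EvalRecord(False, True),  # weak evidence presented as fact
    EvalRecord(False, True),  # limited information overgeneralized
    EvalRecord(True, True),
]
print(overconfidence_rate(records))  # 0.4
```

The key point the metric captures: what matters is not raw error count but how often unsupported answers are delivered with full confidence.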

Where Errors Escalate Most

LLMs perform relatively well on:

  • Obvious, frequently repeated facts

However, error rates rise sharply when queries involve:

  • Fine-grained details

  • Conditional or contextual interpretations

  • Cross-document dependencies

A single incorrect assumption often triggers cascading errors, producing clusters of internally inconsistent conclusions—particularly dangerous in analytical investment workflows.

The Nature of LLM Hallucinations

Hallucinations in LLMs are rarely isolated events. They tend to propagate.

Common Hallucination Patterns

Two dominant patterns repeatedly appear:

  • Arbitrary addition of metadata
    Example: Labeling a document as “intended for legal professionals” without any supporting evidence.

  • Confusion between claims and facts
    The model extrapolates from limited data and presents assumptions as universally true.

In investment research and due diligence, even a single hallucination can materially distort risk assessment and lead to irreversible outcomes.
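A minimal evidence check can catch the first pattern: claims that no passage in the source actually supports. The word-overlap heuristic below is a toy stand-in for a proper entailment model, and the sample contract text is invented:

```python
import re

def has_supporting_span(claim: str, source: str, min_overlap: float = 0.6) -> bool:
    """Crude evidence check: does any source sentence share enough content
    words with the claim? Real verifiers use entailment models; this only
    illustrates the principle of requiring a supporting span."""
    claim_words = set(re.findall(r"\w+", claim.lower()))
    for sentence in re.split(r"(?<=[.!?])\s+", source):
        sent_words = set(re.findall(r"\w+", sentence.lower()))
        if claim_words and len(claim_words & sent_words) / len(claim_words) >= min_overlap:
            return True
    return False

source = "The agreement is governed by Delaware law. Fees are payable quarterly."

# A grounded extraction passes the check:
print(has_supporting_span("Fees are payable quarterly", source))  # True

# An added-metadata hallucination has no supporting span anywhere:
print(has_supporting_span("This document is intended for legal professionals", source))  # False
```

Gating every extracted claim through a check like this is what turns "the model said so" into "the document says so."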

Information Blind Spots: What the Model Never Finds

Beyond hallucinations, LLMs suffer from systematic information omission.

Structural limitations include:

  • Difficulty attending to long documents

  • Inconsistent retrieval of facts buried mid-report

  • Failure to connect information dispersed across multiple files

As a result, material information can remain entirely “unseen.”
In investment contexts—where one clause, metric, or operational detail can change valuation or risk—this blind spot is unacceptable.
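A toy sketch of why this happens: similarity-based retrieval only surfaces chunks that resemble the query, so a material clause the analyst never asks about stays unseen. The keyword-overlap scorer below stands in for embedding similarity, and the report chunks are invented:

```python
import re

def score(query: str, chunk: str) -> int:
    """Keyword-overlap relevance score (a crude stand-in for embedding similarity)."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", chunk.lower()))
    return len(q & c)

def retrieve(query: str, chunks: list[str], k: int) -> list[str]:
    """Return the top-k chunks by similarity to the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

chunks = [
    "Executive summary: revenue grew and margins improved.",
    "Note 14, buried mid-report: a change-of-control clause accelerates all debt.",
    "Outlook: management expects continued revenue growth.",
]

# A revenue-focused query never surfaces the material clause,
# because nothing in the query resembles it:
top = retrieve("revenue growth outlook", chunks, k=2)
print(any("change-of-control" in c for c in top))  # False
```

The blind spot is structural: retrieval answers the question that was asked, while due diligence requires surfacing what was never asked about.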

Why Standard LLMs and Basic RAG Are Insufficient

Neither standalone LLMs nor simple retrieval-augmented generation (RAG) pipelines adequately address these risks.

What investment research actually requires is an agentic, multi-hop system capable of:

  • Iterative reasoning across multiple document segments

  • Actively linking scattered information across a data room

  • Verifying and cross-referencing extracted facts

  • Detecting contradictions and missing data

This structured, multi-stage reasoning is essential to reduce hallucination risk and surface deeply buried but material insights.
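One way to sketch the "actively linking scattered information" step is a loop that follows cross-references between documents instead of stopping at the first hit. The data-room contents and the "see Exhibit X" reference convention below are hypothetical:

```python
import re

def extract_refs(text: str) -> list[str]:
    """Find cross-references to other documents, e.g. 'see Exhibit B'."""
    return re.findall(r"see (Exhibit [A-Z])", text)

def multi_hop(start_doc: str, data_room: dict[str, str], max_hops: int = 5) -> list[str]:
    """Follow cross-references iteratively, collecting every linked passage
    rather than stopping at the first retrieved document."""
    collected: list[str] = []
    frontier, seen = [start_doc], set()
    for _ in range(max_hops):
        next_frontier = []
        for name in frontier:
            if name in seen or name not in data_room:
                continue
            seen.add(name)
            text = data_room[name]
            collected.append(text)
            next_frontier.extend(extract_refs(text))
        if not next_frontier:
            break
        frontier = next_frontier
    return collected

data_room = {
    "SPA": "Indemnities are capped; see Exhibit B for the amounts.",
    "Exhibit B": "The cap does not apply to fraud; see Exhibit C for carve-outs.",
    "Exhibit C": "Carve-outs include tax and environmental liabilities.",
}

passages = multi_hop("SPA", data_room)
print(len(passages))  # 3 -- all three linked documents are surfaced
```

A single-hop retriever would have stopped at the SPA text alone; the indemnity answer is only complete once all three hops are assembled, which is exactly the cascade a basic RAG pipeline never performs.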

Conclusion: Accuracy Over Speed

LLMs can be powerful tools in investment research—but out-of-the-box deployment is dangerous.

Key risks include:

  • Overconfidence

  • Hallucinations

  • Structural blind spots

To use LLMs responsibly, organizations must adopt:

  • Agentic systems

  • Multi-hop reasoning

  • Verification-first workflows

  • Structured analytical frameworks

In document-driven investment decision-making, accuracy and completeness matter more than speed. Until these fundamental limitations are addressed, the promise of LLMs in investment research remains aspirational rather than actionable.


AI Due Diligence Insights

Cut Your DD Time by 90%. Leave No Stone Unturned.

Get Your Due Diligence Issue Lists in Under 60 Minutes.

Copyright © 2026 Hopfia AI Corporation. All rights reserved.
