The fundamental problem with single-LLM AI systems is that the model cannot know when it is wrong. Confident hallucinations look exactly like confident correct answers from inside the model.
SignedAI solves this by asking multiple models the same question and requiring agreement before accepting any answer.
SignedAI is the 5th Genome in the RCT Ecosystem — the verification layer that sits between model output and user delivery. Every production-critical response passes through SignedAI consensus before it reaches the end user.
The result: a 0.3% hallucination rate vs the industry average of 12–15%, a reduction of more than 95%.
Why Single-Model AI Hallucinates
Language models are trained to produce plausible text, not verified truth. When a model is asked about something it does not know — or when its training data contains errors — it does not say "I don't know." It produces the most plausible-sounding continuation of the prompt.
This is structural, not a bug that can be fixed by better prompting. The model has no mechanism to know when it is wrong.
Traditional solutions:
- RAG (Retrieval-Augmented Generation): Retrieve relevant documents and inject them as context. Helps with factual grounding but does not prevent reasoning errors.
- Constitutional AI constraints: Train models to refuse certain types of outputs. Reduces harmful outputs but does not prevent hallucination in factual domains.
- Human review: Effective but unscalable at enterprise volume.
SignedAI adds a fourth option: algorithmic consensus across geopolitically balanced, independently trained models.
The Four Voting Algorithms
SignedAI uses four voting methods that each catch different types of errors:
1. Majority Vote
The simplest method. Each model's output is reduced to its core facts; if enough models agree (4 of 6 at Tier 6), the answer is accepted.
What it catches: Clear errors where most models have correct training data and one model has a gap.
Limitation: Consensus hallucination — when multiple models were trained on the same incorrect source.
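A minimal sketch of majority voting, assuming each model's answer has already been normalized to a comparable string. The function name and threshold argument are illustrative, not the SignedAI API:

```python
from collections import Counter

def majority_vote(answers, threshold=4):
    """Accept an answer only if at least `threshold` models
    produced it (e.g. 4 of 6 for Tier 6)."""
    answer, votes = Counter(answers).most_common(1)[0]
    if votes >= threshold:
        return answer   # consensus reached
    return None         # no consensus -> escalate

# 4 of 6 models agree, so consensus succeeds:
print(majority_vote(["X", "X", "X", "X", "Y", "Z"]))  # X
```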
2. Weighted Vote
Models are weighted by their measured accuracy for the specific task type. A model with demonstrated 94% accuracy on medical questions gets 3x the weight of a model with 74% accuracy on the same questions.
What it catches: Task-type specific errors where the primary model for a domain disagrees with generalist models.
Limitation: Only as good as the historical accuracy measurement.
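Weighted voting can be sketched as a tally where each vote counts in proportion to the model's measured accuracy for the task type; all names and weights below are illustrative:

```python
from collections import defaultdict

def weighted_vote(answers, weights):
    """Sum each answer's support, weighting every model's vote by
    its historical accuracy for the task type."""
    totals = defaultdict(float)
    for model, answer in answers.items():
        totals[answer] += weights[model]
    winner = max(totals, key=totals.get)
    share = totals[winner] / sum(weights.values())  # weighted vote share
    return winner, share

# A medical specialist model at 3x weight outvotes a lone generalist:
answers = {"med_specialist": "X", "gen_a": "Y", "gen_b": "X"}
weights = {"med_specialist": 3.0, "gen_a": 1.0, "gen_b": 1.0}
print(weighted_vote(answers, weights))  # ('X', 0.8)
```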
3. Ranked Choice Vote
Each model evaluates the outputs of all other models and ranks them. The output that is ranked highest across all models wins.
What it catches: Outputs that are individually plausible but collectively inconsistent. If Model A produces output X and all other models rank it 6th, that is a strong signal that X is wrong even if no other model produced the exact correct answer.
Limitation: Computationally expensive (each model must read all other outputs).
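One common way to aggregate cross-model rankings is a Borda-style count, sketched below under the assumption that every model ranks every candidate output (the aggregation rule is illustrative, not necessarily SignedAI's exact one):

```python
def ranked_choice(rankings):
    """Borda-style aggregation: `rankings` maps each model to its
    ordered list of candidate outputs, best first. The output with
    the lowest total rank position wins."""
    totals = {}
    for ranking in rankings.values():
        for position, output in enumerate(ranking):
            totals[output] = totals.get(output, 0) + position
    return min(totals, key=totals.get)

rankings = {
    "model_a": ["X", "Y", "Z"],
    "model_b": ["X", "Z", "Y"],
    "model_c": ["Y", "X", "Z"],
}
print(ranked_choice(rankings))  # X
```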
4. Jaccard Similarity
Measures semantic similarity (not text match) across all output pairs. High similarity = consensus on meaning. Low similarity = genuine disagreement requiring human escalation.
Formula:
Jaccard(A, B) = |A ∩ B| / |A ∪ B|
Applied to semantic embedding space, where A and B are the semantic clusters of models' outputs.
What it catches: Outputs that use different words but mean the same thing (high Jaccard = real agreement) vs outputs that sound similar but have different meaning (low Jaccard = hidden disagreement).
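A token-set version of the formula, for illustration only: the article states that SignedAI applies the same ratio in semantic embedding space, which raw token sets merely approximate.

```python
def jaccard(a, b):
    """|A intersect B| / |A union B| over two sets of items."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Same fact, different word order -> full overlap:
print(jaccard("the dose is 81 mg daily".split(),
              "daily the dose is 81 mg".split()))    # 1.0

# Different core fact -> lower score:
print(jaccard("the dose is 81 mg daily".split(),
              "the dose is 325 mg weekly".split()))  # 0.5
```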
Consensus Tiers in Practice
| Tier | Models | Threshold | Typical Latency | Use Case |
|---|---|---|---|---|
| S | 1 | N/A | <50ms (warm) / 3–5s (cold) | Chat, search, routine tasks |
| 4 | 4 (2W:2E) | 50% (2/4) | 5–10s | Standard production queries |
| 6 | 6 (3W:3E) | 67% (4/6) | 8–15s | Financial and legal outputs |
| 8 | 8 (4W:4E) | 75% (6/8) | 12–25s | Medical, compliance, critical decisions |
The geopolitical composition (Western:Eastern) ensures cultural bias cannot dominate consensus. For a Tier 6 consensus to succeed, at least 2 Western and 2 Eastern models must agree — preventing either bloc from producing false consensus.
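The Tier 6 rule combines the 4/6 threshold with the bloc-balance requirement. A sketch, with the function name and vote encoding invented for illustration:

```python
from collections import Counter

def tier6_consensus(votes):
    """Tier 6 rule: at least 4 of 6 models must agree overall,
    including at least 2 Western and 2 Eastern models.
    `votes` maps a model id to a (bloc, answer) pair."""
    counts = Counter(answer for _, answer in votes.values())
    answer, total = counts.most_common(1)[0]
    west = sum(1 for bloc, a in votes.values() if a == answer and bloc == "W")
    east = sum(1 for bloc, a in votes.values() if a == answer and bloc == "E")
    if total >= 4 and west >= 2 and east >= 2:
        return answer
    return None  # threshold or bloc balance not met -> no consensus

# 4 of 6 agree, but only 1 Eastern model is among them -> blocked:
skewed = {"w1": ("W", "X"), "w2": ("W", "X"), "w3": ("W", "X"),
          "e1": ("E", "X"), "e2": ("E", "Y"), "e3": ("E", "Y")}
print(tier6_consensus(skewed))  # None
```

Note how a purely numeric 4/6 majority is rejected when one bloc supplies almost all of the agreeing votes.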
The SignedAI Genome Explained
SignedAI is the fifth of seven genomes in the RCT Ecosystem:
- Architect Genome — Creator's DNA
- ARTENT Genome — Creation intelligence
- JITNA Genome — Protocol layer
- Codex Genome — Knowledge vault
- SignedAI Genome — Verification layer ← this article
- RCT-KnowledgeVault Genome — Memory architecture
- RCT-7 Genome — Continuous improvement
The SignedAI Genome is implemented across the HexaCore infrastructure and is triggered by the JITNA protocol for queries that meet Tier 4+ significance criteria.
Integration Points Across the Ecosystem
With JITNA
JITNA packets with intent criticality > 0.7 (high-stakes queries) automatically flag for SignedAI Tier 4+ processing. The JITNA negotiation protocol (PROPOSE → COUNTER → ACCEPT) is used to coordinate which models will participate in the consensus round.
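The routing rule (criticality > 0.7 implies Tier 4+) might look like the following; the cutoffs between Tiers 4, 6, and 8 are invented for illustration, since only the 0.7 boundary is specified here:

```python
def select_tier(criticality):
    """Map JITNA intent criticality to a SignedAI consensus tier.
    Only the 0.7 boundary comes from the spec; the higher cutoffs
    are illustrative assumptions."""
    if criticality <= 0.7:
        return "S"   # single model, no consensus round
    if criticality <= 0.8:
        return "4"
    if criticality <= 0.9:
        return "6"
    return "8"

print(select_tier(0.95))  # 8
```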
With FDIA
The SignedAI consensus result is the primary input to the I (Intent verification) component of FDIA scoring. When SignedAI achieves Tier 8 consensus (75% agreement across 8 models), the Intent verification score is set to maximum — maximizing the F output score.
With RCTDB
Every SignedAI consensus round is committed to RCTDB with:
- Which models participated
- Each model's individual output (signed with Ed25519)
- The voting method used
- The final consensus result
- Whether it passed or failed the threshold
This audit trail is available for enterprise compliance review and provides the evidence base for Section 33 PDPA explanations.
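The audit entry described above can be pictured as a record with one field per bullet; the field names below are illustrative, not the actual RCTDB schema:

```python
from dataclasses import dataclass

@dataclass
class ConsensusAuditRecord:
    """Hypothetical shape of one RCTDB audit entry."""
    models: list                  # which models participated
    outputs: dict                 # model id -> (output, Ed25519 signature)
    voting_method: str            # "majority" | "weighted" | "ranked" | "jaccard"
    consensus_result: str         # the final consensus result
    passed: bool                  # did it clear the tier threshold?

record = ConsensusAuditRecord(
    models=["model_a", "model_b"],
    outputs={"model_a": ("X", "sig_a"), "model_b": ("X", "sig_b")},
    voting_method="majority",
    consensus_result="X",
    passed=True,
)
```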
With Delta Engine
Successful SignedAI consensus results are stored in the Delta Engine at Tier Hot — meaning they are immediately available for warm recall. A Tier 8 consensus result that took 25 seconds to produce can be recalled in <50ms for all semantically similar future queries.
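Hot-tier recall can be sketched as a cache keyed on a fingerprint of the query. A real system would key on embedding similarity to catch semantic matches; a hash of the normalized query text stands in here, and the class name is invented:

```python
import hashlib

class ConsensusCache:
    """Toy hot-tier cache: store a consensus result once, recall it
    cheaply for equivalent future queries."""
    def __init__(self):
        self._hot = {}

    def _key(self, query):
        # Normalize whitespace and case, then hash. An embedding-based
        # key would also match paraphrases, which this cannot.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def put(self, query, result):
        self._hot[self._key(query)] = result

    def get(self, query):
        return self._hot.get(self._key(query))

cache = ConsensusCache()
cache.put("What is the dose?", "81 mg daily")
print(cache.get("  what is  the dose? "))  # 81 mg daily
```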
Hallucination Rate: The Math
Industry single-model hallucination rate: 12–15% (source: multiple external benchmarks, 2024–2025)
RCT Ecosystem hallucination rate: 0.3%
Why the difference:
P(hallucination with consensus) = P(enough models hallucinate identically at once)

Under an idealized assumption that models fail independently:

P(hallucination) ≈ P(single model hallucinates)^k

where k is the number of models that must agree (4 for Tier 6). For Tier 6 (6 models, 67% threshold, 3W:3E balance) with a 13% single-model rate:

P(hallucination) ≈ 0.13^4 ≈ 0.0003 (0.03%)

Real models are not independent: shared training data correlates their errors, which pushes the observed rate above this idealized bound. Geopolitical diversity exists precisely to limit that correlation, which is how the observed rate stays near 0.3%.
This is a rough approximation — real hallucination rates depend on topic domain, query specificity, and model training overlap. But it illustrates why multi-model consensus with geopolitical diversity is structurally superior to single-model or same-origin model ensembles.
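The independence bound is a one-liner; in practice correlated training data lands the observed rate above it, as noted above (the function name is illustrative):

```python
def idealized_consensus_failure(p_single, agree_needed):
    """Lower bound on consensus hallucination probability, assuming
    fully independent models: all `agree_needed` models would have
    to hallucinate the same answer at once."""
    return p_single ** agree_needed

# Tier 6: 4 of 6 models must agree; 13% single-model rate.
print(idealized_consensus_failure(0.13, 4))  # ~2.9e-4, i.e. ~0.03%
```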
Frequently Asked Questions
Does SignedAI work for every type of query?
SignedAI is most valuable for queries where factual accuracy is critical: legal, financial, medical, compliance, and technical documentation. For creative tasks (writing, brainstorming), consensus mechanics are less meaningful and Tier S (single model) is preferred.
What happens when SignedAI cannot reach consensus?
When a query fails to reach the required consensus threshold, there are three outcomes depending on configuration:
- Escalation: Route to human review (recommended for Tier 8 failures)
- Best-effort response: Return the plurality answer with a confidence warning
- Request clarification: Ask the user to provide more context (if application allows)
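The three outcomes above amount to a small dispatch on configuration. A sketch in which Tier 8 failures always escalate, per the recommendation above; a real deployment could configure this differently:

```python
def handle_consensus_failure(tier, plurality_answer, policy):
    """Dispatch a failed consensus round to one of the three
    configured outcomes. Names and return shape are illustrative."""
    if tier == "8" or policy == "escalate":
        return ("human_review", None)
    if policy == "best_effort":
        return ("respond", f"[low confidence] {plurality_answer}")
    return ("request_clarification", "Please provide more context.")

print(handle_consensus_failure("8", "X", "best_effort"))
# ('human_review', None)
```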
Can two models produce opposite answers and still reach consensus?
Only if the answers are different phrasings of the same meaning. If 4 models say "answer is X" and 2 models say "answer is Y," Jaccard similarity scoring measures the semantic distance between X and Y. If X and Y are different phrasings of the same fact, consensus is reached. If they are semantically opposite, the Jaccard score is low and consensus fails.
Summary
SignedAI brings enterprise-grade reliability to AI output through:
- 4 voting methods: Majority, Weighted, Ranked Choice, Jaccard Similarity
- 4 consensus tiers: S / 4 / 6 / 8 (by model count and agreement threshold)
- Geopolitical balance: 3W:3E ensures cultural bias cannot dominate consensus
- Full audit trail: Every consensus round committed to RCTDB with Ed25519 signatures
- 0.3% hallucination rate: vs industry 12–15% (a reduction of more than 95%)
Because SignedAI consensus results are cached in the Delta Engine, only the first consensus computation is expensive. Every subsequent semantically similar query is answered from cache in <50ms at near-zero cost.
This article was written by Ittirit Saengow, founder and sole developer of RCT Labs.
What enterprise teams should retain from this briefing
SignedAI is the multi-model consensus verification system of the RCT Ecosystem. Instead of trusting a single AI model's output, SignedAI routes critical queries through 4-8 models simultaneously and requires formal agreement before any result is released — reducing hallucination by 95% vs single-model systems.
Ittirit Saengow
Primary author: Ittirit Saengow (อิทธิฤทธิ์ แซ่โง้ว) is the founder, sole developer, and primary author of RCT Labs — a constitutional AI operating system platform built independently from architecture through publication. He conceived and developed the FDIA equation (F = (D^I) × A), the JITNA protocol specification (RFC-001), the 10-layer architecture, the 7-Genome system, and the RCT-7 process framework. The full platform — including bilingual infrastructure, enterprise SEO systems, 62 microservices, 41 production algorithms, and all published research — was built as a solo project in Bangkok, Thailand.