Benchmark Summary

A public-safe benchmark summary that keeps methods and caveats visible

The benchmark summary acts as the validation layer of the resource system, making clear which numbers are public-safe, how they were measured, and what limitations apply to each metric.

Benchmark framing · Public-safe metrics · Measurement notes · Caveat disclosure
Hallucination: 0.3% (controlled enterprise workload)
FDIA accuracy: 0.92 (benchmark factual QA)
Warm recall: <50 ms (hot-zone cache hits)
Test result: 4,849 / 0 / 0 (pass / fail / error)

Metrics

The primary metrics, with the methods behind them kept visible

Use this summary to read the key metrics alongside routes into the layers that explain the method and the supporting architecture.
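
To make the measurement concrete, the sketch below shows one way headline numbers of this shape can be rolled up from raw run records. It is a minimal illustration only: the `RunRecord` schema, the field names, and the percentile choice for warm recall are assumptions for this page, not the project's actual harness.

```python
# Hypothetical sketch: rolling raw run records up into public-safe headline
# metrics. The record schema and all names here are illustrative assumptions.
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class RunRecord:
    hallucinated: bool       # graded against the controlled enterprise workload
    factually_correct: bool  # graded against the factual-QA benchmark
    recall_ms: float         # latency of a hot-zone cache hit
    outcome: str             # "pass", "fail", or "error"


def summarize(records: list[RunRecord]) -> dict:
    n = len(records)
    hallucination_rate = sum(r.hallucinated for r in records) / n
    fdia_accuracy = sum(r.factually_correct for r in records) / n
    # Report a high percentile rather than the mean, so a "<50 ms" claim
    # describes typical worst-case warm recall, not an average.
    p95_recall_ms = quantiles([r.recall_ms for r in records], n=20)[18]
    counts = {k: sum(r.outcome == k for r in records)
              for k in ("pass", "fail", "error")}
    return {
        "hallucination": f"{hallucination_rate:.1%}",
        "fdia_accuracy": round(fdia_accuracy, 2),
        "warm_recall_p95_ms": round(p95_recall_ms, 1),
        "test_result": f"{counts['pass']} / {counts['fail']} / {counts['error']}",
    }
```

Under this reading, a result like "4,849 / 0 / 0" is simply the pass/fail/error tally over the same record set that produced the other metrics.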

Environment

The test environment that can be disclosed publicly

This provides the minimum context needed to interpret the numbers; it is not a full dossier for every workload.

Version: v5.4.5
Test date: March 21, 2026
OS: Linux x86_64
Node.js: 22.x LTS
Test runner: pytest + Hypothesis
CI/CD: GitHub Actions
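
As context for the "pytest + Hypothesis" entry, a property-based test in that style looks like the sketch below. The function under test and the property it checks are illustrative, not drawn from the real suite.

```python
# Illustrative pytest + Hypothesis property test, in the style the
# environment table describes. The tested function is hypothetical.
from hypothesis import given, strategies as st


def normalize(text: str) -> str:
    # Hypothetical function under test: collapse runs of whitespace.
    return " ".join(text.split())


@given(st.text())
def test_normalize_is_idempotent(text):
    # Property: normalizing twice gives the same result as normalizing once,
    # for any string Hypothesis generates.
    once = normalize(text)
    assert normalize(once) == once
```

Hypothesis generates and shrinks inputs automatically, which is why a run can report thousands of passing cases from a comparatively small test file.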
Next routes

The next pages to use when interpreting the benchmark

Reading the benchmark without the methodology and evaluation pages is incomplete; on their own, the numbers lose their decision context.

The benchmark summary should always be read alongside its methodology and evaluation

The numbers on this page are for framing and validation; they are not a substitute for the full decision process. Pair them with the methodology, the whitepapers, and the evaluation hub before drawing business or procurement conclusions.