Meet the Visual Perception Model.
VPMs are to UX what LLMs are to language.
A new class of AI, trained on human visual cognition — not the optical capabilities of the eye, but how the brain interprets and acts on what it sees. Benchmark is the first product built on a VPM: competitive UX audits, A/B testing, and redesign deliverables, all powered by a model that interprets your UI the way a real user will.
A working behavioral simulator. Feed it any persona, any task, any URL, and it hands back a step-by-step account of what happens: the page where the persona abandoned, where they got lost, what specifically blocked them, and how long each step took.
AI parallelism is exactly the structural advantage human-panel testing can’t touch. Personas × tasks × flows is a three-dimensional matrix; a human panel can sample only one cell at a time, while an AI agent fleet can saturate the whole matrix in the same audit window.
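To make the matrix concrete, here is a minimal sketch of enumerating every persona × task × flow cell and dispatching the cells in parallel. It is illustrative only and not Benchmark's implementation: the names `build_matrix`, `run_cell`, and `run_matrix`, the sample personas, tasks, and flows, and the thread-pool dispatch are all assumptions, and `run_cell` is a placeholder where a real agent would drive the flow.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_matrix(personas, tasks, flows, repeats=1):
    """Return every (persona, task, flow, run) cell of the audit matrix."""
    return [
        (persona, task, flow, run)
        for persona, task, flow in product(personas, tasks, flows)
        for run in range(1, repeats + 1)
    ]


def run_cell(cell):
    """Hypothetical placeholder: a real agent would walk the flow here."""
    persona, task, flow, run = cell
    return {"persona": persona, "task": task, "flow": flow, "run": run}


def run_matrix(cells, max_workers=8):
    """Dispatch all cells concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_cell, cells))


# Sample inputs, invented for illustration.
personas = ["first-time visitor", "returning customer"]
tasks = ["sign up", "check out"]
flows = ["your product", "competitor A", "competitor B"]

cells = build_matrix(personas, tasks, flows, repeats=3)
results = run_matrix(cells)
print(len(cells))  # 2 personas * 2 tasks * 3 flows * 3 runs = 36
```

The cell count grows multiplicatively with each dimension, which is why sampling one cell at a time falls behind so quickly.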
Existing tools miss the point.
You can’t benchmark your UX against competitors using tools built to test your own product. Here’s what’s currently in the gap.
Three differentiators that define the category.
Visual Perception Model — VPM.
The VPM is a model of interpretation. Every other AI in this category processes a screen; the VPM processes how a brain processes a screen — the order it attends to elements, the inferences it draws about what each element is for, the friction it generates when a layout fights the user’s intent. Three things make that work: the research foundation it inherits, the corpus it’s trained against, and the persona-conditioning mechanism that runs at inference time.
Research lineage, Aegis-warships origin, and the three pillars in depth — npire.net/vpm
VPM-driven A/B testing, without the live deployment.
A/B tests today mean deploying the worse variant to half your users for weeks while you wait for statistical significance. The VPM runs the same experiment internally — both variants, with persona-modeled human responses — in minutes. Same outcome signal, none of the live exposure or the wait.
From brief to report.
Define personas and tasks.
Multiple templates per persona, multiple per task. We lock the matrix before any testing begins.
AI executes the matrix in parallel.
An agent fleet runs every cell — each persona on each task on each flow — three times, using only what the persona knows.
Human review gates.
Any uncertainty pauses the run. A human resolves it before scoring. No flow is penalized for edge cases.
UCI scoring and analysis.
Each stage scored. Each friction event logged. Cross-flow comparison built. Findings ranked by impact.
Deliverables packaged.
Slide deck, interactive flow diagram, written report, archived audit record. Ready to share with leadership.
A standard you can cite.
A minimal flow scores under 15. A critical flow scores above 50. Unlike subjective usability ratings, UCI is formula-driven, reproducible, and directly comparable across audits, competitors, and time.
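As a small sketch of how the two stated thresholds could be applied when comparing flows: the function name `uci_band` is hypothetical, the UCI formula itself is not reproduced here, and the text names no band for scores from 15 through 50, so that range is reported as unlabeled rather than guessed.

```python
def uci_band(score: float) -> str:
    """Map a UCI score to the bands named in the text."""
    if score < 15:
        return "minimal"
    if score > 50:
        return "critical"
    # The source does not name the band between the two thresholds.
    return "unlabeled"


# Invented scores, purely to show the comparison across flows.
scores = {"your onboarding": 12.0, "competitor onboarding": 57.5}
bands = {flow: uci_band(score) for flow, score in scores.items()}
print(bands)
```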
Six deliverables in every audit.
Per audit. No platform fees. No annual contract required.
VPM-driven A/B testing
Run paired-variant A/B (or A/B/n) simulations through the VPM. Per-persona signal, results in minutes, no live deployment.
If your team has ever debated how a competitor’s onboarding actually compares to yours and ended the conversation with “I think it’s faster” — Benchmark exists to settle it.