How Would I Buy? reads a landing page
Five stages, run in a fixed order. Nine scored signals. Around 2 minutes of real analysis, with evidence anchored to your actual page content, not to general web statistics.
The analysis pipeline
- 01
Browser capture
A headless Chromium instance renders your page at desktop resolution (1440×1800 px) and mobile (390×844 px). JavaScript executes, dynamic content loads, lazy images resolve. We capture JPEG screenshots of both viewports and record the bounding boxes of every meaningful element — headings, hero, CTAs, forms, testimonials, images. This is what a real visitor sees, not a raw HTML scrape.
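A minimal sketch of what a capture step like this could look like, assuming Playwright for Python as the browser driver (the page above does not name the library actually used). The viewport sizes come from the description; `ELEMENT_SELECTORS`, `capture`, and the output file names are hypothetical, and mobile emulation details such as device scale factor are left out.

```python
from typing import Any

# Viewport sizes taken from the stage description.
VIEWPORTS = {
    "desktop": {"width": 1440, "height": 1800},
    "mobile": {"width": 390, "height": 844},
}

# Hypothetical selector list; the real definition of a "meaningful element" is not published.
ELEMENT_SELECTORS = "h1, h2, h3, a, button, form, img"


def capture(url: str, out_prefix: str = "capture") -> dict[str, list[dict[str, Any]]]:
    """Render `url` at each viewport, save a JPEG, and return element bounding boxes."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    boxes: dict[str, list[dict[str, Any]]] = {}
    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        for name, viewport in VIEWPORTS.items():
            page = browser.new_page(viewport=viewport)
            # Wait for network activity to settle so scripts and lazy images can load.
            page.goto(url, wait_until="networkidle")
            page.screenshot(path=f"{out_prefix}-{name}.jpg", type="jpeg", quality=80)
            boxes[name] = []
            for element in page.locator(ELEMENT_SELECTORS).all():
                box = element.bounding_box()  # None when the element is not visible
                if box is not None:
                    tag = element.evaluate("el => el.tagName.toLowerCase()")
                    boxes[name].append({"tag": tag, **box})
            page.close()
        browser.close()
    return boxes
```

Calling `capture("https://example.com")` would write `capture-desktop.jpg` and `capture-mobile.jpg` and return one list of boxes per viewport, each box carrying `x`, `y`, `width`, and `height` in CSS pixels.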
- 02
Deterministic rules pass
Before any LLM is involved, a structured parser counts and classifies elements: number of CTAs and their placement, form field count, heading hierarchy (H1–H3 presence and uniqueness), number of testimonials and whether they are attributed, presence of trust badges and security signals. These counts anchor the scoring stage — the LLM cannot hallucinate a CTA count that the parser has already measured.
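To make the idea concrete, here is a small sketch of a deterministic counting pass built on Python's standard `html.parser`. It is an illustration, not the production parser: the CTA heuristic (`CTA_CLASS_HINTS`), the class name `PageFacts`, and the function `measure` are assumptions, and it covers only headings, CTAs, and form fields.

```python
from html.parser import HTMLParser

# Hypothetical heuristic: a link counts as a CTA when its class mentions one of these.
CTA_CLASS_HINTS = ("cta", "btn", "button")
# <input> types a visitor never fills in, so they are not counted as form fields.
NON_FIELD_INPUT_TYPES = {"hidden", "submit", "button", "reset", "image"}


class PageFacts(HTMLParser):
    """Counts headings, CTAs, and fillable form fields in a single pass."""

    def __init__(self) -> None:
        super().__init__()
        self.headings = {"h1": 0, "h2": 0, "h3": 0}
        self.cta_count = 0
        self.form_field_count = 0

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag in self.headings:
            self.headings[tag] += 1
        elif tag == "button":
            self.cta_count += 1
        elif tag == "a":
            css_class = attr.get("class", "").lower()
            if any(hint in css_class for hint in CTA_CLASS_HINTS):
                self.cta_count += 1
        elif tag in ("select", "textarea"):
            self.form_field_count += 1
        elif tag == "input":
            if attr.get("type", "text").lower() not in NON_FIELD_INPUT_TYPES:
                self.form_field_count += 1


def measure(html: str) -> dict:
    """Return the measured facts that a later scoring stage can be anchored to."""
    parser = PageFacts()
    parser.feed(html)
    parser.close()
    return {
        "headings": parser.headings,
        "has_single_h1": parser.headings["h1"] == 1,
        "cta_count": parser.cta_count,
        "form_field_count": parser.form_field_count,
    }
```

Because the same HTML always produces the same counts, the output of `measure` can be handed to the scoring stage as fixed facts rather than something the model has to estimate.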
- 03
Classification
An LLM identifies the page's primary purpose (direct sales, lead capture, demo request, SaaS trial, ecommerce product, agency/service, or content), primary audience persona, and core value proposition. This classification determines how the nine signal scores are weighted — a lead gen page is penalised more heavily for friction than a blog post.
- 04
Nine-signal scoring
A second, independent LLM pass scores each of the nine signals on a scale of 1 to 10. For each signal it must provide: a score, a confidence percentage, 1–3 specific strengths with evidence quotes from the page, 1–3 specific issues with evidence, and a rationale. Evidence must be traceable to actual page content: the model is instructed to quote or describe specific elements, not make general assertions.
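The per-signal output described here maps naturally onto a small schema with validation. The sketch below is one possible shape, assuming the model's response has already been parsed into Python objects; the names `Finding`, `SignalScore`, and `validate` are hypothetical and not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    claim: str
    evidence: str  # quote or description of a specific page element


@dataclass
class SignalScore:
    signal: str
    score: int        # 1 to 10
    confidence: int   # percentage, 0 to 100
    strengths: list[Finding]
    issues: list[Finding]
    rationale: str

    def validate(self) -> list[str]:
        """Return a list of rule violations; an empty list means the result is acceptable."""
        errors = []
        if not 1 <= self.score <= 10:
            errors.append("score must be between 1 and 10")
        if not 0 <= self.confidence <= 100:
            errors.append("confidence must be between 0 and 100")
        for label, findings in (("strengths", self.strengths), ("issues", self.issues)):
            if not 1 <= len(findings) <= 3:
                errors.append(f"{label} must contain 1 to 3 items")
            for finding in findings:
                if not finding.evidence.strip():
                    errors.append(f"{label} item lacks evidence: {finding.claim!r}")
        if not self.rationale.strip():
            errors.append("rationale is required")
        return errors
```

A result that fails validation could be rejected or re-requested before it reaches the report, which is one way to enforce the "no general assertions" rule mechanically.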
- 05
Buyer narrative
A third pass writes the report as a buyer verdict: what would cause this visitor to convert, what would make them leave, and what single fix would have the largest impact. The narrative synthesises the nine signal scores into a human-readable story. The report concludes with a verdict (Strong Buy → Hard Pass) and a ranked action plan with specific next steps.
The nine conversion signals
Each signal is scored independently, then weighted by the page's classified purpose. Weights are set by conversion research: friction matters more on a checkout than on a blog; credibility matters more on a £2,000 service than on a free tool.
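As a worked illustration of purpose-dependent weighting, the sketch below combines signal scores as a weighted mean. Both the combining rule and every number in `WEIGHTS` are assumptions made for the example; the actual weights and aggregation method are not stated here. Only three of the nine signals are shown to keep the arithmetic easy to follow.

```python
# Hypothetical weights for illustration only; the real values are not published.
WEIGHTS = {
    "lead_capture": {"Clarity": 3.0, "Friction": 3.0, "Credibility": 2.0},
    "content": {"Clarity": 3.0, "Friction": 1.0, "Credibility": 1.0},
}


def weighted_score(scores: dict[str, float], page_purpose: str) -> float:
    """Weighted mean of the signal scores, using the weights for the classified purpose."""
    weights = WEIGHTS[page_purpose]
    total_weight = sum(weights[signal] for signal in scores)
    return sum(scores[signal] * weights[signal] for signal in scores) / total_weight


scores = {"Clarity": 8, "Friction": 4, "Credibility": 6}
lead_capture_result = weighted_score(scores, "lead_capture")  # (24 + 12 + 12) / 8 = 6.0
content_result = weighted_score(scores, "content")            # (24 + 4 + 6) / 5 = 6.8
```

The same three scores produce a lower result for the lead capture page than for the content page, because the poor friction score carries three times the weight there.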
Instant Appeal
High on direct sales, medium on content
First impression within 5 seconds. Scores the visual hierarchy, hero statement strength, and whether the above-the-fold content communicates a clear, credible value proposition without requiring the visitor to scroll or think.
Evidence sources
- Hero headline specificity
- Visual contrast and layout clarity
- Presence and placement of a primary CTA above the fold
Clarity
Highest weight across all page types
Whether a first-time visitor can answer three questions without scrolling: what is this, who is it for, and what should I do. Scores copy precision, absence of jargon, and unambiguous offer articulation.
Evidence sources
- Product/service description completeness
- Audience specificity
- Action clarity of primary CTA
Information Architecture
Medium on all page types
Logical page flow from problem to solution to proof to action. Scores heading hierarchy, section progression, and whether the page answers questions in the order a skeptical buyer would ask them.
Evidence sources
- Heading structure (H1–H3)
- Section sequence logic
- Scannability and content density
Credibility
High on lead gen and ecommerce; very high on high-ticket
Social proof density and quality. Scores testimonials (specificity, attribution), case studies, logos, reviews, certifications, press mentions, and security / privacy signals.
Evidence sources
- Number of testimonials and their specificity
- Named vs anonymous attribution
- Logos, certifications, security badges
Message Consistency
Medium across all page types
Whether the headline, hero, body copy, supporting visuals, and CTA all reinforce a single coherent message. Scores for drift, contradiction, and mixed audiences.
Evidence sources
- Headline-to-body alignment
- Consistent persona targeting
- CTA copy alignment with offer framing
Motivation
High on direct sales and trial pages
Emotional resonance and desire creation. Scores the articulation of outcome-level benefits (not just features), urgency, aspiration, and the ability to make the visitor feel the gap between their current state and what is possible.
Evidence sources
- Outcome vs feature framing
- Urgency signals (scarcity, deadlines)
- Emotional language and benefit specificity
Friction
Highest weight on conversion pages
Barriers between a willing visitor and conversion. Scores form complexity, commitment-level mismatch, cognitive load from jargon or excessive choice, and navigation elements that pull attention away from the primary action.
Evidence sources
- Form field count relative to offer value
- Number of competing CTAs
- Jargon density
- Navigation complexity
CTA Strength
High on all pages with a conversion goal
Visibility, specificity, and persuasiveness of the primary call to action. Scores placement (above fold, end of page, sticky), copy specificity (vague "Submit" vs specific "Start my free trial"), and visual prominence.
Evidence sources
- Above-fold CTA presence
- CTA copy specificity
- Button contrast and size
- Repetition and placement across page
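Above-fold CTA presence is the kind of evidence that can be derived directly from captured bounding boxes. A minimal sketch, assuming each box is a dictionary with `y` and `height` in CSS pixels measured from the top of the page, and that the fold height is supplied by the caller (the function names are hypothetical):

```python
def is_above_fold(box: dict, fold_height: float) -> bool:
    """True when the element is fully visible without scrolling."""
    return box["y"] >= 0 and box["y"] + box["height"] <= fold_height


def has_above_fold_cta(cta_boxes: list[dict], fold_height: float) -> bool:
    """True when at least one CTA sits entirely above the fold."""
    return any(is_above_fold(box, fold_height) for box in cta_boxes)
```

Requiring the whole button to fit above the fold is a deliberately strict choice for this example; a looser rule could accept a button that is only partly visible.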
Mobile UX
High when mobile traffic share exceeds 50%
Quality of the mobile experience. Scores layout reflow, touch target size, font legibility at mobile scale, mobile-specific CTA placement, and whether key content is hidden or truncated on small screens.
Evidence sources
- Mobile hero layout
- Touch target sizing
- Font size at 390px viewport
- CTA accessibility on mobile
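Touch target sizing can be checked mechanically from the mobile bounding boxes. The sketch below flags interactive elements smaller than a minimum size; the 44 px default follows WCAG success criterion 2.5.5 (Target Size, Enhanced) and is an assumption, since the threshold this tool applies is not stated. The function name and box format are hypothetical.

```python
# Assumption: 44 CSS px per side, following WCAG 2.5.5 (Enhanced).
MIN_TARGET_PX = 44


def undersized_targets(boxes: list[dict], min_px: float = MIN_TARGET_PX) -> list[dict]:
    """Return the boxes whose width or height falls below the minimum target size."""
    return [box for box in boxes if box["width"] < min_px or box["height"] < min_px]
```

Passing `min_px=24` would apply the looser WCAG 2.5.8 (Target Size, Minimum) threshold instead.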
Why deterministic + LLM?
Pure LLM analysis is fast but unreliable on countable facts. An LLM asked to count CTAs may hallucinate; a structured parser cannot. The deterministic pass gives the scoring stage a factual anchor — CTA count, form fields, heading structure — that constrains the LLM's room for invention. The LLM adds what parsers cannot: interpretation of persuasive strength, emotional resonance, and argument logic.
Separating classification from scoring prevents circular reasoning. The classifier fixes the page type before the scorer runs, so the weights are applied consistently based on what the page is, not what the scorer thinks it should be.
Evidence requirements at the scoring stage mean every finding can be traced to specific content on your page. When a report says “friction is high because the form has nine fields before the user understands the value,” that reflects a form field count from the deterministic pass, not a model assumption.
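One way to make traceability checkable is to test each quoted piece of evidence against the rendered page text, and each numeric claim against the measured facts. The sketch below assumes quotes are meant to match verbatim apart from whitespace and letter case; the function names and the `facts` dictionary layout are hypothetical.

```python
import re


def normalise(text: str) -> str:
    """Collapse whitespace and ignore case so layout differences do not break matching."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def is_traceable(quote: str, page_text: str) -> bool:
    """True when a non-empty quote appears in the page text after normalisation."""
    needle = normalise(quote)
    return bool(needle) and needle in normalise(page_text)


def count_matches_facts(facts: dict, key: str, claimed: int) -> bool:
    """True when a count cited in a finding equals the deterministically measured value."""
    return facts.get(key) == claimed
```

A finding such as "the form has nine fields" would pass only if `count_matches_facts(facts, "form_field_count", 9)` holds for the counts produced by the deterministic pass.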
What this is not
- Not a substitute for user testing. This is a structured first-pass audit. It surfaces likely friction and missing trust signals. It cannot tell you whether your audience has the problem you think they have, or whether they interpret your language the way you intend.
- Not a performance audit. We measure conversion signals, not page speed, Core Web Vitals, or technical SEO. Use Lighthouse or PageSpeed Insights for those.
- Not a design review. Aesthetic judgement is outside scope. Visual quality contributes to instant appeal scores only insofar as it affects perceived credibility and clarity, not as a subjective design preference.
- Not infallible. The scorer is an LLM operating under strict evidence constraints. It can miss context that is obvious to a domain expert. Treat findings as structured hypotheses, not ground truth.
See the analysis in action
Paste any public URL and get a full scored report in around 2 minutes.