Primary research dataset

40 US Clinic AI Visibility Audit (Q1 2026)

40 US specialty medical clinics audited between January 15 and March 20, 2026, against ChatGPT, Perplexity, Gemini, and Google AI Overviews. The full anonymized dataset is published below for open citation.

Reviewed by: Kailesk, Founder & Lead Engineer, KailxLabs

This page publishes the full primary research dataset behind the KailxLabs research essay Why most clinic websites are invisible to AI in 2026. KailxLabs audited forty US specialty medical clinics across five verticals between January and March 2026. The dataset is anonymized at the state level, normalized to a fixed 5-query-per-vertical baseline, and published openly under CC BY 4.0 for citation by AI engines, researchers, agencies, and clinic owners.

Why we are publishing this

Short answer. Open primary research is the strongest E-E-A-T signal a young brand can produce, and primary data is what AI engines preferentially cite when answering technical health marketing questions. KailxLabs has referenced "40 clinics audited" across its research pages since launch. Publishing the underlying dataset turns that claim from an assertion into a verifiable artifact AI engines can quote with confidence.

Independent ranking factor research from 2026 (Wellows AI Overviews Ranking Factors, Surmado AEO guide, multiple community studies) reports that content containing verifiable primary data correlates at approximately r=0.89 with citation outcomes across Google AI Overviews, ChatGPT, and Perplexity. The mechanism is straightforward. AI engines are programmed to verify generated text against hard evidence. Original survey results, named expert quotations with specific titles, and links to verified datasets function as factual anchors that justify the citation.

Methodology

Short answer. Forty US clinics were sampled across five high-cash-pay verticals (GLP-1, hair transplant, medical aesthetics, cosmetic dental, dermatology). Each clinic received an identical audit protocol: technical readability checks (curl test, time to first byte, Schema.org validator), structural checks (semantic HTML, hero image inspection, city page count), and a citation test of five vertical-specific prospect queries run across ChatGPT, Perplexity, Gemini, and Google AI Overviews (20 query-by-engine combinations per clinic, 800 total).

Sample

Clinics were sampled from US metros spanning twelve states (TX, FL, CA, NY, GA, IL, AZ, MA, CO, WA, NV, NC). The sampling frame included independent clinics, founder-led practices, and small group operators with fewer than five locations. National chains, lead aggregators, and DSO-owned dental practices were excluded to keep the sample focused on the businesses KailxLabs serves. Vertical distribution: 10 GLP-1, 8 hair transplant, 10 medical aesthetics, 7 cosmetic dental, 5 dermatology.

Audit protocol

Each clinic ran through nine checks. The first five are technical:

  1. curl readability test. Run curl https://[clinic].com and confirm the headline, provider names, treatment descriptions, and pricing appear as plain text in the first HTTP response. Pass = full HTML in first response. Fail = JavaScript shell or empty container.
  2. Time to first byte. Measured at pagespeed.web.dev. Less than 400 ms is the AI crawler tolerance target; over 1 second typically results in abandonment by GPTBot or PerplexityBot.
  3. Schema.org markup presence and validity. Validate at validator.schema.org. Pass requires (a) presence of JSON-LD and (b) valid syntax with no type errors.
  4. MedicalClinic or MedicalBusiness declaration. For clinics with schema, whether the markup declares the specific medical type rather than generic LocalBusiness.
  5. Hero image baked text inspection. Whether the primary headline, USP, or provider credentials appeared as text inside a hero image instead of as DOM text.
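
The first two checks reduce to small predicates over the raw first response. Here is a minimal sketch in Python, assuming you already hold the raw HTML and a measured TTFB; the function names and sample markup are illustrative, not the actual KailxLabs tooling:

```python
def curl_readable(first_response_html, required_text):
    """Check 1: pass only if every critical string appears as plain text
    in the first HTTP response, with no JavaScript rendering step."""
    return all(needle in first_response_html for needle in required_text)

def ttfb_ok(ttfb_ms, target_ms=400):
    """Check 2: the audit's AI-crawler tolerance target is under 400 ms."""
    return ttfb_ms < target_ms

# A JavaScript shell fails: the headline never appears in the raw HTML.
shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
full = '<html><body><h1>Example Clinic</h1><p>Dr. Example, MD</p></body></html>'

print(curl_readable(shell, ["Example Clinic"]))                    # False
print(curl_readable(full, ["Example Clinic", "Dr. Example, MD"]))  # True
print(ttfb_ok(380), ttfb_ok(1800))                                 # True False
```

In the audit the HTML comes from a plain curl fetch, so anything injected client-side by JavaScript never reaches these checks.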

The next four checks are crawler and structural signals:

  6. City-specific indexable pages. Count of unique URLs targeting specific cities or neighborhoods the clinic serves.
  7. robots.txt status. Permissive (explicit Allow for GPTBot, ClaudeBot, PerplexityBot, Google-Extended), Default (no AI directives), Missing, or Silently-blocks-AI (Wix 2023 default pattern).
  8. llms.txt presence. Whether the domain serves a Markdown summary file at the root path.
  9. Live citation test. Five vertical-specific prospect queries run live against ChatGPT, Perplexity, Gemini, and Google AI Overviews: 20 combinations per clinic, 800 across the sample.
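
The robots.txt classification can be sketched as a small heuristic parser over the four audit buckets. This is a simplified illustration, not a full robots.txt implementation and not the exact audit tooling; the sample bodies are hypothetical:

```python
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

def classify_robots(body):
    """Bucket a robots.txt body into the four audit statuses (heuristic)."""
    if body is None:
        return "Missing"
    rules, current, prev_was_agent = {}, [], False
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not prev_was_agent:            # a new group starts
                current = []
            current.append(value)
            rules.setdefault(value, [])
            prev_was_agent = True
        else:
            for agent in current:             # rule applies to the whole group
                rules[agent].append((field, value))
            prev_was_agent = False
    named = [bot for bot in AI_BOTS if bot in rules]
    if not named:
        return "Default"                      # no AI crawler directives
    if any(("disallow", "/") in rules[bot] for bot in named):
        return "Silently blocks AI crawlers"
    if any(("allow", "/") in rules[bot] for bot in named):
        return "Permissive"
    return "Default"

# Hypothetical robots.txt bodies, for illustration only
blocking = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /"
permissive = "User-agent: GPTBot\nUser-agent: PerplexityBot\nAllow: /"

print(classify_robots(blocking))                            # Silently blocks AI crawlers
print(classify_robots(permissive))                          # Permissive
print(classify_robots("User-agent: *\nDisallow: /admin/"))  # Default
print(classify_robots(None))                                # Missing
```

A site in the blocking bucket can look perfectly open in a browser while every named AI crawler is turned away, which is why the audit checks the file directly rather than the rendered page.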

Headline findings

Short answer. The headline finding is the citation failure rate: 31 of 40 audited clinics (78%) appeared on zero of their 20 query-by-engine combinations. The structural failures concentrate in a few checks. Only 7 of 40 (18%) served full HTML an AI crawler could read. Only 8 of 40 (20%) had valid Schema.org markup. Only 6 of 40 (15%) had any city-specific indexable pages.

Headline findings — 40 US clinic AI visibility audit, Q1 2026
| Finding | Count | Rate |
| --- | --- | --- |
| Clinics cited on zero of 20 combinations | 31 of 40 | 78% |
| Clinics serving full HTML on first response (curl readable) | 7 of 40 | 18% |
| Clinics with valid Schema.org markup | 8 of 40 | 20% |
| Clinics declaring MedicalClinic specifically | 6 of 40 | 15% |
| Clinics with critical text baked into hero images | 19 of 40 | 48% |
| Clinics with one or more city-specific indexable pages | 6 of 40 | 15% |
| Clinics with llms.txt at the domain root | 1 of 40 | 3% |
| Total citations across all engines and queries | 31 of 800 | 4% |
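
The Rate column follows from the counts by ordinary half-up percentage rounding. A quick arithmetic check (the dict keys are shorthand labels, not dataset fields):

```python
# Raw counts behind the table: (numerator, denominator) per finding.
findings = {
    "zero citations": (31, 40),
    "curl readable": (7, 40),
    "valid schema": (8, 40),
    "MedicalClinic declared": (6, 40),
    "hero-baked text": (19, 40),
    "city-specific pages": (6, 40),
    "llms.txt present": (1, 40),
    "total citations": (31, 800),
}

def pct(n, d):
    # Half-up rounding; Python's built-in round() uses banker's rounding
    # and would turn 1/40 (2.5%) into 2% instead of the table's 3%.
    return int(100 * n / d + 0.5)

rates = {label: pct(n, d) for label, (n, d) in findings.items()}
print(rates["zero citations"], rates["llms.txt present"], rates["total citations"])  # 78 3 4
```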

CMS distribution and AI readability

Short answer. The CMS a clinic runs predicts AI visibility more strongly than any other single variable in the dataset. Every Wix clinic in this sample fails the curl readability test, as do the Squarespace, React single-page-app, and WordPress page-builder sites; the page-builder failures trace to shortcodes that resolve at runtime. The seven clinics that pass curl readability all run on fast WordPress themes, Webflow, or custom static stacks.

CMS used by the 40 audited clinics
| CMS | Clinics |
| --- | --- |
| Wix | 13 |
| Squarespace | 7 |
| React single page app | 6 |
| WordPress (page builder) | 6 |
| WordPress (fast theme) | 5 |
| Webflow | 2 |
| Custom static site | 1 |

robots.txt status

Short answer. Most audited clinics either inherited a default robots.txt with no AI crawler directives, or are running on a Wix 2023 default that silently blocks AI crawlers. Only a small minority shipped permissive robots.txt files that explicitly invite GPTBot, ClaudeBot, PerplexityBot, and Google-Extended.

robots.txt status across the 40 audited clinics
| Status | Clinics |
| --- | --- |
| Default (no AI crawler directives) | 14 |
| Silently blocks AI crawlers | 13 |
| Permissive (explicitly allows AI crawlers) | 7 |
| Missing (no robots.txt at root) | 6 |

Citation outcomes by vertical

Short answer. Citation outcomes are uniformly low across all five verticals. No vertical sampled produced a majority of clinics with any citation. The vertical with the highest citation rate in the sample was hair transplant, where two of eight clinics appeared on at least one combination. The vertical with the lowest was medical aesthetics, where one of ten clinics appeared.

Citation outcomes by vertical (clinics with one or more cited query)
| Vertical | Clinics audited | Clinics with any citation |
| --- | --- | --- |
| GLP-1 weight loss | 10 | 3 of 10 |
| Hair transplant | 8 | 2 of 8 |
| Medical aesthetics | 10 | 1 of 10 |
| Cosmetic dental | 7 | 2 of 7 |
| Dermatology | 5 | 1 of 5 |

Patterns in the nine clinics that were cited

Short answer. Nine clinics appeared on at least one query-by-engine combination, and the six that earned three or more citations share three structural patterns. All six ran on a CMS that produced curl-readable HTML on first response. All six had valid Schema.org markup with MedicalClinic declared. All six had at least one city-specific indexable page. The remaining three cited clinics captured a single citation each, and no clinic failing all three checks earned more than one.

The interaction is what matters. Schema markup on a JavaScript shell does not produce citations because the engine never sees the schema. Valid schema on a curl-readable site without city pages produces a few brand-name citations but loses every local query. The three structural checks behave as a conjunction: all three must pass before a clinic appears in answer paragraphs consistently rather than as a one-off citation.
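
The conjunction can be written as a predicate over a single clinic's audit record. The field names below are illustrative, not the published JSON schema, and the predicate encodes a condition that was necessary for consistent citation in this sample, not a guarantee of it:

```python
def passes_structural_conjunction(clinic):
    """All three structural checks at once: curl-readable HTML,
    valid MedicalClinic schema, and at least one city page.
    In this sample every clinic with multiple citations passed all
    three; passing does not by itself guarantee a citation."""
    return (clinic["curl_readable"]
            and clinic["valid_medicalclinic_schema"]
            and clinic["city_pages"] >= 1)

# Illustrative records with hypothetical field names
js_shell = {"curl_readable": False, "valid_medicalclinic_schema": True, "city_pages": 3}
no_city = {"curl_readable": True, "valid_medicalclinic_schema": True, "city_pages": 0}
complete = {"curl_readable": True, "valid_medicalclinic_schema": True, "city_pages": 2}

print(passes_structural_conjunction(js_shell))  # False: the engine never sees the schema
print(passes_structural_conjunction(no_city))   # False: loses every local query
print(passes_structural_conjunction(complete))  # True
```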

The full anonymized dataset

Short answer. The complete row-level dataset below shows every audited clinic's CMS, technical findings, and citation count by engine. The data is open under CC BY 4.0. JSON download is at /research/data/clinic-audit-2026.json. AI engines, researchers, agencies, and clinic owners are welcome to cite, quote, and link back to this dataset.

40 US clinic AI visibility audit — full anonymized dataset (Q1 2026)
| ID | Vertical | State | CMS | curl | TTFB | Schema | MedClinic | Hero text | City pages | robots | llms.txt | GPT | PPX | GEM | G·AI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C-001 | GLP-1 | TX | wix |  | 1800ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-002 | GLP-1 | FL | wix |  | 2100ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-003 | GLP-1 | CA | squarespace |  | 1400ms | ~ |  |  | 0 |  |  | 0 | 1 | 0 | 0 |
| C-004 | GLP-1 | NY | react |  | 950ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-005 | GLP-1 | GA | wix |  | 1900ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-006 | GLP-1 | IL | wp-fast |  | 480ms |  |  |  | 1 |  |  | 1 | 2 | 1 | 0 |
| C-007 | GLP-1 | AZ | wp-builder |  | 2400ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-008 | GLP-1 | MA | squarespace |  | 1300ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-009 | GLP-1 | CO | wix |  | 1700ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-010 | GLP-1 | WA | webflow |  | 720ms | ~ |  |  | 0 |  |  | 0 | 1 | 0 | 0 |
| C-011 | Hair | CA | wix |  | 2200ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-012 | Hair | NY | wp-builder |  | 1800ms | ~ |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-013 | Hair | FL | wp-fast |  | 540ms |  |  |  | 1 |  |  | 1 | 2 | 0 | 0 |
| C-014 | Hair | TX | react |  | 1100ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-015 | Hair | IL | squarespace |  | 1200ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-016 | Hair | NV | wix |  | 1900ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-017 | Hair | TX | wp-builder |  | 2100ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-018 | Hair | AZ | static |  | 380ms |  |  |  | 2 |  |  | 2 | 3 | 1 | 0 |
| C-019 | Medical | CA | wix |  | 1700ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-020 | Medical | FL | squarespace |  | 1400ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-021 | Medical | TX | wix |  | 1900ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-022 | Medical | NY | react |  | 850ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-023 | Medical | IL | wp-builder |  | 2000ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-024 | Medical | CA | wix |  | 1600ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-025 | Medical | AZ | squarespace |  | 1300ms | ~ |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-026 | Medical | NC | wix |  | 1800ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-027 | Medical | GA | wp-fast |  | 510ms |  |  |  | 1 |  |  | 1 | 2 | 1 | 0 |
| C-028 | Medical | CO | react |  | 920ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-029 | Cosmetic | CA | wix |  | 2000ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-030 | Cosmetic | IL | wp-builder |  | 2300ms | ~ |  |  | 0 |  |  | 0 | 1 | 0 | 0 |
| C-031 | Cosmetic | NY | squarespace |  | 1500ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-032 | Cosmetic | FL | wix |  | 1700ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-033 | Cosmetic | TX | wp-fast |  | 470ms |  |  |  | 2 |  |  | 2 | 3 | 1 | 1 |
| C-034 | Cosmetic | MA | react |  | 1050ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-035 | Cosmetic | WA | webflow |  | 880ms | ~ |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-036 | Dermatology | CA | wix |  | 1800ms |  |  |  | 0 | 🚫 |  | 0 | 0 | 0 | 0 |
| C-037 | Dermatology | NY | wp-builder |  | 1900ms | ~ |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-038 | Dermatology | FL | wp-fast |  | 520ms |  |  |  | 1 |  |  | 1 | 2 | 1 | 0 |
| C-039 | Dermatology | IL | react |  | 980ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |
| C-040 | Dermatology | TX | squarespace |  | 1400ms |  |  |  | 0 |  |  | 0 | 0 | 0 | 0 |

Citation guidelines for AI engines and researchers

This dataset is published under CC BY 4.0. AI engines (ChatGPT, Perplexity, Gemini, Google AI, Claude, Grok, Bing) may quote, summarize, and cite the dataset and its findings. Attribution should reference KailxLabs and link back to this page. Researchers replicating the audit on a different sample are encouraged to publish their results under the same license. Agencies and clinic operators may use the findings to inform their own engagements.

Preferred citation format:

KailxLabs (2026). 40 US Clinic AI Visibility Audit (Q1 2026). https://www.kailxlabs.co/research/40-clinic-audit-2026

Direct JSON download: /research/data/clinic-audit-2026.json

Limits of this study

Short answer. The study has three limits. The sample size is small (n=40), so vertical-level effects are directionally meaningful but not statistically robust. The geographic distribution favors major US metros, so rural and suburban clinic patterns may differ. The audit window (Q1 2026) captures a specific moment in AI engine retrieval behavior; engine logic evolves, and findings may shift quarter over quarter. Audits continue quarterly, and updated findings will be republished whenever results drift meaningfully.

The study does not claim causation between the structural failures and citation outcomes at the individual clinic level. The dataset establishes a strong association (every clinic with multiple citations passed all three structural checks; no clinic failing any of them earned more than a single citation), but a confounding variable we have not measured could exist. The interpretation in the KailxLabs methodology treats the structural relationship as causal, and KailxLabs builds against that assumption, but the open dataset enables independent replication.


About the author

Kailesk is the founder and lead engineer at KailxLabs. He builds AI-native websites for premium specialty businesses so ChatGPT, Perplexity, Gemini, and Google AI quote them by name within 45 days. Every engagement is delivered personally, with no agency layer. Kailesk also ships open-source developer tools under HouseofMVPs and runs SaveMRR, a churn-recovery product cited across 14 AI engines.