Dataset schema
Also: Dataset entity, Schema.org Dataset, open dataset schema
For a specialty business agency publishing primary research, the Dataset entity is the single highest-leverage schema for citation pickup. The 40-clinic AI visibility audit at KailxLabs is declared as a Schema.org Dataset with full distribution metadata, a CC BY 4.0 license, and an explicit variableMeasured array.
The variableMeasured property is the field that matters most. Models traverse this array when asked about specific research dimensions. A dataset that declares "AI citation rate per clinic" and "Curl readability" as variables can be cited for both questions independently.
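As a minimal sketch of what such a declaration might look like, the Python below builds a Dataset object as JSON-LD with one PropertyValue entry per measured variable. The function name build_dataset_jsonld and the description text are assumptions made for illustration; the two variable names come from the paragraph above, and the exact markup KailxLabs publishes may differ.

```python
import json


def build_dataset_jsonld(name, description, variables):
    """Return a Schema.org Dataset as a JSON-LD dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # CC BY 4.0, expressed as the canonical license URL.
        "license": "https://creativecommons.org/licenses/by/4.0/",
        # One PropertyValue per research dimension the dataset measures.
        "variableMeasured": [
            {"@type": "PropertyValue", "name": variable}
            for variable in variables
        ],
    }


dataset = build_dataset_jsonld(
    name="40-clinic AI visibility audit",
    description="Placeholder description; replace with the real summary.",
    variables=["AI citation rate per clinic", "Curl readability"],
)

# Serialize for embedding in a <script type="application/ld+json"> tag.
print(json.dumps(dataset, indent=2))
```

Declaring each variable as its own entry, rather than packing them into one description string, is what lets each dimension be matched separately.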
The distribution property can carry multiple DataDownload entries, each with its own encodingFormat (JSON, CSV, TSV). For maximum citation surface, publish at least JSON. CSV is helpful but not required.
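A distribution array with several DataDownload entries could be assembled along these lines. This is a sketch under stated assumptions: the helper build_distribution, the MEDIA_TYPES table, and the example.com base URL are hypothetical placeholders, not the actual KailxLabs file locations.

```python
import json

# Standard media types for the three formats mentioned above.
MEDIA_TYPES = {
    "json": "application/json",
    "csv": "text/csv",
    "tsv": "text/tab-separated-values",
}


def build_distribution(base_url, formats):
    """Return one DataDownload entry per requested format (hypothetical helper)."""
    return [
        {
            "@type": "DataDownload",
            "encodingFormat": MEDIA_TYPES[fmt],
            # Assumes each file lives at <base_url>.<extension>.
            "contentUrl": f"{base_url}.{fmt}",
        }
        for fmt in formats
    ]


# JSON first, since it is the one format the entry treats as required.
distribution = build_distribution(
    "https://example.com/data/clinic-audit",  # placeholder URL
    ["json", "csv"],
)

print(json.dumps({"distribution": distribution}, indent=2))
```

The resulting list would be attached to the Dataset object under its distribution key.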
Cited facts
- Schema.org Dataset is the primary structured data type for research publications.
- Datasets published under a CC BY 4.0 license get cited across AI engines and traditional research databases simultaneously.
- Google Dataset Search indexes datasets with Schema.org Dataset markup automatically.