
Convert any website into clean, structured data.

Convert one page or a full sitemap to structured data. Schedule recurring structured checks. Export results as CSV, Markdown, or JSON. Start in the UI, then automate the same workflow in the API.

Extract one page as Markdown or structured data.

Run full sitemap workflows in Markdown or structured mode.

Schedule recurring structured checks for saved page and sitemap runs.

Export results as CSV, Markdown, or JSON and reuse them in the API.

Change Monitor (Headline + Key Facts)

Shared template

Create a stable snapshot for drift/change monitoring.

Best for

Monitoring important pages where headline and facts change over time.

Template fields

3 fields

Title (string)

Current page headline

Summary (string)

Short summary of the page state

Key facts (array)

Important factual bullets to compare over time

https://example.com/pricing
Extraction ready

Sample extraction result

Structured output you can review, compare, and export

json

{
  "title": "Pricing page emphasizes annual billing",
  "summary": "The page now leads with annual pricing, adds a migration CTA, and keeps the primary proof points above the fold.",
  "key_facts": [
    "Annual plan appears before monthly billing",
    "Primary CTA changed from demo to free trial",
    "Migration proof section added below pricing tiers"
  ]
}

3 fields ready for CSV, JSON, and diff review
Ready to export (412 ms)
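As a rough illustration of what "3 fields ready" could mean in practice, the sketch below checks a result against the three template fields. The field names and types come from the template above; the `check_result` helper is hypothetical and not part of Purepage.

```python
import json

# Field names and types taken from the template above.
# The checker itself is a hypothetical helper, not a Purepage API.
TEMPLATE = {"title": str, "summary": str, "key_facts": list}

def check_result(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the result matches the template."""
    data = json.loads(raw)
    problems = []
    for name, expected in TEMPLATE.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    return problems

SAMPLE = """
{
  "title": "Pricing page emphasizes annual billing",
  "summary": "The page now leads with annual pricing, adds a migration CTA, and keeps the primary proof points above the fold.",
  "key_facts": [
    "Annual plan appears before monthly billing",
    "Primary CTA changed from demo to free trial",
    "Migration proof section added below pricing tiers"
  ]
}
"""

print(check_result(SAMPLE))
```

A result that is missing a field, or that returns a string where an array is expected, shows up as an entry in the returned list instead of silently passing into an export.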

What you can do with Purepage today.

Start with a page, expand to a sitemap, then review, export, or schedule the workflows you want to keep running.

Start with one page.

Capture a single URL as Markdown or define a structured schema when you need stable fields. The UI and API use the same extraction contract, so you can start simple without painting yourself into a corner.

Use `/markdown` for clean page-to-Markdown output.
Use `/url-extract` when you need named fields, validation, and saved schemas.
Keep required fields explicit so repeated runs stay consistent.

Structured page setup

headline (required)
price (currency)
faqItems (array)
publishedAt (date)

What stays stable

Required fields keep repeat runs strict.
Helper text makes extraction intent obvious.
Saved schemas unlock recurring structured checks.
Field names stay durable for exports and APIs.
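To make the idea of explicit required fields concrete, here is a minimal sketch of a schema and a strictness check. The four field names come from the setup above. The `Field` class, the helper text, the `string` type for `headline`, and the `missing_required` function are all assumptions for illustration, not Purepage's actual schema format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    name: str
    kind: str               # "currency", "array", "date" are from the page; "string" is assumed
    required: bool = False
    helper: str = ""        # helper text that states the extraction intent (illustrative wording)

# Hypothetical in-code mirror of the structured page setup shown above.
SCHEMA = [
    Field("headline", "string", required=True, helper="Main headline of the page"),
    Field("price", "currency", helper="Price shown for the primary plan"),
    Field("faqItems", "array", helper="Questions listed in the FAQ section"),
    Field("publishedAt", "date", helper="Publication date of the page"),
]

def missing_required(schema: list[Field], result: dict) -> list[str]:
    """Names of required fields that are absent or empty in a run result."""
    return [f.name for f in schema if f.required and result.get(f.name) in (None, "")]

print(missing_required(SCHEMA, {"price": "$149"}))
```

Treating an empty required field as a failure, instead of accepting a partial record, is one way repeated runs can stay comparable over time.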

Expand to the sitemap.

When one page works, move to the sitemap workflow. Queue a preview or a broader crawl, choose Markdown or structured output, and keep progress visible while the batch runs.

Analyze the sitemap first, then queue Markdown or structured extraction.
Track queued, running, and completed jobs from the queue and activity views.
Keep plan limits, URL filters, and workspace guardrails in place as coverage expands.

Sitemap run

Queued: 124

Running: 08

Completed: 116

/pricing
/blog/launch
/docs/api
/customers

Batch controls

Choose Markdown or structured output before queueing the sitemap.
Filter included URLs and exclude archive content.
Limit each batch to the active workspace and plan.
Rerun structured sitemap batches on a recurring schedule.
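The batch controls above amount to "read the sitemap, filter the URLs, cap the batch". The sketch below shows that logic locally with the standard library. The glob-style patterns, the `limit` parameter, and the `select_urls` function are assumptions for illustration; Purepage's own filter syntax may differ.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def select_urls(sitemap_xml: str, include: list[str], exclude: list[str], limit: int) -> list[str]:
    """Pick sitemap URLs whose path matches an include pattern and no exclude pattern."""
    root = ET.fromstring(sitemap_xml.strip())
    selected = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if len(selected) >= limit:   # hypothetical per-batch cap
            break
        url = (loc.text or "").strip()
        path = urlparse(url).path or "/"
        if not any(fnmatchcase(path, pattern) for pattern in include):
            continue
        if any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        selected.append(url)
    return selected

SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/archive/2019/old-post</loc></url>
  <url><loc>https://example.com/blog/launch</loc></url>
  <url><loc>https://example.com/docs/api</loc></url>
  <url><loc>https://example.com/customers</loc></url>
</urlset>
"""

print(select_urls(SAMPLE_SITEMAP, include=["*"], exclude=["/archive/*"], limit=3))
```

Running a filter like this before queueing gives a preview of which pages a batch would cover and how many fall outside the cap.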

Review, export, and rerun.

The useful part starts after the run finishes. Compare outputs, export what you need, and save recurring structured checks when a page or sitemap needs repeat monitoring.

Compare past structured runs to see what changed.
Export single runs and sitemap results as CSV, Markdown, or JSON.
Create recurring schedules for structured page checks and structured sitemap batches.

Change review

+ price: "$149"

+ ctaLabel: "Start trial"

title unchanged

- price: "$129"

- ctaLabel: "Try now"

publishedAt unchanged
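A field-level comparison like the change review above can be sketched in a few lines. This is an illustrative local diff, not Purepage's diffing implementation. The `price` and `ctaLabel` values match the example; the `title` and `publishedAt` values are placeholders invented for the sketch.

```python
import json

_MISSING = object()  # marks a field that is absent from one of the runs

def diff_runs(previous: dict, current: dict) -> list[str]:
    """Describe field-level changes between two structured runs."""
    lines = []
    for key in sorted(previous.keys() | current.keys()):
        old = previous.get(key, _MISSING)
        new = current.get(key, _MISSING)
        if old == new:
            lines.append(f"  {key} unchanged")
            continue
        if new is not _MISSING:
            lines.append(f"+ {key}: {json.dumps(new)}")
        if old is not _MISSING:
            lines.append(f"- {key}: {json.dumps(old)}")
    return lines

# title and publishedAt are placeholder values; price and ctaLabel follow the example above.
previous_run = {"title": "Pricing", "price": "$129", "ctaLabel": "Try now", "publishedAt": "2025-01-10"}
current_run = {"title": "Pricing", "price": "$149", "ctaLabel": "Start trial", "publishedAt": "2025-01-10"}

for line in diff_runs(previous_run, current_run):
    print(line)
```

Because the comparison is keyed on field names, it only works when those names stay the same between runs, which is the reason to keep schemas stable.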

Export formats

purepage-run-7ab31c.csv

purepage-run-7ab31c.json

sitemap-markdown-1739210823.md

sitemap-crawl-results-batch_204.json

Use exports for handoffs now, then move the same workflow into the API later.
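For a handoff, an exported JSON result often needs to become a flat CSV. The sketch below shows one way to do that locally. Joining array fields with "; " is an assumption made for illustration and does not describe how Purepage's own CSV export represents arrays.

```python
import csv
import io

def to_csv(rows: list[dict]) -> str:
    """Flatten structured results into CSV text, using the first row's keys as columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Assumed convention: array fields are joined into a single cell.
        writer.writerow({k: "; ".join(v) if isinstance(v, list) else v for k, v in row.items()})
    return buffer.getvalue()

rows = [{"title": "Pricing", "key_facts": ["Annual plan first", "Free trial CTA"]}]
print(to_csv(rows))
```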

Start in the workspace. Keep going in code.

The UI is not a demo layer sitting beside the API. It is the same feature set: page runs, sitemap runs, exports, and recurring structured schedules exposed for both operators and developers.

curl
Browse API Docs

curl -X POST https://api.purepage.io/api/crawl/url \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/pricing",
    "output": "structured",
    "schemaId": "schema_growth_pricing"
  }'
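The same request can be sketched in Python with only the standard library. The URL, headers, and payload keys are copied from the curl example above; the `build_request` helper is hypothetical, and the shape of the response is not shown on this page, so the sketch stops at building the request.

```python
import json
import urllib.request

API_URL = "https://api.purepage.io/api/crawl/url"  # copied from the curl example

def build_request(api_key: str, url: str, schema_id: str) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    payload = {"url": url, "output": "structured", "schemaId": schema_id}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("<api-key>", "https://example.com/pricing", "schema_growth_pricing")
print(req.get_method(), req.full_url)

# To send it with a real key (response format is an unknown here):
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```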

Start in the UI, automate in the API

Run the same page and sitemap workflows over API keys once the team is ready to automate them.

Concrete endpoints, not vague promises

Single URL crawl, sitemap crawl, recurring schedules, diffing, and exports already exist in the shipped product.

Built for teams replacing copy-paste audits and brittle scripts.

Purepage is for teams that need the work to survive handoff. Run the extraction, keep the output legible, compare changes later, and export something another team can actually use.

Start with one URL. Scale when it works.

Capture one page as Markdown or structured data, expand to a sitemap when the workflow proves out, then export or schedule the structured checks you want to keep running.