Apache 2.0 · Self-hostable · No concurrency cap

Scrape the web without the $190/mo subscription

A flat $29/mo for 50k scrapes with no concurrency cap — or self-host the whole thing for free under Apache 2.0. No AGPL, no closed-source fire-engine. Pay-per-call coming soon.

Firecrawl Growth: $190 / mo · 150k scrapes, 50 concurrent
Purify Pro: $29 / mo · 50k scrapes, no concurrency cap
Purify Pay-as-you-go: $0.0008 per scrape · in development

Firecrawl pricing from firecrawl.dev/pricing (Apr 2026). Purify Pay-as-you-go is in development and currently waitlist-only.
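To compare the plans on a per-scrape basis, each listed price can be divided by its included volume. The sketch below uses only the figures quoted above; the break-even calculation assumes the $0.0008 pay-as-you-go rate ships as listed.

```python
# Effective per-scrape cost, using only the prices quoted on this page.
PLANS = {
    "Firecrawl Growth": (190.00, 150_000),  # (USD per month, included scrapes)
    "Purify Pro": (29.00, 50_000),
}
PAYG_RATE = 0.0008  # USD per scrape; listed as "in development"


def cost_per_scrape(monthly_usd: float, included_scrapes: int) -> float:
    """Price per scrape if the full monthly allowance is used."""
    return monthly_usd / included_scrapes


def payg_break_even(monthly_usd: float, rate: float = PAYG_RATE) -> int:
    """Monthly volume (rounded) at which pay-as-you-go costs the same as the flat plan."""
    return round(monthly_usd / rate)


if __name__ == "__main__":
    for name, (price, scrapes) in PLANS.items():
        print(f"{name}: ${cost_per_scrape(price, scrapes):.5f} per scrape")
    print(f"Pay-as-you-go matches Purify Pro at {payg_break_even(29.00):,} scrapes/month")
```

At full utilization, Purify Pro works out to about $0.00058 per scrape versus roughly $0.00127 on Firecrawl Growth. Below about 36,250 scrapes a month, the pay-as-you-go rate would be the cheaper of the two Purify options.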

Works with your AI stack

Claude · GPT-4 · Gemini · LLaMA · Mistral · LangChain · CrewAI · AutoGPT · OpenClaw · Perplexity · Cursor · Vercel AI SDK · Dify

Real-world benchmarks

And your LLM bill goes down too

Purify strips navigation, ads, and boilerplate before the HTML hits your model. Measured on real sites with tiktoken.

| Website | Raw tokens | After Purify | Savings |
| --- | --- | --- | --- |
| Xiaohongshu (小红书) | 158,742 | 353 | -99.8% |
| sspai.com (少数派) | 32,895 | 187 | -99.4% |
| GitHub Repository | 99,181 | 1,370 | -98.6% |
| New York Times | 103,744 | 2,130 | -97.9% |
| Anthropic API Docs | 129,066 | 4,837 | -96.3% |
| BBC News Homepage | 97,540 | 6,969 | -92.9% |
| arXiv Paper | 26,684 | 3,129 | -88.3% |
| Wikipedia (LLM) | 245,276 | 76,325 | -68.9% |

Measured using tiktoken (GPT-4 tokenizer). Raw = unmodified HTML. After Purify = extracted Markdown.
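The savings column is one minus the ratio of the two token counts. A minimal sketch of that calculation follows. The `count_tokens` helper assumes the third-party `tiktoken` package is installed and only illustrates how such a measurement could be reproduced; it is not the benchmark script itself.

```python
def savings_percent(raw_tokens: int, purified_tokens: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    return round((1 - purified_tokens / raw_tokens) * 100, 1)


def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Count tokens with tiktoken (imported lazily so the rest runs without it)."""
    import tiktoken  # third-party: pip install tiktoken

    encoding = tiktoken.encoding_for_model(model)
    # disallowed_special=() treats strings like "<|endoftext|>" as plain text
    return len(encoding.encode(text, disallowed_special=()))


if __name__ == "__main__":
    # Token counts taken from the benchmark figures above.
    print(savings_percent(158_742, 353))     # Xiaohongshu row
    print(savings_percent(245_276, 76_325))  # Wikipedia (LLM) row
```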

Quick start

Up and running in 60 seconds

Three steps from zero to clean Markdown.

01

Install Purify

Download a single binary. No Node.js, no Python, no Docker. Just one file.

    # macOS / Linux
    curl -sSL https://purify.verifly.pro/install.sh | sh

    # Or download directly from GitHub
    wget https://github.com/Easonliuliang/purify/releases/latest/download/purify

02

Send a URL

POST any URL to the API. Works with dynamic JavaScript-heavy sites too.

    curl -X POST https://purify.verifly.pro/api/v1/scrape \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://github.com/Easonliuliang/purify"}'

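The same request can be made from Python using only the standard library. This sketch mirrors the curl call above; the `build_scrape_request` and `scrape` helper names are illustrative, and the timeout value is an assumption.

```python
import json
import urllib.request

API_URL = "https://purify.verifly.pro/api/v1/scrape"  # endpoint from the curl example


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def scrape(url: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_scrape_request(url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a real network call, so it is left commented out):
# result = scrape("https://github.com/Easonliuliang/purify", "YOUR_API_KEY")
# print(result["markdown"])
```

Keeping request construction separate from sending makes the headers and body easy to inspect or unit-test without touching the network.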
03

Get clean Markdown

Receive structured, token-efficient Markdown. Ready for your LLM or AI agent.

    {
      "success": true,
      "markdown": "# Purify\n\nTurn any web page into clean Markdown...\n",
      "tokens_saved": "98.7%",
      "processing_time_ms": 420
    }

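A client will usually want typed access to these fields. The sketch below parses the sample response shown above. The `ScrapeResult` class is hypothetical, and treating `tokens_saved` as a percentage string (and `"success": false` as the failure signal) is inferred from the sample rather than from a published schema.

```python
import json
from dataclasses import dataclass


@dataclass
class ScrapeResult:
    success: bool
    markdown: str
    tokens_saved_percent: float
    processing_time_ms: int


def parse_scrape_response(payload: str) -> ScrapeResult:
    """Parse a JSON response body shaped like the sample above."""
    data = json.loads(payload)
    if not data.get("success"):
        raise RuntimeError("scrape failed")  # assumed failure shape: "success" is false
    return ScrapeResult(
        success=True,
        markdown=data["markdown"],
        # "98.7%" -> 98.7
        tokens_saved_percent=float(data["tokens_saved"].rstrip("%")),
        processing_time_ms=int(data["processing_time_ms"]),
    )


# The sample response from this page, as a JSON string.
SAMPLE = """{
  "success": true,
  "markdown": "# Purify\\n\\nTurn any web page into clean Markdown...\\n",
  "tokens_saved": "98.7%",
  "processing_time_ms": 420
}"""

if __name__ == "__main__":
    print(parse_scrape_response(SAMPLE))
```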
MCP Native

Connect to any AI agent in seconds

Purify ships a built-in MCP server. Drop one config file and your AI assistant can scrape the web.

MCP Config

Claude / Cursor

Add Purify as an MCP server in Claude Desktop or Cursor. Your AI assistant can scrape any web page on demand.

    // ~/Library/Application Support/Claude/claude_desktop_config.json
    {
      "mcpServers": {
        "purify": {
          "command": "/path/to/purify-mcp",
          "env": {
            "PURIFY_API_URL": "https://purify.verifly.pro",
            "PURIFY_API_KEY": "YOUR_API_KEY"
          }
        }
      }
    }

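If the config file already lists other MCP servers, the Purify entry needs to be merged in rather than pasted over them. Here is a minimal sketch of that merge, assuming the file is plain JSON at the macOS path shown above. The `add_purify_server` and `update_config_file` helpers are hypothetical and not part of Purify.

```python
import json
from pathlib import Path

# macOS path from the example above; other platforms store the file elsewhere.
CONFIG_PATH = Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"


def add_purify_server(config: dict, command: str, api_key: str,
                      api_url: str = "https://purify.verifly.pro") -> dict:
    """Return a copy of config with a "purify" MCP server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers["purify"] = {
        "command": command,
        "env": {"PURIFY_API_URL": api_url, "PURIFY_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, command: str, api_key: str) -> None:
    """Read the config (if present), merge the Purify entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_purify_server(existing, command, api_key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Note that the `//` line in the example above only labels the file location. JSON does not allow comments, so it should not be copied into the file itself.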
Why Purify

Built for agents that actually run in production

No seat pricing. No concurrency wall. No closed-source anti-bot module. Everything ships in one Go binary.

No concurrency wall

Flat $29/mo for 50k scrapes with unlimited concurrency, unlike Firecrawl's cheapest tier, which caps at 5 concurrent requests. Pay-per-call ($0.0008 per scrape) coming soon.

Full-featured self-host

Apache 2.0 — the extraction engine, browser fallback, and MCP server all ship in the binary. No closed-source fire-engine. No AGPL.

Single Go binary

One file, zero runtime dependencies. No Redis, no Postgres, no Playwright. Runs on a $5 VPS.

Built-in MCP Server

Native Model Context Protocol support. Connect Claude, Cursor, and other agents with one config file.

HTTP-first with automatic browser fallback

~100ms for static pages. Chromium only spins up when a page actually needs JavaScript.

Up to 99.8% token savings, verified

Strip nav, ads, and scripts before the HTML hits your LLM. Measured on real sites with tiktoken.
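To illustrate why removing boilerplate shrinks the token count so sharply, here is a simplified sketch that drops the contents of non-content elements and keeps the remaining text. It is not Purify's extraction engine; the tag list and the `strip_boilerplate` name are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Elements whose contents are treated as boilerplate in this sketch.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class _TextExtractor(HTMLParser):
    """Collects text that appears outside any skipped element."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def strip_boilerplate(html: str) -> str:
    """Return visible text outside the skipped elements, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

On a typical page most of the markup sits in scripts, styles, and navigation, so even this crude filter discards the bulk of the input before any Markdown conversion happens.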

Comparison

Purify vs Firecrawl vs Jina Reader

The specific places the SaaS incumbents fall short — and how Purify fills each gap.

| Feature | Purify | Firecrawl | Jina Reader |
| --- | --- | --- | --- |
| Pay-per-call pricing | Coming soon ($0.0008) | Subscription only | Subscription only |
| Concurrency cap on cheap tier | None | 5 concurrent requests ($19 tier) | Undisclosed |
| Self-host the whole product | Yes | Partial (fire-engine closed-source) | |
| License | Apache 2.0 | AGPL / Proprietary | Apache 2.0 |
| Bring your own proxy | Coming soon | | |
| Deployment | Single Go binary | Cloud or 8+ services | Cloud only |
| Built-in MCP Server | Yes | | |
| Zero runtime dependencies | Yes | | |
| Token savings (verified) | Up to 99.8% | ~70–80% | ~70–80% |

Frequently asked questions

What is Purify?

Purify is a web scraping API for AI agents. It turns any URL into clean Markdown, or into a structured object using your own LLM (BYO LLM). Everything ships in a single Go binary that you can run yourself under Apache 2.0. Today we offer a flat $29/mo plan or free self-hosting; pay-per-call is on the near-term roadmap.

How is Purify different from Firecrawl?

Similar category, different tradeoffs. Firecrawl is a TypeScript stack of 8+ services, licensed under AGPL, subscription-only, and it keeps its anti-bot module (fire-engine) closed-source. Purify is a single Go binary under Apache 2.0, and the whole product is self-hostable. Today we offer a flat $29/mo plan with no concurrency cap; pay-per-call is in development.

Why pay-per-call?

Agent workloads are spiky. Subscription tiers either waste money in quiet weeks or throttle you the moment traffic spikes (Firecrawl's cheapest tier caps at 5 concurrent requests). Pay-per-call removes both failure modes, and it's the next thing we're shipping. Join the waitlist on the pricing page.

Can I self-host Purify?

Yes, completely. No Docker, no Redis, no Postgres, no Playwright: just the binary. Self-hosted instances have no usage limits and include every feature of the managed cloud.

What is MCP?

Model Context Protocol (MCP) is an open standard that lets AI agents (Claude, Cursor, OpenClaw) call external tools. Purify's built-in MCP server lets your agent scrape any URL with one config file.

Does Purify handle JavaScript-heavy pages?

Yes. Purify tries plain HTTP first for speed (~100ms) and automatically falls back to headless Chromium when the content requires JavaScript.
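The HTTP-first strategy can be sketched as a fetch followed by a check on how much readable text the static HTML contains. The heuristic below (a minimum visible-text length) and the `render_with_browser` callback are assumptions for illustration; this page does not describe how Purify actually decides when to fall back.

```python
import re
import urllib.request
from typing import Callable

MIN_VISIBLE_CHARS = 200  # assumed threshold, not a Purify setting


def visible_text_length(html: str) -> int:
    """Rough count of text characters left after removing scripts, styles, and tags."""
    without_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_code)
    return len(" ".join(text.split()))


def needs_browser(html: str, min_chars: int = MIN_VISIBLE_CHARS) -> bool:
    """Heuristic: a nearly empty static page probably renders its content with JavaScript."""
    return visible_text_length(html) < min_chars


def fetch_html(url: str, render_with_browser: Callable[[str], str],
               timeout: float = 10.0) -> str:
    """Try plain HTTP first; call the caller-supplied browser renderer only if needed."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return render_with_browser(url) if needs_browser(html) else html
```

The design point is that the expensive path (starting a browser) is only taken when the cheap path yields too little content, which is what keeps static pages fast.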

Ready to purify the web?

Join developers building smarter AI agents — start free, no credit card needed.