Web Scraping
CorsProxy includes a built‑in content extraction feature so you can pull structured data from pages directly in the browser.
Business plan required:
extract requires a Business plan and a valid API key.
Extract content from HTML
Enable extraction by adding extract=1:
https://corsproxy.io/?url=https://example.com&extract=1
Parameters
| Parameter | Description | Example |
|---|---|---|
| extract | Enable extraction (1) | extract=1 |
| selector | CSS selector for main content | selector=article |
| titleSelector | CSS selector for title | titleSelector=h1 |
| bylineSelector | CSS selector for author/byline | bylineSelector=.byline |
| strip | Comma‑separated selectors to remove | strip=.ads,.promo |
| format | json (default) or text | format=text |
| maxChars | Max characters in output | maxChars=5000 |
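The parameters above can be assembled programmatically rather than by hand. The following is a minimal sketch, not an official client: the helper name build_extract_url is hypothetical, and it assumes the proxy decodes percent-encoded query values in the standard way. How the API key is supplied is not covered here, so the sketch leaves it out.

```python
from urllib.parse import quote, urlencode

PROXY_BASE = "https://corsproxy.io/"

# Parameter names copied from the table above.
EXTRACT_PARAMS = (
    "selector", "titleSelector", "bylineSelector", "strip", "format", "maxChars",
)


def build_extract_url(target_url, **options):
    """Return a proxy URL for target_url with extract=1 and the given options.

    Hypothetical helper for illustration only.
    """
    unknown = set(options) - set(EXTRACT_PARAMS)
    if unknown:
        raise ValueError(f"unsupported extraction parameters: {sorted(unknown)}")

    # strip takes a comma-separated list, so accept a list or tuple as well.
    strip = options.get("strip")
    if strip is not None and not isinstance(strip, str):
        options["strip"] = ",".join(strip)

    query = {"url": target_url, "extract": 1}
    query.update({k: v for k, v in options.items() if v is not None})

    # Keep ":", "/" and "," readable as in the documented examples;
    # everything else that is not URL-safe is percent-encoded.
    return PROXY_BASE + "?" + urlencode(query, quote_via=quote, safe=":/,")
```

For example, build_extract_url("https://example.com", strip=[".ads", ".promo"], maxChars=5000) yields a URL ending in &extract=1&strip=.ads,.promo&maxChars=5000. Rejecting unknown parameter names is a design choice of this sketch, meant to catch typos such as titleselector early.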
Example (structured JSON)
https://corsproxy.io/?url=https://news.ycombinator.com&extract=1&selector=.titleline%20%3E%20a
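The selector in the URL above is percent-encoded because it contains spaces and a > character. As a small illustration using only Python's standard library, the encoded value can be decoded and re-encoded like this:

```python
from urllib.parse import quote, unquote

# The selector value exactly as it appears in the example URL.
encoded = ".titleline%20%3E%20a"

# Decoding shows the CSS selector the proxy is being asked to apply.
selector = unquote(encoded)

# Encoding the plain selector again reproduces the URL form.
round_trip = quote(selector)

print(selector)
print(round_trip)
```

Here %20 stands for a space and %3E for >, so the selector targets links that are direct children of elements with the class titleline.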
Example (plain text)
https://corsproxy.io/?url=https://example.com&extract=1&format=text
The response Content-Type is application/json;charset=UTF-8, or text/plain;charset=UTF-8 when format=text.
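Client code can branch on the content type to decide how to decode the body. The sketch below is one possible way to do that; decode_extract_response is a hypothetical helper, and it makes no assumption about which fields the JSON output contains, since those are not listed here.

```python
import json


def decode_extract_response(content_type, body):
    """Decode raw response bytes according to the Content-Type header value.

    Returns parsed JSON for application/json and a str for text/plain.
    Hypothetical helper for illustration only.
    """
    media_type, _, params = content_type.partition(";")
    media_type = media_type.strip().lower()

    # Fall back to UTF-8, the charset documented for both formats.
    charset = "utf-8"
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            charset = value.strip().strip('"')

    text = body.decode(charset)
    if media_type == "application/json":
        return json.loads(text)
    if media_type == "text/plain":
        return text
    raise ValueError(f"unexpected content type: {content_type!r}")
```

Raising on any other content type is a choice made for this sketch, so that an error page or an unextracted HTML response is not silently treated as extracted content.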
Related: File Conversion
CSV/XML/RSS conversion is documented separately in File Conversion.