Cloudflare protects over 20% of all websites on the internet. If you're building a web scraper in 2026, you will inevitably run into Cloudflare protection — whether it's a Turnstile widget on a login form, a JS Challenge interstitial, or full Bot Management at the WAF level. This guide breaks down each protection type, explains the most common bypass approaches, and shows you the fastest way to get through them with working code.
## Types of Cloudflare protection
Cloudflare offers three main layers of protection that you'll encounter when scraping. Understanding which one you're facing is the first step to bypassing it.
### How to identify the protection type
Before choosing a bypass method, figure out what you're dealing with:
- **Turnstile:** Look for a `<div>` with class `cf-turnstile` or a script loading `challenges.cloudflare.com/turnstile`. The HTML source will contain a `data-sitekey` attribute.
- **JS Challenge:** You'll see a full-page "Checking your browser..." screen. The response has a 403 status code and sets a `__cf_bm` cookie. After passing, you get `cf_clearance`.
- **Bot Management / WAF:** You get 403 Forbidden immediately with no challenge page, or a Cloudflare "Access denied" error page. Blocking is often based on IP reputation or TLS fingerprinting.
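The checks above can be sketched as a small classifier. This is a heuristic, not an official API: the `cf-mitigated` response header is one signal some Cloudflare configurations set on challenge responses, and the marker strings are drawn from the list above, so treat the exact conditions as assumptions to tune per target.

```python
def identify_protection(status: int, headers: dict, body: str) -> str:
    """Heuristically classify the Cloudflare protection on a response."""
    body = body.lower()
    # Turnstile: widget div or script loaded from challenges.cloudflare.com
    if "cf-turnstile" in body or "challenges.cloudflare.com/turnstile" in body:
        return "turnstile"
    # JS Challenge: interstitial page; some configs also set cf-mitigated
    if headers.get("cf-mitigated") == "challenge" or "checking your browser" in body:
        return "js_challenge"
    # Hard block with no challenge page to solve
    if status == 403:
        return "waf_block"
    return "none"
```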
## Common bypass approaches compared
There are three mainstream approaches to bypassing Cloudflare. Each has different trade-offs for speed, reliability, and cost.
| Approach | Speed | Reliability | Detection Risk | Cost per 1K solves |
|---|---|---|---|---|
| Headless Browser | 5-30s | 60-80% | High | $2-5 |
| Stealth Plugins | 5-20s | 70-85% | Medium | $1-3 |
| Solver API (recommended) | 0.25-5s | 95%+ | None | $0.80-1.00 |
### When to use what
Different protection types require different approaches. Here's a quick decision guide:
**Turnstile:** Send the `siteKey` and page URL to a solver API. You get back a valid token that you submit with your form data. No browser needed — just plain HTTP requests. NSLSolver solves Turnstile in ~250ms.
**JS Challenge:** The solver runs the JavaScript challenge and returns `cf_clearance` cookies and the matching User-Agent string. Use both in your subsequent requests with the same IP/proxy. NSLSolver returns these in 2-5 seconds.
**Bot Management / WAF:** WAF rules inspect your IP reputation and TLS fingerprint. Use a residential proxy to pass IP checks, and a solver API to handle any challenge pages. Always match the User-Agent from the solve response.
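The decision guide above reduces to a small lookup. A sketch only: the `"type"` values mirror the solver request bodies used later in this guide, while the protection-type names and the `needs_proxy` flag are this sketch's own conventions, not part of any API.

```python
def choose_strategy(protection: str) -> dict:
    """Map a detected protection type to a solve strategy."""
    strategies = {
        "turnstile": {"type": "turnstile", "needs_proxy": False},     # token only, plain HTTP
        "js_challenge": {"type": "challenge", "needs_proxy": False},  # cookies + User-Agent
        "waf_block": {"type": "challenge", "needs_proxy": True},      # residential proxy required
    }
    return strategies.get(protection, {"type": None, "needs_proxy": False})
```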
## Solving Turnstile with NSLSolver
For Turnstile-protected forms, you only need the siteKey and the page URL. The API returns a token you submit with the form:
```python
import requests

API_KEY = "nsl_YOUR_API_KEY"

# 1. Solve the Turnstile widget
solve = requests.post(
    "https://api.nslsolver.com/solve",
    headers={"X-API-Key": API_KEY},
    json={
        "type": "turnstile",
        "siteKey": "0x4XXXXXXXXXXXXXXXXX",
        "url": "https://target-site.com/login"
    }
)
token = solve.json()["token"]

# 2. Submit the form with the solved token
result = requests.post(
    "https://target-site.com/login",
    data={
        "username": "user@example.com",
        "password": "your_password",
        "cf-turnstile-response": token
    }
)
print(result.status_code)  # 200
```

## Solving CF Challenge with NSLSolver
For JS Challenge pages, the API runs the browser check and returns the clearance cookies and User-Agent. You must use these together on the same IP:
```python
import requests

API_KEY = "nsl_YOUR_API_KEY"

# 1. Solve the CF Challenge — returns cookies + user agent
solve = requests.post(
    "https://api.nslsolver.com/solve",
    headers={"X-API-Key": API_KEY},
    json={
        "type": "challenge",
        "url": "https://target-site.com/protected-page"
    }
)
data = solve.json()

# 2. Use the returned cookies and user agent
session = requests.Session()
session.headers["User-Agent"] = data["userAgent"]
for cookie in data["cookies"]:
    session.cookies.set(cookie["name"], cookie["value"])

# 3. Access the protected page
page = session.get("https://target-site.com/protected-page")
print(page.status_code)  # 200
print(page.text[:500])
```

## Handling Bot Management / WAF
When a site uses advanced Bot Management, you need to combine a residential proxy with the solver API. Pass your proxy to the solve request so the cookies are bound to that IP:
```python
import requests

API_KEY = "nsl_YOUR_API_KEY"
PROXY = "http://user:pass@proxy.example.com:8080"

# 1. Solve the challenge through a residential proxy
solve = requests.post(
    "https://api.nslsolver.com/solve",
    headers={"X-API-Key": API_KEY},
    json={
        "type": "challenge",
        "url": "https://target-site.com/data",
        "proxy": PROXY
    }
)
data = solve.json()

# 2. Use the same proxy + cookies for subsequent requests
session = requests.Session()
session.proxies = {"http": PROXY, "https": PROXY}
session.headers["User-Agent"] = data["userAgent"]
for cookie in data["cookies"]:
    session.cookies.set(cookie["name"], cookie["value"])

page = session.get("https://target-site.com/data")
print(page.json())
```

## Performance: headless browser vs solver API
Running your own headless browser setup might seem like the DIY approach, but the numbers tell a different story:
| Metric | Headless Browser | NSLSolver API |
|---|---|---|
| Turnstile solve time | 5-15s | ~250ms |
| Challenge solve time | 10-30s | 2-5s |
| Success rate | 60-80% | 95%+ |
| Monthly cost (10K solves) | $50-200 | $8-10 |
| Maintenance | High — constant patching | None |
*Based on internal benchmarks as of April 2026. Headless browser costs include server, proxy, and browser maintenance.*
## Cost analysis: build vs buy
Let's break down the real cost of running your own headless browser scraping infrastructure versus using a solver API:
**Running your own headless browsers:**
- VPS/cloud server with 4+ GB RAM
- Residential proxy bandwidth
- Browser patching & anti-detect updates
- DevOps time for maintenance

**Using a solver API:**
- Pay only per solve — no fixed costs
- No servers to maintain
- No proxy management needed (for Turnstile)
- 100 free solves to start
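As a sanity check on the figures above, the arithmetic is simple. This sketch uses the low-end numbers from the performance table (10K solves/month, $0.80 per 1K solves, $50/month as the DIY fixed-cost floor before any DevOps time), which are assumptions from this guide's own benchmarks rather than universal pricing.

```python
solves_per_month = 10_000
diy_fixed_monthly = 50.0   # low-end server + proxy + maintenance floor, $/month
api_price_per_1k = 0.80    # low-end solver API price, $/1K solves

api_monthly = solves_per_month / 1000 * api_price_per_1k
print(f"DIY floor: ${diy_fixed_monthly:.2f}/mo  API: ${api_monthly:.2f}/mo")
```

At this volume the API lands at $8/month against a $50/month DIY floor, consistent with the table above; the gap only widens once maintenance hours are priced in.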
## Best practices for Cloudflare scraping
Regardless of which method you use, follow these practices to maximize your success rate and avoid getting blocked:
- **Rotate proxies:** Don't send all requests through the same IP. Use a pool of residential proxies and rotate per session or per request batch.
- **Match User-Agents:** Always use the exact User-Agent string returned by the solver. Cloudflare ties `cf_clearance` cookies to the User-Agent that solved the challenge.
- **Handle rate limits:** Respect 429 responses. Implement exponential backoff (1s, 2s, 4s, 8s). Don't hammer the target site — spread requests over time.
- **Add retry logic:** Transient failures happen. Retry failed solves 2-3 times before giving up. The production example below shows this pattern.
- **Reuse sessions:** Once you have valid cookies, reuse them for multiple requests. `cf_clearance` cookies typically last 15-30 minutes. Don't re-solve for every request.
- **Mind your TLS fingerprint:** If using your own HTTP client, ensure its TLS fingerprint matches a real browser. Libraries like `curl_cffi` (Python) or `got-scraping` (Node.js) help with this.
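The session-reuse advice above can be wrapped in a small per-domain cache. A minimal sketch, assuming a conservative 15-minute TTL (the low end of the typical `cf_clearance` lifetime) so entries expire before the cookie does; the class and method names are this sketch's own:

```python
import time

class ClearanceCache:
    """Per-domain cache for solved cf_clearance cookies + User-Agent."""

    def __init__(self, ttl_seconds: float = 15 * 60):
        self.ttl = ttl_seconds
        self._store = {}  # domain -> (expires_at, cookies, user_agent)

    def put(self, domain: str, cookies: list, user_agent: str) -> None:
        self._store[domain] = (time.monotonic() + self.ttl, cookies, user_agent)

    def get(self, domain: str):
        """Return (cookies, user_agent), or None if missing or expired."""
        entry = self._store.get(domain)
        if entry is None:
            return None
        expires_at, cookies, user_agent = entry
        if time.monotonic() >= expires_at:
            del self._store[domain]  # drop the stale entry
            return None
        return cookies, user_agent
```

Check the cache before calling the solver; on a miss, solve once and `put` the result so every other request to that domain skips the solve call.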
## Production-ready scraper example
Here's a complete Python scraper that handles Cloudflare Challenge pages with proxy rotation, retry logic, and session reuse:
```python
import requests
import time
import random

API_KEY = "nsl_YOUR_API_KEY"
BASE_URL = "https://api.nslsolver.com/solve"
PROXIES = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
]

def solve_cloudflare(url, solve_type="challenge", retries=3):
    """Solve Cloudflare protection with retry logic."""
    for attempt in range(retries):
        try:
            proxy = random.choice(PROXIES)
            resp = requests.post(
                BASE_URL,
                headers={"X-API-Key": API_KEY},
                json={
                    "type": solve_type,
                    "url": url,
                    "proxy": proxy,
                },
                timeout=30,
            )
            resp.raise_for_status()
            data = resp.json()
            if data.get("success"):
                return data, proxy
            print(f"Attempt {attempt + 1} failed: solver returned success=false")
        except requests.RequestException as e:
            print(f"Attempt {attempt + 1} failed: {e}")
        if attempt < retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s

    return None, None

def scrape_page(url):
    """Scrape a Cloudflare-protected page."""
    data, proxy = solve_cloudflare(url)
    if not data:
        raise Exception("Failed to solve Cloudflare challenge")
    session = requests.Session()
    session.headers["User-Agent"] = data["userAgent"]
    session.proxies = {"http": proxy, "https": proxy}
    for cookie in data["cookies"]:
        session.cookies.set(cookie["name"], cookie["value"])
    response = session.get(url)
    response.raise_for_status()
    return response.text

# Usage
html = scrape_page("https://target-site.com/data")
print(f"Got {len(html)} bytes")
```

**Pro tip:** `cf_clearance` cookies are valid for 15-30 minutes. Cache and reuse them across requests to the same domain to avoid unnecessary solve calls and save money.
## Start scraping Cloudflare sites today
100 free solves on signup. No credit card required. Turnstile in ~250ms, Challenge in 2-5s.