
How to Fix 403 Forbidden Errors When Web Scraping

You send a request with Python and get a 403 Forbidden. The page loads fine in your browser, but your script hits a wall. In web scraping, a 403 rarely means you lack permission to view the page. It means the server flagged your request as automated.

In this guide, we'll cover what causes 403 errors in web scraping, the five detection vectors that trigger them, and seven Python fixes from proper headers to TLS fingerprinting. Let's get started!

Key Takeaways

  • A 403 Forbidden in web scraping means the server detected your request as automated, not that you lack access
  • Five detection vectors cause most 403 errors: missing headers, User-Agent strings, IP reputation, TLS fingerprints, and cookie validation
  • Fix 403 errors by escalating through solutions: headers, User-Agent rotation, delays, proxies, sessions, headless browsers, and TLS matching
  • Start with the simplest fix and escalate only when needed
  • Use Scrapfly when manual fixes get too complex to maintain

What Is a 403 Forbidden Error?

HTTP 403 Forbidden tells the client it can't access a resource. Unlike 401 Unauthorized (which means "who are you?"), a 403 means "I know who you are, but you're not allowed in."

The HTTP spec defines 403 for cases where the server understands the request but refuses to authorize it. Common causes include:

  • The user account lacks permission for that resource
  • The client's IP address is on a blocklist
  • Rate limits have been exceeded
  • The connection looks automated

In traditional web development, 403 errors usually signal a permissions problem. A user tries to open an admin page without admin rights, and the server returns 403.

Web scraping is different. When your Python script gets a 403 from a public page, the server isn't checking login credentials. It's checking whether you look like a real browser or a bot. Anti-bot systems from Cloudflare, Akamai, and DataDome analyze your request and block anything that doesn't match a real browser.

Response headers can give clues about why you were blocked. Check for X- prefixed headers that mention rate limits or blocking reasons. Some services return error pages with details, while others give a generic "Access Denied."

While 403 errors happen for many reasons, web scrapers hit them because of bot detection. The rest of this guide focuses on diagnosing and fixing that.

Why Do Web Scrapers Get 403 Errors?

Anti-bot systems check multiple signals to separate humans from bots. When even one signal looks off, the server returns a 403. Here are the five main detection vectors.

Missing or Suspicious HTTP Headers

A bare httpx.get(url) call sends only a handful of default headers. A real Chrome browser sends 15 or more. Servers check for headers like Accept, Accept-Language, Accept-Encoding, and the Sec-Fetch-* family. Missing these headers is a red flag.

Some anti-bot systems also check header order. Chrome sends headers in a specific sequence. If your HTTP client sends them differently, the server notices the mismatch.

User-Agent Detection

Python's default User-Agent (python-httpx/0.x.x or python-requests/2.x) gets blocked everywhere. Every anti-bot system rejects these strings. Some websites also maintain blocklists of known bot User-Agents.

Changing your User-Agent alone often isn't enough. If the rest of your headers still look like a Python script, the server catches the mismatch. But a proper User-Agent is a required first step.

IP Reputation and Rate Limiting

Datacenter IP addresses raise suspicion. Most real users browse from residential IPs provided by their internet service provider. If your scraper runs on AWS, Google Cloud, or a similar provider, many anti-bot systems block it on sight.

Rate limiting adds another layer. Too many requests from one IP in a short window triggers blocking. Some servers return a 429 Too Many Requests for this, but others use 403 to hide the blocking reason.

TLS and Browser Fingerprinting

This is the top reason "it works in my browser but not in Python." Every TLS client has a unique fingerprint based on its handshake parameters. Python's TLS fingerprint looks nothing like Chrome's.

Anti-bot systems use JA3 and JA4 hashing to identify which TLS library made the request. They also check HTTP/2 settings and frame ordering. Even with perfect headers and a Chrome User-Agent, a Python TLS fingerprint gives you away.

Cookie Validation

Some sites set cookies on the first visit and expect them on later requests. If you skip the homepage and go straight to a product page, the missing cookies trigger a 403.

JavaScript-set cookies make this harder. Python HTTP clients can't run JavaScript, so they miss cookies that a browser would receive. This explains the common pattern where your first request works but follow-up requests return 403.

How to Fix 403 Forbidden When Web Scraping

These seven solutions go from simplest to most advanced. Start at the top and move down until your 403 errors stop. Each fix targets a specific detection vector, so matching your symptom to the right solution saves time.

Set Proper HTTP Headers

The quickest fix for 403 errors is sending browser-like headers. This works when the server checks for missing headers but doesn't inspect TLS fingerprints.

Header Example

import httpx

# Mimic Chrome's request headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
}

response = httpx.get("https://web-scraping.dev/products", headers=headers)
print(response.status_code)

This header set covers what most servers check. Pay attention to the Sec-Fetch-* headers since many anti-bot systems flag requests that miss them.

Rotate User-Agents

If one User-Agent gets blocked after several requests, rotating through a list helps. Combine this with full header sets for the best results.

User-Agent Rotation

import httpx
import random

user_agents = [
    # Chrome on Windows
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    # Chrome on macOS
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    # Chrome on Linux
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    # Firefox on Windows
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:134.0) Gecko/20100101 Firefox/134.0",
    # Firefox on macOS
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14.7; rv:134.0) Gecko/20100101 Firefox/134.0",
]

# Pick a random User-Agent for each request
headers = {"User-Agent": random.choice(user_agents)}
response = httpx.get("https://web-scraping.dev/products", headers=headers)
print(response.status_code)

Keep your User-Agent list updated. Outdated browser versions stand out in server logs and can trigger blocks on their own.

Add Request Delays and Throttling

Rapid-fire requests with consistent timing look nothing like human browsing. Adding random delays between requests helps avoid rate-limiting 403 errors.

Delay Example

import httpx
import time
import random

urls = [
    "https://web-scraping.dev/product/1",
    "https://web-scraping.dev/product/2",
    "https://web-scraping.dev/product/3",
]

for url in urls:
    response = httpx.get(url)
    print(f"{url}: {response.status_code}")
    # Random delay between 2 and 5 seconds
    time.sleep(random.uniform(2, 5))

Use random.uniform() instead of a fixed delay so each pause is different. Consistent timing is a signal on its own. For async code, swap time.sleep() with asyncio.sleep().

Use Proxy Rotation

When 403 errors come from IP blocking or rate limiting, proxy rotation spreads your requests across many IP addresses. Residential proxies work best because their IPs match real internet users.

Proxy Rotation Example

import httpx
import random

proxies = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
]

# Pick a new proxy for each request
for _ in range(3):
    proxy = random.choice(proxies)
    with httpx.Client(proxy=proxy) as client:
        response = client.get("https://web-scraping.dev/products")
        print(response.status_code)

Datacenter proxies are cheaper but get blocked more often. Residential proxies cost more but pass IP reputation checks. Free proxies are unreliable and often already blocklisted.

Handle Cookies and Sessions

If your scraper gets 403 on the second or third request, missing cookies are likely the cause. Use an httpx Client to keep cookies across requests.

Session Example

import httpx

# Client persists cookies across requests
with httpx.Client() as client:
    # Visit homepage to collect session cookies
    client.get("https://web-scraping.dev/")

    # Later requests include those cookies
    response = client.get("https://web-scraping.dev/products")
    print(response.status_code)
    print(dict(client.cookies))

The httpx Client works like a browser session. It stores cookies from each response and sends them with the next request. This fixes the "works once, then 403" pattern that many scrapers run into.

Use a Headless Browser

When a site requires JavaScript to set cookies or pass bot checks, HTTP clients alone won't work. A headless browser runs real JavaScript and passes most fingerprint checks.

Playwright Example

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()

    # Navigate like a real browser
    page.goto("https://web-scraping.dev/products")

    # Get the rendered HTML
    html = page.content()
    print(f"Got {len(html)} characters of HTML")

    browser.close()

Headless browsers solve JavaScript fingerprinting and cookie issues. The tradeoff is speed and memory, so use them only when simpler fixes don't work.

Match TLS Fingerprints

This is the fix for "it works in my browser but not in Python." The curl_cffi library sends requests with Chrome's TLS fingerprint instead of Python's default. It solves TLS/JA3 detection without the overhead of a full browser.

curl_cffi Example

from curl_cffi import requests

# Impersonate Chrome's TLS fingerprint
response = requests.get(
    "https://web-scraping.dev/products",
    impersonate="chrome",
)
print(response.status_code)
print(response.text[:500])

The impersonate parameter tells curl_cffi to match Chrome's TLS handshake, HTTP/2 settings, and header order. You can also target exact versions like "chrome131". This library gives you the speed of HTTP clients with the fingerprint of a real browser.

When to Use Each Solution

Each solution targets a different detection method. Use this table to match your symptom with the right fix.

| Solution           | Fixes                      | Complexity | Speed Impact |
|--------------------|----------------------------|------------|--------------|
| Set proper headers | Missing headers detection  | Low        | None         |
| Rotate User-Agents | Basic bot detection        | Low        | None         |
| Add delays         | Rate limiting              | Low        | Slower       |
| Proxy rotation     | IP blocking, rate limiting | Medium     | Slight       |
| Handle sessions    | Cookie validation          | Medium     | None         |
| Headless browser   | JS fingerprinting          | High       | Much slower  |
| TLS fingerprinting | TLS/JA3 detection          | High       | None         |

Here's a quick decision guide based on what you're seeing:

  1. Getting 403 on the first request? Check your headers and User-Agent first.
  2. Getting 403 after several successes? Add delays and rotate proxies.
  3. Works in browser but not Python? Use curl_cffi for TLS matching or a headless browser.
  4. Getting 403 on return visits? Handle cookies with an httpx Client session.
  5. All of the above failing? Use Scrapfly to handle bypass for you.

Fix 403 Errors with Scrapfly

Scrapfly provides web scraping, screenshot, and extraction APIs for data collection at scale. Each product includes automatic bypass for anti-bot systems, achieved by:

  • Maintaining a fleet of real, reinforced web browsers with genuine fingerprint profiles.
  • Running millions of self-healing proxies with the highest possible trust scores.
  • Constantly evolving and adapting to new anti-bot systems.

We've been doing this publicly since 2020 with the best bypass on the market.

When manual fixes get too complex, Scrapfly handles 403 bypass for you. It routes requests through residential proxies, matches browser fingerprints, and runs JavaScript when needed.

Scrapfly Example

from scrapfly import ScrapeConfig, ScrapflyClient

client = ScrapflyClient(key="YOUR_SCRAPFLY_KEY")

result = client.scrape(ScrapeConfig(
    url="https://web-scraping.dev/products",
    # Turn on anti-bot bypass
    asp=True,
    # Use residential proxies
    proxy_pool="public_residential_pool",
    # Set target country
    country="us",
))
print(result.scrape_result['content'][:500])

One API call replaces headers, proxies, fingerprints, and session handling. Scrapfly picks the right bypass method for each target site.

FAQ

What is the difference between 401 Unauthorized and 403 Forbidden?

A 401 means the server doesn't know who you are, while a 403 means it knows you but won't let you in. In scraping, 401 points to a missing auth token and 403 usually means bot detection.

What is the difference between 403 Forbidden and 429 Too Many Requests?

A 429 explicitly signals a rate limit, while a 403 can mean the same thing but hides the reason. If you get 403 after many successful requests, rate limiting is the likely cause.

What is the difference between 403 Forbidden and 404 Not Found?

A 403 says the resource exists but you can't access it. A 404 says it doesn't exist. Some sites return 404 instead of 403 on purpose to hide the fact that a protected resource exists.

Why does my browser load the page but Python returns 403?

Your browser passes TLS fingerprint checks and runs JavaScript, but Python fails both. Fix this with curl_cffi for TLS matching or Playwright for full browser rendering.

How do I fix 403 Forbidden with Python requests?

Start by setting proper HTTP headers that look like Chrome. If that doesn't work, escalate through the seven solutions in this guide.

Does Cloudflare cause 403 errors when scraping?

Yes, Cloudflare uses TLS fingerprinting, JavaScript challenges, and behavior analysis to detect bots. See our Cloudflare bypass guide for targeted solutions.

Can rotating proxies fix 403 errors?

Only if the 403 comes from IP blocking or rate limiting. If the server blocks based on TLS fingerprints or missing headers, proxies alone won't help.

Why am I getting 403 after several successful requests?

Rate limiting or behavior analysis triggered a block after the server allowed your first few requests. Add random delays, rotate proxies, and vary your request patterns to fix it.

Summary

Most 403 Forbidden errors in web scraping come from bot detection, not permissions. The fix depends on what triggers the block. Start with proper HTTP headers and a real User-Agent. Add delays and proxy rotation for rate limiting. Use curl_cffi or a headless browser for TLS fingerprinting.

Match your symptom to the right solution. First-request 403 means bad headers. Mid-session 403 means rate limiting. "Works in browser only" means TLS fingerprinting.

For production scraping at scale, Scrapfly handles all these bypass techniques in a single API call. It picks the right method for each target and saves you from maintaining bypass code yourself.
