What is HTTP 406 Error? (Not Acceptable)

by mostafa Oct 23, 2024

#http

When working on web scraping or automation, encountering HTTP errors can be frustrating, and HTTP error 406 is one that indicates a mismatch in the type of content being requested.

In this article, we’ll explore what HTTP 406 means, the common causes behind it, and whether it could be used as a blocking strategy. We’ll also dive into how Scrapfly can help you bypass this error effectively.

What is HTTP Error 406?

The 406 Not Acceptable error occurs when the server is unable to deliver a response in a format that matches the criteria defined by the client's Accept- headers. Essentially, the server understands the request, but it has no version of the resource that fits the content types, languages, or encodings the client says it will accept.

What are HTTP 406 Error Causes?

The most common cause of a 406 error is misconfigured Accept- headers. These headers tell the server what kind of response the client is willing to receive:

Accept: Specifies the expected media type, like application/json or text/html.
Accept-Language: Indicates the preferred languages for the response, e.g., en-US.
Accept-Encoding: Defines the compression formats that the client can handle, like gzip or deflate.

If the server cannot provide a response that matches the specified Accept- headers, it will return a 406 status code.
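To make the matching step concrete, here is a minimal sketch of how a server might compare an Accept header against the formats it can produce. The function names (`parse_accept`, `matches`, `negotiate`) are hypothetical, and the logic is deliberately simplified: it ranks by q-value only and ignores the specificity rules and q=0 exclusions that a full implementation of HTTP content negotiation would honor.

```python
def parse_accept(header):
    """Split an Accept-style header into (value, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        value, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # q defaults to 1.0 when the client does not state it
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0
        items.append((value.lower(), q))
    # sorted() is stable, so equal q-values keep the client's order
    return sorted(items, key=lambda item: item[1], reverse=True)


def matches(media_range, media_type):
    """True if a range such as text/* or */* covers a concrete media type."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def negotiate(accept_header, available):
    """Return (status, media_type) for the formats the server can produce."""
    if not accept_header.strip():
        return 200, available[0]  # no preference stated: anything goes
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue  # simplification: q=0 entries are skipped, not enforced
        for media_type in available:
            if matches(media_range, media_type):
                return 200, media_type
    return 406, None  # nothing the client accepts can be produced


print(negotiate("application/json", ["text/html"]))
print(negotiate("application/xml, */*;q=0.8", ["text/html"]))
```

The first call ends in a 406 because the server in this sketch only has HTML, while the second succeeds thanks to the `*/*;q=0.8` fallback. Including such a wildcard fallback in your own Accept header is a common way to avoid the error.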

Practical Example

Let's explore how to configure headers, specifically Accept- headers, in common tools such as cURL and Python's httpx library.

cURL

curl -H "Accept: application/json" -H "Accept-Language: en-US" https://httpbin.dev/json

Python (httpx)

import httpx

url = "https://httpbin.dev/json"
headers = {
    "Accept": "application/json",  # Expecting JSON response
    "Accept-Language": "en-US",     # Preferring English
}

response = httpx.get(url, headers=headers)
print(response.status_code)
print(response.text)

JavaScript (fetch)

const url = "https://httpbin.dev/json";
const headers = {
    "Accept": "application/json",  // Expecting JSON response
    "Accept-Language": "en-US",    // Preferring English
};

fetch(url, { headers })
    .then(response => response.json())
    .then(data => console.log(data))
    .catch(error => console.error('Error:', error));

Rust (reqwest)

use reqwest::header::{ACCEPT, ACCEPT_LANGUAGE};
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client = reqwest::Client::new();
    let response = client
        .get("https://httpbin.dev/json")
        .header(ACCEPT, "application/json")
        .header(ACCEPT_LANGUAGE, "en-US")
        .send()
        .await?;

    println!("Status: {}", response.status());
    println!("Body: {}", response.text().await?);

    Ok(())
}

Go (net/http)

package main

import (
    "fmt"
    "io"
    "net/http"
)

func main() {
    client := &http.Client{}
    req, err := http.NewRequest("GET", "https://httpbin.dev/json", nil)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    req.Header.Add("Accept", "application/json") // Expecting JSON response
    req.Header.Add("Accept-Language", "en-US")   // Preferring English

    resp, err := client.Do(req)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    defer resp.Body.Close()

    // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    fmt.Println("Status:", resp.Status)
    fmt.Println("Body:", string(body))
}

Ruby (typhoeus)

require 'typhoeus'

url = "https://httpbin.dev/json"
response = Typhoeus.get(url, headers: {
    "Accept" => "application/json",     # Expecting JSON response
    "Accept-Language" => "en-US"        # Preferring English
})

puts "Status: #{response.code}"
puts "Body: #{response.body}"

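PHP (guzzle)

<?php
require 'vendor/autoload.php';  // Composer autoloader (assumes Guzzle was installed via Composer)

$url = "https://httpbin.dev/json";
$client = new \GuzzleHttp\Client();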
$response = $client->request('GET', $url, [
    'headers' => [
        'Accept' => 'application/json',     // Expecting JSON response
        'Accept-Language' => 'en-US',       // Preferring English
    ]
]);

echo "Status: " . $response->getStatusCode() . "\n";
echo "Body: " . $response->getBody();

In each example, the client requests a response in application/json format and prefers en-US as the response language. If the server cannot match these criteria, a 406 error might occur.

To avoid 406 errors, ensure that your Accept- headers are set appropriately for the resource you're trying to access.
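One way to put this into practice is to retry a request with progressively broader Accept values whenever a 406 comes back. The sketch below is an illustration under stated assumptions rather than a definitive implementation: `get_with_accept_fallback` and the `ACCEPT_FALLBACKS` ladder are hypothetical names and values, and `fake_send` is only a stand-in for a real HTTP call so the example runs without network access.

```python
from types import SimpleNamespace

# Assumed fallback ladder (not from any standard): start with the format we
# actually want, then widen until the server has something it can return.
ACCEPT_FALLBACKS = (
    "application/json",
    "application/json, text/html;q=0.9, */*;q=0.8",
    "*/*",
)


def get_with_accept_fallback(send, url, accept_values=ACCEPT_FALLBACKS):
    """Call send(url, headers=...) with ever-broader Accept values.

    `send` is any callable returning an object with a `status_code`
    attribute. Returns (response, accept_used); accept_used is None
    when every attempt still ended in a 406.
    """
    response = None
    for accept in accept_values:
        headers = {"Accept": accept, "Accept-Language": "en-US,en;q=0.9"}
        response = send(url, headers=headers)
        if response.status_code != 406:
            return response, accept
    return response, None


def fake_send(url, headers):
    """Stand-in for a real HTTP call: this pretend server only has HTML."""
    accept = headers["Accept"]
    ok = "text/html" in accept or "*/*" in accept
    return SimpleNamespace(status_code=200 if ok else 406)


response, accept_used = get_with_accept_fallback(fake_send, "https://example.com/")
print(response.status_code, accept_used)
```

With a real client, the call could look like `get_with_accept_fallback(httpx.get, "https://httpbin.dev/json")`, since `httpx.get` accepts a `headers` keyword and returns an object with `status_code`. If the 406 persists even with `*/*`, the Accept header is probably not the real cause.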

406 in Web Scraping

In web scraping, the 406 status code is most commonly encountered when the Accept- family headers are missing or misconfigured.

Many HTTP clients send only a generic Accept: */* header, or none at all, and no Accept-Language, so you often need to set browser-like values manually. To find out which headers are needed, look at how the website behaves in your web browser using the browser's Developer Tools. In the Network tab, you can see the exact Accept- headers your browser is sending and replicate them in your scrapers.
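As a rough sketch of that replication step, the helper below takes request headers pasted as plain text from the Network tab and keeps only the Accept- family. The function name `accept_headers_from_devtools` is hypothetical, and the pasted values are illustrative; the headers your own browser sends will differ.

```python
ACCEPT_FAMILY = {"accept", "accept-language", "accept-encoding"}


def accept_headers_from_devtools(raw):
    """Extract Accept, Accept-Language and Accept-Encoding from pasted headers."""
    headers = {}
    for line in raw.splitlines():
        # Split on the first colon only; HTTP/2 pseudo-headers such as
        # ":authority" yield an empty name and are skipped by the check below.
        name, sep, value = line.strip().partition(":")
        if sep and name.strip().lower() in ACCEPT_FAMILY:
            headers[name.strip()] = value.strip()
    return headers


# Illustrative paste; the values your browser sends will differ.
RAW = """\
:authority: example.com
accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
accept-encoding: gzip, deflate
accept-language: en-US,en;q=0.9
user-agent: Mozilla/5.0
"""

browser_headers = accept_headers_from_devtools(RAW)
print(browser_headers)
```

The resulting dictionary can be passed as the `headers` argument of the httpx example above. Only advertise Accept-Encoding values that your HTTP client can actually decompress, otherwise the response body may arrive in a format the client cannot read.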

Alternatively, there's a small possibility that the 406 error is returned deliberately by the server to block web scraping and deceive the scraper into thinking there's a technical issue. If that's the case, see our guide on fortifying web scrapers against blocking.

Power Up with Scrapfly

Scrapfly provides web scraping, screenshot, and extraction APIs for data collection at scale.

Anti-bot protection bypass - scrape web pages without blocking!
Rotating residential proxies - prevent IP address and geographic blocks.
JavaScript rendering - scrape dynamic web pages through cloud browsers.
Full browser automation - control browsers to scroll, input and click on objects.
Format conversion - scrape as HTML, JSON, Text, or Markdown.
Python and TypeScript SDKs, as well as Scrapy and no-code tool integrations.

It takes Scrapfly several full-time engineers to maintain this system, so you don't have to!

Summary

HTTP 406 errors are caused by a mismatch between the Accept- headers sent by the client and the formats the server can deliver. While unlikely, these errors can sometimes be used as a blocking mechanism. Using Scrapfly’s advanced tools, including proxy rotation and customizable requests, you can bypass 406 blocks and keep your web scraping running smoothly.
