What is HTTP 406 Error? (Not Acceptable)

When working on web scraping or automation, HTTP errors can be frustrating. HTTP error 406 is one that indicates a mismatch between the content the client asks for and the content the server can provide.

In this article, we’ll explore what HTTP 406 means, the common causes behind it, and whether it could be used as a blocking strategy. We’ll also dive into how Scrapfly can help you bypass this error effectively.

What is HTTP Error 406?

The 406 Not Acceptable error occurs when the server cannot deliver a response in a format that matches the criteria defined by the client's Accept- headers. Essentially, the server understands the request but cannot produce a response in any of the content types or formats the client is willing to accept.
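To make the mechanism concrete, here is a minimal sketch of how a strict server might pick a response format from an Accept header. The function names (parse_accept, quality, negotiate) and the list of available types are illustrative assumptions, not any particular framework's API. Real servers are also allowed to ignore an unsatisfiable Accept header and respond anyway, so this models only the strict case.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        media_range, *params = [p.strip() for p in part.split(";")]
        if not media_range:
            continue
        q = 1.0  # a range without an explicit weight counts as q=1
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat an unparsable weight as unacceptable
        ranges.append((media_range.lower(), q))
    return ranges


def quality(media_type: str, ranges: list[tuple[str, float]]) -> float:
    """Return the q of the most specific range matching media_type, or 0.0."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == "*/*":
            specificity = 0
        elif media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
            specificity = 1
        elif media_range == media_type:
            specificity = 2
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(accept_header: str, available: list[str]) -> tuple[int, str | None]:
    """Return (status, chosen_type); 406 when nothing acceptable is available."""
    # An empty or missing Accept header means the client accepts anything.
    ranges = parse_accept(accept_header) or [("*/*", 1.0)]
    scored = [(quality(media_type, ranges), media_type) for media_type in available]
    best_q, best_type = max(scored, key=lambda item: item[0], default=(0.0, None))
    if best_q <= 0:
        return 406, None
    return 200, best_type


print(negotiate("application/json", ["application/json", "text/html"]))  # (200, 'application/json')
print(negotiate("application/xml", ["application/json", "text/html"]))   # (406, None)
```

The second call fails because the hypothetical server only offers JSON and HTML while the client insists on XML, which is exactly the situation a 406 describes.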

What are HTTP 406 Error Causes?

The most common cause of a 406 error is misconfigured Accept- headers. These headers tell the server what content types the client expects in the response, such as:

  • Accept: Specifies the expected media type, like application/json or text/html.
  • Accept-Language: Indicates the preferred languages for the response, e.g., en-US.
  • Accept-Encoding: Defines the compression formats that the client can handle, like gzip or deflate.

If the server cannot provide a response that matches the specified Accept- headers, it will return a 406 status code.
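Each of these headers can list several alternatives weighted with q-values, which gives the server more room to find a match. The sketch below builds such a header set; build_accept is a hypothetical helper, and the specific weights and the trailing wildcard are assumptions chosen for illustration rather than recommended values.

```python
from __future__ import annotations


def build_accept(preferred: list[str], fallback: bool = True) -> str:
    """Join media types into an Accept value, most preferred first.

    The first type gets the implicit q=1, later ones get decreasing
    weights, and an optional low-priority */* lets the server fall back
    to whatever it has.
    """
    parts = []
    for index, media_type in enumerate(preferred):
        tenths = max(10 - index, 2)  # 10 -> q=1 (implicit), 9 -> q=0.9, floor at q=0.2
        parts.append(media_type if tenths == 10 else f"{media_type};q=0.{tenths}")
    if fallback:
        parts.append("*/*;q=0.1")
    return ", ".join(parts)


headers = {
    "Accept": build_accept(["application/json", "text/html"]),
    "Accept-Language": "en-US, en;q=0.9, *;q=0.1",  # prefer US English, accept any language
    "Accept-Encoding": "gzip, deflate",
}
print(headers["Accept"])  # application/json, text/html;q=0.9, */*;q=0.1
```

Leaving the wildcard in place trades strictness for robustness: the server may answer in a format you did not prefer, so check the response's Content-Type before parsing.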

Practical Example

Let's look at how to set request headers, specifically the Accept- headers, in cURL and in popular HTTP clients such as Python's httpx.

cURL:

curl -H "Accept: application/json" -H "Accept-Language: en-US" https://httpbin.dev/json

Python (httpx):

import httpx

url = "https://httpbin.dev/json"
headers = {
    "Accept": "application/json",  # Expecting JSON response
    "Accept-Language": "en-US",     # Preferring English
}

response = httpx.get(url, headers=headers)
print(response.status_code)
print(response.text)

Javascript (fetch):

const url = "https://httpbin.dev/json";
const headers = {
    "Accept": "application/json",  // Expecting JSON response
    "Accept-Language": "en-US",    // Preferring English
};

fetch(url, { headers })
    .then(response => {
        console.log("Status:", response.status);
        return response.json();
    })
    .then(data => console.log(data))
    .catch(error => console.error('Error:', error));

Rust:

use reqwest::header::{ACCEPT, ACCEPT_LANGUAGE};
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client = reqwest::Client::new();
    let response = client
        .get("https://httpbin.dev/json")
        .header(ACCEPT, "application/json") // Expecting JSON response
        .header(ACCEPT_LANGUAGE, "en-US")   // Preferring English
        .send()
        .await?;

    println!("Status: {}", response.status());
    println!("Body: {}", response.text().await?);

    Ok(())
}

Go:

package main

import (
    "fmt"
    "io"
    "net/http"
)

func main() {
    client := &http.Client{}
    req, err := http.NewRequest("GET", "https://httpbin.dev/json", nil)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    req.Header.Set("Accept", "application/json") // Expecting JSON response
    req.Header.Set("Accept-Language", "en-US")   // Preferring English

    resp, err := client.Do(req)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    defer resp.Body.Close()

    // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    fmt.Println("Status:", resp.Status)
    fmt.Println("Body:", string(body))
}

Ruby (typhoeus):

require 'typhoeus'

url = "https://httpbin.dev/json"
response = Typhoeus.get(url, headers: {
    "Accept" => "application/json",     # Expecting JSON response
    "Accept-Language" => "en-US"        # Preferring English
})

puts "Status: #{response.code}"
puts "Body: #{response.body}"

PHP (guzzle):

<?php
require 'vendor/autoload.php';  // Composer autoloader

$url = "https://httpbin.dev/json";
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', $url, [
    'headers' => [
        'Accept' => 'application/json',     // Expecting JSON response
        'Accept-Language' => 'en-US',       // Preferring English
    ],
    'http_errors' => false,  // Return 4xx responses such as 406 instead of throwing
]);

echo "Status: " . $response->getStatusCode() . "\n";
echo "Body: " . $response->getBody();

In each example, the client requests a response in application/json format and prefers en-US as the response language. If the server cannot match these criteria, a 406 error might occur.

To avoid 406 errors, ensure that your Accept- headers are set appropriately for the resource you're trying to access.
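When you don't know in advance which formats a resource supports, one option is to retry with progressively broader Accept values. The following is a rough sketch of that idea, not a definitive implementation: get_with_accept_fallback and the send callable (url, headers) -> status code are hypothetical names introduced here so the logic stays independent of any specific HTTP client.

```python
from __future__ import annotations

from typing import Callable, Iterable, Optional

# Assumed shape of the transport: takes a URL and headers, returns the status code.
Send = Callable[[str, dict], int]


def get_with_accept_fallback(
    send: Send,
    url: str,
    accept_chain: Iterable[str],
    base_headers: Optional[dict] = None,
) -> tuple[int, str]:
    """Try each Accept value in order until the server stops answering 406.

    Returns the last status code and the Accept value that produced it.
    """
    candidates = list(accept_chain)
    if not candidates:
        raise ValueError("accept_chain must contain at least one value")

    status, accept = 406, candidates[0]
    for accept in candidates:
        headers = {**(base_headers or {}), "Accept": accept}
        status = send(url, headers)
        if status != 406:
            break  # any other status means negotiation is no longer the problem
    return status, accept


# Usage with httpx (requires network access, so it is left commented out):
# import httpx
# send = lambda url, headers: httpx.get(url, headers=headers).status_code
# print(get_with_accept_fallback(
#     send,
#     "https://httpbin.dev/json",
#     ["application/json", "application/json, text/html;q=0.9", "*/*"],
# ))
```

Ordering the chain from strict to permissive keeps the preferred format when the server supports it and only falls back to the wildcard as a last resort.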

Can 406 Mean Blocking?

Although HTTP error 406 typically points to Accept- headers that are not set appropriately, error codes are not always used consistently by websites. In rare cases, websites might misconfigure their responses or use 406 as a way to block certain requests.
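One rough way to tell the two cases apart is to repeat the request with every Accept- constraint relaxed: if the server still answers 406 when the client accepts anything, content negotiation is an unlikely explanation. The sketch below encodes that heuristic. classify_406, its return labels, and the send callable (url, headers) -> status code are assumptions made for illustration, and the result is a hint rather than proof of blocking.

```python
from __future__ import annotations

from typing import Callable

# Assumed shape of the transport: takes a URL and headers, returns the status code.
Send = Callable[[str, dict], int]


def classify_406(send: Send, url: str, original_headers: dict) -> str:
    """Return 'not-406', 'negotiation', or 'possible-block' for a URL."""
    if send(url, original_headers) != 406:
        return "not-406"

    # Drop every Accept-* header, then explicitly accept any media type.
    permissive = {
        name: value
        for name, value in original_headers.items()
        if not name.lower().startswith("accept")
    }
    permissive["Accept"] = "*/*"

    if send(url, permissive) != 406:
        return "negotiation"  # relaxing the headers fixed it
    return "possible-block"   # still 406 with nothing left to negotiate
```

A "possible-block" result is the point where changing other request properties, such as the IP address or User-Agent, becomes worth testing.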

While it’s unlikely that a 406 error code means you’re being blocked, it's still good practice to test the request with rotating proxies or with adjusted Accept- headers.

Bypass 406 Blocks with Scrapfly

It is unlikely for a 406 error to mean you are being blocked. But if it does, Scrapfly will handle it for you!

ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.

It takes Scrapfly several full-time engineers to maintain this system, so you don't have to!

Summary

HTTP 406 errors are caused by a mismatch between the Accept- headers sent by the client and the formats the server can deliver. While unlikely, these errors can sometimes be used as a blocking mechanism. Using Scrapfly’s advanced tools, including proxy rotation and customizable requests, you can bypass 406 blocks and keep your web scraping running smoothly.
