What is HTTP 406 Error? (Not Acceptable)

When working on web scraping or automation, encountering HTTP errors can be frustrating. HTTP error 406 is one that indicates the server cannot produce a response matching the content preferences the client asked for.

In this article, we’ll explore what HTTP 406 means, the common causes behind it, and whether it could be used as a blocking strategy. We’ll also dive into how Scrapfly can help you bypass this error effectively.

What is HTTP Error 406?

The 406 Not Acceptable error occurs when the server is unable to deliver a response in a format that matches the criteria defined by the client's Accept- headers. Essentially, the server understands the request, but it cannot find a response that fits the content types or formats that the client is willing to accept.

What Causes HTTP 406 Errors?

The most common cause of a 406 error is misconfigured Accept- headers. These headers tell the server what kind of response the client can handle:

  • Accept: Specifies the expected media type, like application/json or text/html.
  • Accept-Language: Indicates the preferred languages for the response, e.g., en-US.
  • Accept-Encoding: Defines the compression formats that the client can handle, like gzip or deflate.

If the server cannot provide a response that matches the specified Accept- headers, it will return a 406 status code.
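To make the matching step more concrete, here is a minimal sketch of how a server might compare an Accept header against the formats it can produce. The function names (parse_accept, matches, negotiate) are hypothetical and the logic is deliberately simplified: it ranks media ranges by their q weight only and ignores the specificity rules a real server would apply, so treat it as an illustration rather than how any particular server works.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_range, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_range.lower(), q))
    # sorted() is stable, so equal weights keep the client's original order
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range: str, media_type: str) -> bool:
    """Return True if a media range such as text/* covers a concrete type."""
    if media_range == "*/*":
        return True
    range_type, _, range_subtype = media_range.partition("/")
    type_, _, subtype = media_type.partition("/")
    return range_type == type_ and range_subtype in ("*", subtype)


def negotiate(accept_header: str, available: List[str]) -> Tuple[int, Optional[str]]:
    """Pick a representation, or report 406 when nothing is acceptable."""
    for media_range, q in parse_accept(accept_header):
        if q == 0:
            continue  # q=0 marks a range the client refuses
        for media_type in available:
            if matches(media_range, media_type):
                return 200, media_type
    return 406, None


# A server that can only produce HTML, asked for JSON only:
print(negotiate("application/json", ["text/html"]))  # (406, None)
# The same server when the client also accepts HTML as a fallback:
print(negotiate("application/json, text/html;q=0.8", ["text/html"]))  # (200, 'text/html')
```

In practice many servers skip this check and simply return their default format, which is why a 406 tends to show up only on stricter APIs and frameworks.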

Practical Example

Let's explore how to configure headers, specifically Accept- headers, in common HTTP clients such as cURL, Python's httpx library, and JavaScript's fetch.

cURL

curl -H "Accept: application/json" -H "Accept-Language: en-US" https://httpbin.dev/json

Python (httpx)

import httpx

url = "https://httpbin.dev/json"
headers = {
    "Accept": "application/json",  # Expecting JSON response
    "Accept-Language": "en-US",     # Preferring English
}

response = httpx.get(url, headers=headers)
print(response.status_code)
print(response.text)

JavaScript (fetch)

const url = "https://httpbin.dev/json";
const headers = {
    "Accept": "application/json",  // Expecting JSON response
    "Accept-Language": "en-US",    // Preferring English
};

fetch(url, { headers })
    .then(response => response.json())
    .then(data => console.log(data))
    .catch(error => console.error('Error:', error));

Rust

use reqwest::header::{ACCEPT, ACCEPT_LANGUAGE};
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client = reqwest::Client::new();
    let response = client
        .get("https://httpbin.dev/json")
        .header(ACCEPT, "application/json")
        .header(ACCEPT_LANGUAGE, "en-US")
        .send()
        .await?;

    println!("Status: {}", response.status());
    println!("Body: {}", response.text().await?);

    Ok(())
}

Go

package main

import (
    "fmt"
    "io"
    "net/http"
)

func main() {
    client := &http.Client{}
    req, err := http.NewRequest("GET", "https://httpbin.dev/json", nil)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    req.Header.Add("Accept", "application/json")
    req.Header.Add("Accept-Language", "en-US")

    resp, err := client.Do(req)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    defer resp.Body.Close()

    // io.ReadAll replaces ioutil.ReadAll, which is deprecated since Go 1.16
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    fmt.Println("Status:", resp.Status)
    fmt.Println("Body:", string(body))
}

Ruby (typhoeus)

require 'typhoeus'

url = "https://httpbin.dev/json"
response = Typhoeus.get(url, headers: {
    "Accept" => "application/json",     # Expecting JSON response
    "Accept-Language" => "en-US"        # Preferring English
})

puts "Status: #{response.code}"
puts "Body: #{response.body}"

PHP (guzzle)

<?php
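// Load Guzzle through Composer's autoloader (assumes a standard Composer install)
require 'vendor/autoload.php';

$url = "https://httpbin.dev/json";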
$client = new \GuzzleHttp\Client();
$response = $client->request('GET', $url, [
    'headers' => [
        'Accept' => 'application/json',     // Expecting JSON response
        'Accept-Language' => 'en-US',       // Preferring English
    ]
]);

echo "Status: " . $response->getStatusCode() . "\n";
echo "Body: " . $response->getBody();

In each example, the client requests a response in application/json format and prefers en-US as the response language. If the server cannot satisfy these criteria, it may return a 406 error.

To avoid 406 errors, ensure that your Accept- headers are set appropriately for the resource you're trying to access.
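When the right headers are not known in advance, one option is to retry with progressively broader Accept- headers. The sketch below is only an illustration: ACCEPT_FALLBACKS and get_with_fallback are hypothetical names, and send stands for any function you supply that performs the request with the given headers and returns the status code (for example, a thin wrapper around httpx.get).

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical fallback ladder: strictest preference first, most permissive last.
ACCEPT_FALLBACKS: List[Dict[str, str]] = [
    {"Accept": "application/json", "Accept-Language": "en-US"},
    {
        "Accept": "application/json, text/html;q=0.9, */*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
    },
    {"Accept": "*/*"},
]


def get_with_fallback(
    send: Callable[[Dict[str, str]], int],
    attempts: List[Dict[str, str]] = ACCEPT_FALLBACKS,
) -> Tuple[Optional[int], Optional[Dict[str, str]]]:
    """Try each header set in order and stop at the first non-406 reply.

    Returns (status, headers_that_worked), or (last_status, None) when
    every attempt was answered with 406.
    """
    status = None
    for headers in attempts:
        status = send(headers)
        if status != 406:
            return status, headers
    return status, None


# Stand-in for a real request: this fake server can only produce HTML.
def fake_send(headers: Dict[str, str]) -> int:
    return 200 if "text/html" in headers["Accept"] else 406


status, used = get_with_fallback(fake_send)
print(status, used["Accept"])
```

With a real client, send could be written as lambda headers: httpx.get(url, headers=headers).status_code. Keep in mind that broadening Accept means the response may arrive in a different format than the one you first asked for, so check the Content-Type before parsing.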

406 in Web Scraping

In web scraping, the 406 status code is most commonly encountered when Accept- headers are missing or misconfigured.

Many HTTP clients send only a generic Accept: */* header (or none at all) and no Accept-Language header, so you need to set browser-like values manually. To find out which headers are needed, check how the website behaves in your web browser using the browser developer tools. In the Network tab, you can see the exact Accept- headers your browser is sending and replicate them in your scrapers.
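As a rough sketch of that replication step, the snippet below keeps a set of browser-like Accept- headers in one place and lets individual requests override them. The values in BROWSER_HEADERS are examples of what a desktop browser might send, not an authoritative list, so copy the real ones from your own Network tab. The helper name browser_like_headers is hypothetical.

```python
from typing import Dict, Optional

# Example values only (assumption): replace with what your browser actually sends.
BROWSER_HEADERS: Dict[str, str] = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    # Only advertise encodings your HTTP client can actually decode.
    "Accept-Encoding": "gzip, deflate",
}


def browser_like_headers(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge per-request overrides into the browser-like defaults.

    Header names are case-insensitive, so "accept" replaces "Accept"
    instead of being sent as a second, conflicting header.
    """
    merged = {name.lower(): (name, value) for name, value in BROWSER_HEADERS.items()}
    for name, value in (overrides or {}).items():
        merged[name.lower()] = (name, value)
    return dict(merged.values())


# Ask for JSON from an API endpoint while keeping the other browser defaults.
headers = browser_like_headers({"Accept": "application/json"})
for name, value in headers.items():
    print(f"{name}: {value}")
```

The resulting dictionary can be passed to the headers argument of a client such as httpx.get, as in the examples earlier in this article.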

Alternatively, there's a small possibility that the server returns a 406 error deliberately to block web scraping and to deceive the scraper into thinking there's a technical issue. If that's the case, see our guide on fortifying web scrapers against blocking.
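One rough way to tell the two cases apart is to repeat the request with headers that accept anything. If the 406 persists, content negotiation is unlikely to be the real cause. The sketch below is a heuristic, not proof of blocking: diagnose_406 and PERMISSIVE_HEADERS are hypothetical names, and send is again any function you provide that performs the request and returns the status code.

```python
from typing import Callable, Dict

# Headers that place no restriction on type, language, or encoding.
PERMISSIVE_HEADERS: Dict[str, str] = {
    "Accept": "*/*",
    "Accept-Language": "*",
    "Accept-Encoding": "identity",
}


def diagnose_406(send: Callable[[Dict[str, str]], int], original: Dict[str, str]) -> str:
    """Classify a 406 by comparing the original and fully permissive requests."""
    if send(original) != 406:
        return "ok: no 406 with the original headers"
    if send(PERMISSIVE_HEADERS) != 406:
        return "negotiation: the original Accept- headers were too restrictive"
    return "suspicious: 406 persists even though any format is acceptable"


# Stand-ins for real requests (assumptions for the demo):
def strict_server(headers: Dict[str, str]) -> int:
    # Can only serve HTML; honours the Accept header.
    accept = headers.get("Accept", "*/*")
    return 200 if "text/html" in accept or "*/*" in accept else 406


def blocking_server(headers: Dict[str, str]) -> int:
    # Answers 406 no matter which headers are sent.
    return 406


print(diagnose_406(strict_server, {"Accept": "application/json"}))
print(diagnose_406(blocking_server, {"Accept": "application/json"}))
```

A "suspicious" result only means the Accept- headers are not the trigger. The server may be reacting to something else in the request, so treat it as a prompt for further investigation.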

Power Up with Scrapfly

Scrapfly provides web scraping, screenshot, and extraction APIs for data collection at scale.

It takes Scrapfly several full-time engineers to maintain this system, so you don't have to!

Summary

HTTP 406 errors are caused by a mismatch between the Accept- headers sent by the client and the formats the server can deliver. While unlikely, these errors can sometimes be used as a blocking mechanism. Using Scrapfly’s advanced tools, including proxy rotation and customizable requests, you can bypass 406 blocks and keep your web scraping running smoothly.
