What is HTTP 422 Error? (Unprocessable Entity)

What is HTTP Error 422?

HTTP error 422 Unprocessable Entity is returned when the server understands the request and finds its content syntactically correct, yet semantically invalid. Essentially, the data you’re submitting is well-formed, but something about it is incorrect or incomplete, making it impossible for the server to process.

What are HTTP 422 Error Causes?

The primary cause of a 422 error code is sending data that, while properly formatted, is not valid according to the server's expectations. This often happens with POST requests when submitting form data, JSON, or XML that fails the server's validation rules.

For example, submitting a well-formed JSON document that lacks required fields or contains invalid values could trigger a 422 error.

It’s important to ensure that the content being sent matches the server’s requirements, such as validation rules, data types, or required fields, to avoid this error.
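The distinction between malformed and well-formed-but-invalid data can be sketched in a few lines of Python. This is a minimal illustration, not how any particular server works: the `REQUIRED_FIELDS` rules and the `classify_payload` helper are hypothetical names, and the choice of 400 for unparseable JSON reflects a common convention that individual frameworks may not follow.

```python
import json

# Hypothetical validation rules, for illustration only
REQUIRED_FIELDS = {"email": str, "age": int}


def classify_payload(raw_body: str) -> tuple[int, list[str]]:
    """Return (status_code, errors) for a raw JSON request body."""
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        # The body is not even parseable: commonly answered with 400, not 422
        return 400, [f"malformed JSON: {exc.msg}"]

    # From here on the body is well-formed, so any failure is semantic
    if not isinstance(data, dict):
        return 422, ["expected a JSON object"]

    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing required field: {field}")
        elif type(data[field]) is not expected_type:
            errors.append(f"invalid type for field: {field}")

    return (422, errors) if errors else (200, [])


if __name__ == "__main__":
    print(classify_payload('{"email": "valid@example.com", "age": 30}'))
    print(classify_payload('{"email": "valid@example.com"}'))
    print(classify_payload('{"email": "valid@example.com"'))
```

The second call parses cleanly but lacks a required field, which is the situation a 422 is meant to describe.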

Practical Example

To demonstrate how a server might return an HTTP 422 status code, let's build a simple Flask API with a /submit endpoint that accepts POST requests. This example mimics submitting data to an API and returns a 422 error when the submitted data does not meet the server's validation rules (e.g., invalid email format).

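from flask import Flask, jsonify, request

app = Flask(__name__)

# A simple validation function to check for a plausible email format
def is_valid_email(email):
    return isinstance(email, str) and "@" in email and "." in email

@app.route("/submit", methods=["POST"])
def submit():
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        # Bad Request: the body is malformed or is not a JSON object
        return jsonify({"error": "Request body must be a JSON object."}), 400

    email = data.get("email")

    # Check if email is provided and valid
    if not is_valid_email(email):
        # Unprocessable Entity: well-formed request, but the email is missing or invalid
        return jsonify({"error": "Invalid email format."}), 422

    # Otherwise, process the request
    return jsonify({"message": "Data submitted successfully."}), 201

if __name__ == "__main__":
    app.run(debug=True)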

In the example above, we simulate a /submit endpoint that accepts POST requests containing JSON data. The server expects a valid email address in the request. If the email is missing or does not meet the simple validation check (containing "@" and "."), the server returns a 422 error, indicating the request is well-formed but semantically incorrect (i.e., invalid email). If the email is valid, the server processes the request and returns a success message.

We can test this server with an HTTP client:

Python (httpx)

import httpx

# Test successful submission with a valid email
response = httpx.post("http://127.0.0.1:5000/submit", json={"email": "valid@example.com"})
print(f"Successful Submission: {response.status_code}, {response.json()}")

# Test failed submission with an invalid email
response = httpx.post("http://127.0.0.1:5000/submit", json={"email": "invalid-email"})
print(f"Failed Submission: {response.status_code}, {response.json()}")

JavaScript (fetch)

// Test successful submission with a valid email
fetch("http://127.0.0.1:5000/submit", {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
    },
    body: JSON.stringify({ email: "valid@example.com" }),
})
    .then(response => response.json().then(data => console.log("Successful Submission:", response.status, data)))
    .catch(error => console.error("Error:", error));

// Test failed submission with an invalid email
fetch("http://127.0.0.1:5000/submit", {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
    },
    body: JSON.stringify({ email: "invalid-email" }),
})
    .then(response => response.json().then(data => console.log("Failed Submission:", response.status, data)))
    .catch(error => console.error("Error:", error));

cURL

# Test successful submission with a valid email
curl -X POST http://127.0.0.1:5000/submit -H "Content-Type: application/json" -d '{"email": "valid@example.com"}'

# Test failed submission with an invalid email
curl -X POST http://127.0.0.1:5000/submit -H "Content-Type: application/json" -d '{"email": "invalid-email"}'

422 in Web Scraping

In web scraping, the 422 HTTP status code is usually encountered when an error is made while generating POST or PUT data. So, ensure that the posted data is valid for its format, be it JSON, HTML or XML, to avoid this error.

Furthermore, as scrapers don't know exactly how the server reads the received data, it can be difficult to debug the exact cause. Browser Developer Tools can be used to inspect exactly how a website formats its data, including details like symbol escaping and indentation, all of which can play a part in data processing. Replicating this behavior exactly will decrease the chances of encountering HTTP status 422 while scraping.
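As one illustration of such formatting differences, Python's `json.dumps` does not serialize the same way as the `JSON.stringify` call many websites use in the browser: by default it adds spaces after separators and escapes non-ASCII characters. The sketch below assumes the target site sends compact `JSON.stringify` output; `browser_style_json` is a hypothetical helper name, and other value types (floats, for instance) may still serialize differently.

```python
import json


def browser_style_json(payload: dict) -> str:
    """Serialize like JavaScript's JSON.stringify for simple payloads:
    no whitespace after separators and non-ASCII characters left unescaped."""
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)


if __name__ == "__main__":
    payload = {"email": "valid@example.com", "name": "Zoë"}

    # Python default: spaces after ':' and ',', and 'ë' escaped as \u00eb
    print(json.dumps(payload))

    # Browser-like form, shown as the UTF-8 bytes that would go on the wire
    print(browser_style_json(payload).encode("utf-8"))
```

With httpx, a pre-serialized body like this can be sent through the `content=` argument together with an explicit `Content-Type: application/json` header, instead of `json=`, so that the exact bytes are under your control.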

A 422 error could also mean that the server is deliberately blocking your requests, returning a 422 status code to signal that you are not allowed to access the resource. If you're receiving this status code on a GET request, that could be a sign of blocking.
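That rule of thumb can be written down as a small check for a scraper's retry logic. This is only a heuristic sketch under the assumption stated above (a 422 on a request that normally carries no payload has nothing to validate); `looks_like_blocking` is a hypothetical helper, not part of any library.

```python
# Methods that normally carry no request body, so there is no payload to reject
BODYLESS_METHODS = {"GET", "HEAD"}


def looks_like_blocking(method: str, status_code: int) -> bool:
    """Heuristic: treat a 422 on a body-less request as a possible block."""
    return status_code == 422 and method.upper() in BODYLESS_METHODS


if __name__ == "__main__":
    print(looks_like_blocking("GET", 422))   # possible blocking
    print(looks_like_blocking("POST", 422))  # more likely a payload problem
```

A positive result is a hint to investigate further, not proof that the scraper is blocked.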

Power Up with Scrapfly

ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.

It takes Scrapfly several full-time engineers to maintain this system, so you don't have to!

Summary

HTTP 422 errors typically result from submitting well-formed but invalid data, often in POST requests. While it's unlikely that 422 errors are used to block scrapers, it’s always best to test with rotating proxies if the issue persists. Using Scrapfly’s advanced tools, you can bypass these potential blocks and ensure your tasks continue without disruption.