HTTP error 422 Unprocessable Entity occurs when the server understands the request and finds its content syntactically correct, yet semantically invalid. Essentially, the data you’re submitting may be well-formed, but something about it is incorrect or incomplete, making it impossible for the server to process.
What Causes HTTP 422 Errors?
The primary cause of a 422 error code is sending data that, while properly formatted, is not valid according to the server's expectations. This often happens with POST requests when the submitted form data, JSON, or XML breaks the server's validation rules.
For example, submitting a well-formed JSON document that lacks required fields or contains invalid values could trigger a 422 error. A document that is not valid JSON at all is a syntax problem instead, which servers typically answer with 400 Bad Request.
It’s important to ensure that the content being sent matches the server’s requirements, such as validation rules, data types, or required fields, to avoid this error.
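The distinction between malformed and well-formed-but-invalid data can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular server works: the `REQUIRED_FIELDS` rules and the `check_payload` helper are hypothetical names invented for this example.

```python
import json

# Hypothetical validation rules for illustration: field name -> expected type.
REQUIRED_FIELDS = {"email": str, "age": int}


def check_payload(raw_body):
    """Return (status_code, errors) for a raw JSON request body."""
    try:
        data = json.loads(raw_body)
    except ValueError:
        # Broken syntax is a 400 Bad Request, not a 422.
        return 400, ["body is not valid JSON"]

    if not isinstance(data, dict):
        return 422, ["body must be a JSON object"]

    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"{field}: missing required field")
        elif type(data[field]) is not expected_type:
            # Exact type match, so JSON true/false does not pass as an int.
            errors.append(f"{field}: expected {expected_type.__name__}")

    # Well-formed JSON that breaks a rule is the 422 case.
    return (422, errors) if errors else (200, [])
```

Returning the list of failed rules alongside the status code is a design choice that makes a 422 response much easier to debug from the client side.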
Practical Example
To demonstrate how a server might return an HTTP 422 status code, let's build a simple Flask API with a /submit endpoint that accepts POST requests. This example mimics submitting data to an API and returns a 422 error when the submitted data does not meet the server's validation rules (e.g., invalid email format).
from flask import Flask, jsonify, request

app = Flask(__name__)


# A simple validation function to check for a valid email format
def is_valid_email(email):
    return isinstance(email, str) and "@" in email and "." in email


@app.route("/submit", methods=["POST"])
def submit():
    # get_json() answers a malformed JSON body with 400 Bad Request on its own
    data = request.get_json()
    # Guard against JSON bodies that are not objects (e.g. a list or null)
    email = data.get("email") if isinstance(data, dict) else None
    # Check if email is provided and valid
    if not is_valid_email(email):
        # Unprocessable Entity: missing or invalid email
        return jsonify({"error": "Invalid email format."}), 422
    # Otherwise, process the request
    return jsonify({"message": "Data submitted successfully."}), 201


if __name__ == "__main__":
    app.run(debug=True)
In the example above, we simulate a /submit endpoint that accepts POST requests containing JSON data. The server expects a valid email address in the request. If the email is missing or does not meet the simple validation check (containing "@" and "."), the server returns a 422 error, indicating the request is well-formed but semantically incorrect (i.e., invalid email). If the email is valid, the server processes the request and returns a success message.
We can test this server with an HTTP client, using either Python (httpx) or cURL:
import httpx
# Test successful submission with a valid email
response = httpx.post("http://127.0.0.1:5000/submit", json={"email": "valid@example.com"})
print(f"Successful Submission: {response.status_code}, {response.json()}")
# Test failed submission with an invalid email
response = httpx.post("http://127.0.0.1:5000/submit", json={"email": "invalid-email"})
print(f"Failed Submission: {response.status_code}, {response.json()}")
# Test successful submission with a valid email
curl -X POST http://127.0.0.1:5000/submit -H "Content-Type: application/json" -d '{"email": "valid@example.com"}'
# Test failed submission with an invalid email
curl -X POST http://127.0.0.1:5000/submit -H "Content-Type: application/json" -d '{"email": "invalid-email"}'
422 in Web Scraping
In web scraping, the 422 HTTP status code usually appears when the POST or PUT body is generated incorrectly. To avoid this error, make sure the posted data is valid for its format, whether that is JSON, HTML form data, or XML.
Furthermore, because a scraper cannot see how the server reads the received data, the exact cause can be difficult to debug. Browser Developer Tools can help here: inspect exactly how the website formats its requests, including details such as symbol escaping and indentation, all of which can play a part in data processing. Replicating this behavior exactly decreases the chance of encountering HTTP status 422 while scraping.
The 422 error could also mean that the server is blocking your requests, deliberately returning a 422 status code to signal that you are not allowed to access the resource. If you receive this status code on a GET request, that could be a sign of blocking.
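As a rough illustration of how much serialization details can differ, the sketch below encodes the same payload three ways using only the Python standard library. The payload is hypothetical, and the "browser-like" variant only approximates what JavaScript's JSON.stringify produces for simple objects; always compare against the request body actually shown in Developer Tools.

```python
import json
from urllib.parse import urlencode

# Hypothetical payload, for illustration only.
payload = {"query": "café & bar", "page": 1}

# Python's default json.dumps adds spaces after separators and escapes non-ASCII.
default_body = json.dumps(payload)

# Compact output with non-ASCII characters kept as-is, closer to JSON.stringify.
browser_like_body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)

# Form posts are percent-encoded, with spaces sent as "+".
form_body = urlencode(payload)

print(default_body)
print(browser_like_body)
print(form_body)
```

A pre-serialized string like this can then be sent as the raw request body with an explicit Content-Type header (for example through the `content` argument of `httpx.post`), so the client library does not re-encode it.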
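This rule of thumb can be written down as a small check for a scraper's logging or retry logic. It is a heuristic sketch only: `is_suspected_block` and `BODYLESS_METHODS` are hypothetical names, and a match is a hint to investigate, not proof of blocking, since some frameworks also answer invalid query parameters with a 422.

```python
# Methods that normally carry no request body for the server to validate.
BODYLESS_METHODS = {"GET", "HEAD"}


def is_suspected_block(method, status_code):
    """Heuristic only: a 422 on a body-less request hints at blocking."""
    return status_code == 422 and method.upper() in BODYLESS_METHODS
```

For instance, `is_suspected_block("GET", 422)` flags the response, while a 422 on a POST is left for ordinary payload debugging.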
Summary
HTTP 422 errors typically result from submitting well-formed but invalid data, often in POST requests. While it's unlikely that 422 errors are used to block scrapers, it’s always best to test with rotating proxies if the issue persists. Using Scrapfly’s advanced tools, you can bypass these potential blocks and ensure your tasks continue without disruption.