How to Know What Anti-Bot Service a Website is Using?
In this article we'll take a look at two popular tools: WhatWaf and Wafw00f which can identify what WAF service is used.
Response status code 403 is a denial of content status code which means the client is forbidden from seeing this content.
In web scraping, this can be caused by invalid or missing HTTP request headers like X-Requested-With, X-CSRF-Token, Origin or even Referer. It's important to match the values and header ordering as seen on the website.
Alternatively, the scraper could be identified as a web scraper and 403 can mean the scraper is simply being blocked.
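As a rough illustration of matching these headers, here is a minimal Python sketch using only the standard library. The function names `build_headers` and `fetch_status`, the User-Agent string and the header values are assumptions made for this example; the real values should be copied from the requests seen in the browser's developer tools for the target website.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def build_headers(page_url, csrf_token=None):
    """Build headers resembling a browser's background request made from page_url.

    All values here are illustrative placeholders, not values any site requires.
    """
    parts = urlsplit(page_url)
    headers = {
        # Placeholder browser User-Agent; copy the real one from your browser.
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Accept": "application/json",
        "X-Requested-With": "XMLHttpRequest",
        # Origin is the scheme and host of the page that triggers the request.
        "Origin": f"{parts.scheme}://{parts.netloc}",
        # Referer is the full URL of that page.
        "Referer": page_url,
    }
    if csrf_token:
        # The token usually has to be extracted from the page or a cookie first.
        headers["X-CSRF-Token"] = csrf_token
    return headers


def fetch_status(api_url, headers):
    """Send a GET request and return the HTTP status code, including 403."""
    request = urllib.request.Request(api_url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is in error.code.
        return error.code
```

Note that `urllib` normalizes header name casing when sending, so a site that checks exact casing or ordering may need an HTTP client that offers more control over how headers are sent.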
To prevent scrapers from being identified and blocked, see our complete guide on how to scrape without being blocked.
Repeated 403 status codes can lead to a complete scraper block, so these errors should be addressed ASAP.
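One way to react early to repeated 403 responses is to count them and pause the scraper once a limit is reached. The sketch below is only an illustration: the `ForbiddenTracker` class and the default threshold of 3 are assumptions made for this example, not a recommended value.

```python
class ForbiddenTracker:
    """Counts consecutive 403 responses so a scraper can stop before a full block."""

    def __init__(self, max_consecutive=3):
        # Hypothetical threshold; tune it to the target website.
        self.max_consecutive = max_consecutive
        self.consecutive = 0

    def record(self, status_code):
        """Record a response status; return True if scraping should pause."""
        if status_code == 403:
            self.consecutive += 1
        else:
            # Any non-403 response resets the streak.
            self.consecutive = 0
        return self.consecutive >= self.max_consecutive
```

A scraper loop would call `record()` after every response and stop sending requests once it returns `True`, leaving time to review headers or other scraper settings.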
This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.