The TooManyRedirects error can occur when using the Python requests module to scrape websites with incorrectly configured redirects:
import requests
requests.get("http://httpbin.dev/redirect/31") # default redirect limit is 30
# will raise:
# requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
# we can set max redirects using requests.Session:
session = requests.Session()
session.max_redirects = 2
session.get("http://httpbin.dev/redirect/3")  # raises TooManyRedirects: Exceeded 2 redirects.
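If the scraper should keep running when a page exceeds the redirect limit, the exception can be caught. Below is a minimal sketch: the helper name fetch is our own and not part of requests, and it assumes that skipping the page (returning None) is acceptable:

```python
import requests


def fetch(session, url):
    """Return the response, or None if the redirect limit was exceeded."""
    try:
        return session.get(url)
    except requests.exceptions.TooManyRedirects as exc:
        # exc.response holds the last redirect response received (it may be None)
        last = exc.response
        stuck_at = last.url if last is not None else url
        print(f"{exc} Last url reached: {stuck_at}")
        return None
```

For example, with session.max_redirects = 2 as set above, fetch(session, "http://httpbin.dev/redirect/3") should return None instead of raising.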
When web scraping, this usually means one of 3 things:
To handle the TooManyRedirects exception, we can disable automatic redirects and handle them manually:
import requests
from urllib.parse import urljoin

session = requests.Session()
response = session.get("http://httpbin.dev/redirect/3", allow_redirects=False)
# the Location header may be a relative url, so resolve it against the current url:
redirect_url = urljoin(response.url, response.headers["Location"])
# now we can manually inspect and fix the redirect url if necessary and then follow it:
response2 = session.get(redirect_url, allow_redirects=False)
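The manual approach can be extended into a small loop that follows each hop itself, stops at a limit and detects redirect loops. This is only a sketch under some assumptions: follow_redirects, RedirectError and max_hops are names made up for this example (not part of requests), every hop is requested with GET, and the session only needs a requests-style get(url, allow_redirects=False) method:

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


class RedirectError(Exception):
    """Hypothetical error type for this sketch (not part of requests)."""


def follow_redirects(session, url, max_hops=10):
    """Follow redirects one hop at a time.

    Returns (final_response, visited_urls). Raises RedirectError on a
    redirect loop or when more than max_hops redirects would be needed.
    """
    visited = []
    for _ in range(max_hops + 1):
        if url in visited:
            raise RedirectError(f"redirect loop detected at {url}")
        visited.append(url)
        response = session.get(url, allow_redirects=False)
        location = response.headers.get("Location")
        if response.status_code not in REDIRECT_CODES or not location:
            return response, visited
        # Location may be relative, so resolve it against the current url;
        # this is also the place to inspect or fix the redirect url
        url = urljoin(url, location)
    raise RedirectError(f"gave up after {max_hops} redirects")
```

With requests, it could be called as follow_redirects(requests.Session(), "http://httpbin.dev/redirect/3"). The returned list of visited urls makes it easier to see where a misconfigured redirect chain goes wrong.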