Python's requests package supports both HTTP and SOCKS5 proxies, which can be set per request or for the whole script:
import requests
# proxy pattern is:
# scheme://username:password@IP:PORT
# For example:
# no auth HTTP proxy:
my_proxy = "http://160.11.12.13:1020"
# or SOCKS5 (needs the optional extra: pip install "requests[socks]")
my_proxy = "socks5://160.11.12.13:1020"
# proxy with authentication
my_proxy = "http://my_username:my_password@160.11.12.13:1020"
# note that the username and password should be URL-quoted if they contain URL-sensitive characters like "@":
from urllib.parse import quote
my_proxy = f"http://{quote('foo@bar.com')}:{quote('password@123')}@160.11.12.13:1020"
proxies = {
# this proxy will be applied to all http:// urls
'http': 'http://160.11.12.13:1020',
# this proxy will be applied to all https:// urls (note the S)
'https': 'http://160.11.12.13:1020',
# we can also use a proxy only for a specific scheme and host
'https://httpbin.dev': 'http://160.11.12.13:1020',
}
requests.get("https://httpbin.dev/ip", proxies=proxies)
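To apply a proxy to the whole script rather than to a single call, one option is a `requests.Session` with its `proxies` attribute set. The following is a minimal sketch that reuses the placeholder address from above; the `PROXY_URL` name is only illustrative:

```python
import requests

# Placeholder proxy address reused from the examples above
PROXY_URL = "http://160.11.12.13:1020"

session = requests.Session()
# every request sent through this session will use these proxies by default
session.proxies.update({"http": PROXY_URL, "https": PROXY_URL})

# Example call (commented out because the placeholder proxy is not reachable):
# session.get("https://httpbin.dev/ip")
```

Note that, according to the requests documentation, proxies found in the environment can override `session.proxies`, so passing `proxies=` on individual calls is the more predictable choice when both are in play.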
Note that proxies can also be set through the standard *_PROXY environment variables:
$ export HTTP_PROXY="http://160.11.12.13:1020"
$ export HTTPS_PROXY="http://160.11.12.13:1020"
$ export ALL_PROXY="socks5://160.11.12.13:1020"
$ python
import requests
# this will use the proxies we set
requests.get("https://httpbin.dev/ip")
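To check which environment proxies requests would actually pick up, or to opt out of them, something like the sketch below may help. It uses `requests.utils.get_environ_proxies` and the session's `trust_env` flag; setting the variable from inside Python is done here only for demonstration:

```python
import os
import requests
from requests.utils import get_environ_proxies

# Set the variable from inside Python for demonstration only;
# normally it would be exported in the shell as shown above.
os.environ["HTTPS_PROXY"] = "http://160.11.12.13:1020"

# Inspect which proxies requests would read from the environment for this URL
detected = get_environ_proxies("https://httpbin.dev/ip")
print(detected)

# A session with trust_env disabled ignores the *_PROXY variables
session = requests.Session()
session.trust_env = False
```

Disabling `trust_env` also turns off other environment-derived settings such as `.netrc` authentication, so it is a fairly blunt switch.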
Finally, when web scraping with proxies, we should rotate them so that each request goes out through a different proxy. See our guide on how to rotate proxies for more, and our introduction to proxies in web scraping for more on proxies in general.
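As a rough illustration of rotation, the sketch below picks a random proxy from a small pool for every request. The pool addresses and the `pick_proxies` and `get_with_random_proxy` helper names are hypothetical, and a real rotator would usually also track failures and retire bad proxies:

```python
import random
import requests

# Hypothetical proxy pool; replace with real proxy addresses
PROXY_POOL = [
    "http://160.11.12.13:1020",
    "http://160.11.12.14:1020",
    "http://160.11.12.15:1020",
]


def pick_proxies(pool=PROXY_POOL):
    """Choose one proxy at random and use it for both URL schemes."""
    proxy = random.choice(pool)
    return {"http": proxy, "https": proxy}


def get_with_random_proxy(url, **kwargs):
    """Send a GET request through a randomly selected proxy from the pool."""
    return requests.get(url, proxies=pick_proxies(), **kwargs)
```

Calling `get_with_random_proxy("https://httpbin.dev/ip")` repeatedly would then spread requests across the pool, assuming the listed proxies are reachable.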
This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.