Python's httpx HTTP client package supports both HTTP and SOCKS5 proxies (SOCKS5 support requires installing the httpx[socks] extra). Here's how to use proxies with httpx:
import httpx
from urllib.parse import quote
# proxy pattern is:
# scheme://username:password@IP:PORT
# For example:
# no auth HTTP proxy:
my_proxy = "http://160.11.12.13:1020"
# or SOCKS5 (the proxy type is selected by the URL scheme):
my_proxy = "socks5://160.11.12.13:1020"
# proxy with authentication
my_proxy = "http://my_username:my_password@160.11.12.13:1020"
# note: the username and password should be URL-quoted if they contain URL-sensitive characters like "@":
my_proxy = f"http://{quote('foo@bar.com')}:{quote('password@123')}@160.11.12.13:1020"
proxies = {
    # this proxy will be applied to all http:// urls
    'http://': 'http://160.11.12.13:1020',
    # this proxy will be applied to all https:// urls (note the S)
    'https://': 'http://160.11.12.13:1020',
    # we can also use a proxy only for specific hosts
    'https://httpbin.dev': 'http://160.11.12.13:1020',
}
# note: newer httpx releases deprecate `proxies=` in favor of `proxy=` and `mounts=`
with httpx.Client(proxies=proxies) as client:
    r = client.get("https://httpbin.dev/ip")
# or async (`async with` must run inside a coroutine, e.g. via asyncio.run(main())):
async def main():
    async with httpx.AsyncClient(proxies=proxies) as client:
        r = await client.get("https://httpbin.dev/ip")
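The proxy URL pattern above (scheme://username:password@IP:PORT) can be assembled by a small helper so that credentials are always quoted. This is a minimal sketch, not part of httpx: `build_proxy_url` is a hypothetical name, and the accepted scheme list is an assumption limited to the proxy types this article mentions.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Assemble scheme://username:password@host:port, quoting credentials."""
    # Assumption: only the schemes discussed in this article are accepted.
    if scheme not in ("http", "https", "socks5"):
        raise ValueError(f"unsupported proxy scheme: {scheme!r}")
    auth = ""
    if username is not None:
        # safe="" also escapes "/", which quote() leaves untouched by default
        auth = quote(username, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"


print(build_proxy_url("160.11.12.13", 1020))
print(build_proxy_url("160.11.12.13", 1020, "foo@bar.com", "password@123"))
print(build_proxy_url("160.11.12.13", 1020, scheme="socks5"))
```

The returned string can be used anywhere a proxy URL is expected, for example as a value in the proxies dictionary shown earlier.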
Note that proxies can also be set through the standard *_PROXY environment variables:
$ export HTTP_PROXY="http://160.11.12.13:1020"
$ export HTTPS_PROXY="http://160.11.12.13:1020"
$ export ALL_PROXY="socks5://160.11.12.13:1020"
$ python
import httpx
# this will use the proxies we set
with httpx.Client() as client:
    r = client.get("https://httpbin.dev/ip")
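Because environment proxies are applied implicitly, it can help to check which *_PROXY variables are set before creating a client. The sketch below uses only the standard library. `proxy_env_vars` is a hypothetical helper, and the list of variable names is an assumption based on the commonly used ones.

```python
import os

# Assumption: the commonly used proxy variable names
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY")


def proxy_env_vars(environ=None):
    """Return the proxy-related variables present in `environ`."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        # both upper- and lower-case spellings are commonly honoured
        value = environ.get(name) or environ.get(name.lower())
        if value:
            found[name] = value
    return found


# Inspect an example environment; call proxy_env_vars() to inspect the real one
print(proxy_env_vars({"HTTP_PROXY": "http://160.11.12.13:1020", "PATH": "/usr/bin"}))
```

To make a client ignore these variables entirely, httpx accepts trust_env=False, as in httpx.Client(trust_env=False).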
When web scraping, it's best to rotate proxies for each request. For that see our article: How to Rotate Proxies in Web Scraping
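As a rough illustration of per-request rotation, the sketch below pairs each URL with the next proxy from a pool in round-robin order. The pool addresses, the URL list and the `assign_proxies` helper are hypothetical. Real rotation usually also needs to handle failed proxies and weighting, which this sketch leaves out.

```python
import itertools


def assign_proxies(urls, pool):
    """Pair every URL with a proxy, cycling through `pool` round-robin."""
    if not pool:
        raise ValueError("proxy pool must not be empty")
    # zip() stops when `urls` is exhausted, so the endless cycle is safe here
    return list(zip(urls, itertools.cycle(pool)))


# Hypothetical pool and targets, for illustration only
PROXY_POOL = [
    "http://160.11.12.13:1020",
    "http://160.11.12.14:1020",
]
URLS = ["https://httpbin.dev/ip"] * 3

for url, proxy in assign_proxies(URLS, PROXY_POOL):
    # With httpx the proxy is configured per client, so each request would
    # use its own short-lived client, e.g.:
    #   with httpx.Client(proxies={"all://": proxy}) as client:
    #       r = client.get(url)
    print(url, "via", proxy)
```

Round-robin keeps the load even across the pool. Swapping itertools.cycle for random.choice would give random rotation instead.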
This knowledgebase is provided by Scrapfly data APIs.