Selenium has no request interception functionality out of the box, but we can enable it using the selenium-wire extension.
Capturing background requests can be an important step of a web scraping process, and this is an area where Selenium is lacking, so let's take a look at how to use the selenium-wire extension to implement this vital feature.
To start, selenium-wire can be installed using the pip install selenium-wire command. After that, all requests made by the browser are captured automatically and stored in the driver.requests attribute:
from seleniumwire import webdriver # Import from seleniumwire
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.get('https://web-scraping.dev/product/1')
# wait for element to appear and click it to trigger background requests
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'load-more-reviews')))
element.click()
# Access requests via the `requests` attribute
for request in driver.requests:
    if request.response:
        print(
            request.url,
            request.response.status_code,
            request.response.headers['Content-Type'],
            request.response.body,
        )
driver.quit()
These background requests often contain important dynamic data, and capturing them is an easy way to scrape it. For more, see our web scraping background requests guide.
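To pull the dynamic data out of the captured traffic, it usually helps to keep only the JSON responses and parse their bodies. The following is a minimal sketch of that idea, not part of selenium-wire itself: the helper name json_payloads, its url_part parameter, and the example.com URLs are hypothetical. It assumes each captured request exposes url and response (with headers and body as raw bytes), as in the loop above, and it handles only gzip-compressed bodies. Stand-in objects are used so the sketch runs without a browser.

```python
import gzip
import json
from types import SimpleNamespace


def json_payloads(requests, url_part=""):
    """Yield (url, parsed_json) for captured requests with a JSON response.

    `requests` is any iterable of objects shaped like the captured requests
    above: `.url`, plus `.response` with `.headers` and `.body` (bytes).
    `url_part` is a hypothetical filter: only URLs containing it are kept.
    """
    for request in requests:
        response = request.response
        # Requests that never received a response have `response` set to None.
        if response is None or url_part not in request.url:
            continue
        content_type = response.headers.get("Content-Type") or ""
        if "json" not in content_type:
            continue
        body = response.body
        # Assumption: only gzip is handled here; other encodings are not.
        if response.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
        try:
            data = json.loads(body)
        except ValueError:
            continue  # skip empty or malformed JSON bodies
        yield request.url, data


# Stand-ins for captured requests (hypothetical URLs and payloads).
fake_requests = [
    SimpleNamespace(
        url="https://example.com/page",
        response=SimpleNamespace(
            headers={"Content-Type": "text/html"}, body=b"<html></html>"
        ),
    ),
    SimpleNamespace(
        url="https://example.com/api/reviews",
        response=SimpleNamespace(
            headers={"Content-Type": "application/json", "Content-Encoding": "gzip"},
            body=gzip.compress(b'{"reviews": [1, 2]}'),
        ),
    ),
    SimpleNamespace(url="https://example.com/api/pending", response=None),
]

results = list(json_payloads(fake_requests, url_part="/api/"))
print(results)
```

With a real browser session, the same helper could be called as json_payloads(driver.requests) before driver.quit(). Filtering on the Content-Type header rather than on the URL keeps the sketch independent of any particular site's endpoint naming.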
This knowledgebase is provided by Scrapfly data APIs.