Browser dialog pop-ups in Puppeteer, such as the confirmation dialog shown on the web-scraping.dev cart page, can be handled with the dialog event. We attach a handler using the page.on("dialog", handler) method, check the dialog message, and then accept ("Yes") or dismiss ("No") the dialog:
const puppeteer = require('puppeteer');

async function run() {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  // set up a dialog event handler
  page.on('dialog', async dialog => {
    console.log(dialog.message());
    if (dialog.message().includes('clear your cart')) {
      console.log(`clicking "Yes" to ${dialog.message()}`);
      await dialog.accept(); // press 'Yes'
    } else {
      await dialog.dismiss(); // press 'No'
    }
  });

  // add something to cart
  await page.goto('https://web-scraping.dev/product/1');
  await page.click('.add-to-cart');

  // try clearing cart which raises a dialog that says "are you sure you want to clear your cart?"
  await page.goto('https://web-scraping.dev/cart');
  await page.waitForSelector('.cart-full .cart-item');
  await page.click('.cart-full .cart-clear');

  // check the cart
  const cartItems = await page.$('.cart-item .cart-title');
  console.log(`items in cart: ${cartItems ? 1 : 0}`); // Should print 0 if no items in cart.

  await browser.close();
}

run();
In the example above, we attach a dialog handler to our page object. The handler checks whether the dialog message contains the text "clear your cart" and, if so, accepts the dialog (clicks "Yes") to clear the cart. Otherwise, it dismisses the dialog (clicks "No").
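When a scraper meets several different dialogs, the accept/dismiss decision can be pulled out of the handler so it is easier to reuse and to check without launching a browser. The sketch below is one possible way to do that, not part of Puppeteer itself: shouldAccept and makeDialogHandler are hypothetical helper names, and the only dialog methods assumed are message(), accept() and dismiss(), as used in the example above.

```javascript
// Hypothetical helper (not a Puppeteer API): returns true when the dialog
// text contains any of the given phrases, ignoring letter case.
function shouldAccept(message, acceptPhrases) {
  const text = message.toLowerCase();
  return acceptPhrases.some(phrase => text.includes(phrase.toLowerCase()));
}

// Hypothetical factory (not a Puppeteer API): builds a handler that can be
// passed to page.on('dialog', ...). It only calls dialog.message(),
// dialog.accept() and dialog.dismiss().
function makeDialogHandler(acceptPhrases) {
  return async dialog => {
    if (shouldAccept(dialog.message(), acceptPhrases)) {
      await dialog.accept(); // press 'Yes'
    } else {
      await dialog.dismiss(); // press 'No'
    }
  };
}
```

With these helpers, the handler from the example above could be registered as page.on('dialog', makeDialogHandler(['clear your cart'])). Any dialog whose message matches none of the listed phrases is dismissed.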