How to use CSS selectors in NodeJS when web scraping?

To parse scraped HTML in NodeJS using CSS selectors, we recommend the Cheerio library:

const cheerio = require('cheerio');

const $ = cheerio.load(`
<h1>Page title</h1>
<p>some paragraph</p>
<a href="http://scrapfly.io/blog">some link</a>
`);

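// text() returns the text content of the matched element
$('h1').text();
// "Page title"

// attr() reads an attribute value (Cheerio has no attribute() method)
$('a').attr("href");
// "http://scrapfly.io/blog"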

Another popular library is Osmosis, which supports HTML parsing through both CSS and XPath selectors:

const osmosis = require("osmosis");

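const html = `
<a class="link" href="http://scrapfly.io/">link 1</a>
<a class="link" href="http://scrapfly.blog/">link 2</a>
`;

osmosis
    .parse(html)
    .find('a.link')
    // store the text of each matched link under the "text" key
    .set('text')
    // receive each extracted data object
    .data(console.log)
    // print Osmosis' own progress messages
    .log(console.log);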
Question tagged: NodeJS, Data Parsing, CSS Selectors
