How to use CSS selectors in NodeJS when web scraping?

To parse scraped HTML content in NodeJS using CSS selectors, we recommend the Cheerio library:

const cheerio = require('cheerio');

const $ = cheerio.load(`
<body>
    <h1>Page title</h1>
    <p>some paragraph</p>
    <a href="http://scrapfly.io/blog">some link</a>
</body>
`);

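// select the first heading and read its text
$('h1').text();
// "Page title"

// read an attribute with .attr() (Cheerio has no .attribute() method)
$('a').attr("href");
// "http://scrapfly.io/blog"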

Another popular library is Osmosis, which supports HTML parsing through both CSS and XPath selectors:

const osmosis = require("osmosis");

const html = `
<a class="link" href="http://scrapfly.io/">link 1</a>    
<a class="link" href="http://scrapfly.blog/">link 2</a>
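`;
osmosis
    .parse(html)
    .find('a.link')
    // store each matched link's href under the "url" key
    .set({ url: '@href' })
    // receive every extracted object
    .data(console.log)
    .log(console.log);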
