How to find HTML elements by text with Cheerio and NodeJS?

With Cheerio in NodeJS, we can find HTML elements by their text using the :contains() pseudo selector, which matches any element whose text contains the given string (so both partial and exact values match):

const cheerio = require('cheerio');

const $ = cheerio.load(`
    <a>ignore</a>
    <a href="http://example.com">link</a>
    <a>ignore</a>
`);
console.log(
    $('a:contains("link")').text()
);
"link"

This selector is case sensitive, so in web scraping it can miss elements whose text differs only in letter case. Instead, it's safer to filter elements by their lowercased text:

const cheerio = require('cheerio');

const $ = cheerio.load(`
    <a>ignore</a>
    <a href="http://example.com">Link</a>
    <a>ignore</a>
`);

console.log(
    $('a').filter(
        // keep only elements whose lowercased text contains "link"
        (i, element) => $(element).text().toLowerCase().includes("link")
    ).text()
);
"Link"

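The filter approach can also separate exact matches from partial ones, which :contains() cannot do on its own. Below is a minimal sketch of that idea. The helper names normalizeText, textMatches and findByText are hypothetical and not part of Cheerio, and $ is assumed to be the function returned by cheerio.load():

```javascript
// Hypothetical helpers for illustration; none of these names come from Cheerio.

// Collapse whitespace runs and trim, so "  Link\n" compares equal to "Link".
function normalizeText(text) {
    return text.replace(/\s+/g, ' ').trim();
}

// Case-insensitive text comparison.
// exact = true requires the whole text to match; otherwise a substring is enough.
function textMatches(text, query, exact = false) {
    const haystack = normalizeText(text).toLowerCase();
    const needle = normalizeText(query).toLowerCase();
    return exact ? haystack === needle : haystack.includes(needle);
}

// Filter a Cheerio selection by text.
// `$` is assumed to be the function returned by cheerio.load().
function findByText($, selector, query, exact = false) {
    return $(selector).filter(
        (i, element) => textMatches($(element).text(), query, exact)
    );
}

// Usage, assuming cheerio is installed:
// const cheerio = require('cheerio');
// const $ = cheerio.load('<a>Link</a><a>Links page</a>');
// findByText($, 'a', 'link');        // partial, case-insensitive match
// findByText($, 'a', 'link', true);  // exact, case-insensitive match
```

Normalizing whitespace before comparing is a design choice made here because scraped text often carries stray newlines and indentation; drop it if whitespace is significant for your target page.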