How to find HTML elements by text with Cheerio and NodeJS?

Using NodeJS' Cheerio we can find any HTML element by partial or exact text value using the :contains() pseudo selector:

const cheerio = require('cheerio');

const $ = cheerio.load(`
    <a>ignore</a>
<a href="http://example.com">link</a>
<a>ignore</a>
`);
console.log(
    $('a:contains("link")').text()
);
"link"

This selector is case sensitive so it might be dangerous to use in web scraping. Instead, it's advised to filter values by text:

const cheerio = require('cheerio');

const $ = cheerio.load(`
    <a>ignore</a>
<a href="http://example.com">Link</a>
<a>ignore</a>
`);

console.log(
    $('a').filter(
        (i, element) => { return $(element).text().toLowerCase().includes("link")}
    ).text()
);
"link"
Question tagged: NodeJS

Related Posts

How to Use Chrome Extensions with Playwright, Puppeteer and Selenium

In this article, we'll explore different useful Chrome extensions for web scraping. We'll also explain how to install Chrome extensions with various headless browser libraries, such as Selenium, Playwright and Puppeteer.

How to Scrape Sitemaps to Discover Scraping Targets

Usually to find scrape targets we look at site search or category pages but there's a better way - sitemaps! In this tutorial, we'll be taking a look at how to find and scrape sitemaps for target locations.

Web Scraping With Node-Unblocker

Tutorial on using Node-Unblocker - a nodejs library - to avoid blocking while web scraping and using it to optimize web scraping stacks.