How to use XPath selectors in NodeJS when web scraping?

by scrapecrow Oct 31, 2022

CSS selectors are far more common in the NodeJS and Javascript ecosystems, but web scraping sometimes calls for the more powerful features of XPath selectors.
There are a few options available for XPath selectors. The most popular one in web scraping is the osmosis library:

const osmosis = require("osmosis");

const html = `
<a href="http://scrapfly.io/">link 1</a>
<a href="http://scrapfly.blog/">link 2</a>
`
osmosis
    .parse(html)
    .find('//a/@href')
    .log(console.log);

Another alternative is the xpath library combined with the @xmldom/xmldom DOM parser:

import xpath from 'xpath';
import { DOMParser } from '@xmldom/xmldom'

// the markup needs a single root element to parse as well-formed XML
const tree = new DOMParser().parseFromString(`
<div>
    <h1>Page title</h1>
    <p>some paragraph</p>
    <a href="http://scrapfly.io/blog">some link</a>
</div>
`, 'text/xml');

console.log({
    // we can extract the text of a node, which returns a `Text` object:
    title: xpath.select('//h1/text()', tree)[0].data,
    // or a specific attribute value, which returns an `Attr` object:
    url: xpath.select('//a/@href', tree)[0].value,
});
