How to use XPath selectors in NodeJS when web scraping?

const osmosis = require("osmosis");

const html = `
<a href="http://scrapfly.io/">link 1</a>
<a href="http://scrapfly.blog/">link 2</a>
`;

osmosis
  .parse(html)
  .find('//a/@href')
  .log(console.log);

import xpath from 'xpath';
import { DOMParser } from '@xmldom/xmldom';

const tree = new DOMParser().parseFromString(`
<h1>Page title</h1>
<p>some paragraph</p>
<a href="http://scrapfly.io/blog">some link</a>
`);

console.log({
  // we can extract the text of a node, which returns a `Text` object:
  title: xpath.select('//h1/text()', tree)[0].data,
  // or a specific attribute value, which returns an `Attr` object:
  url: xpath.select('//a/@href', tree)[0].value,
});
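Since an attribute query such as `//a/@href` returns `Attr` objects rather than strings, scraped links usually need a small post-processing step. The following is a minimal sketch of one way to do that, assuming each result exposes a `value` property as in the example above. The helper name `hrefsToAbsoluteUrls` and the sample data are hypothetical and not part of `xpath` or `@xmldom/xmldom`; only NodeJS's built-in global `URL` class is used.

```javascript
// Hypothetical helper (not part of xpath or @xmldom/xmldom): turns the Attr-like
// objects returned by a query such as `//a/@href` into absolute URL strings.
function hrefsToAbsoluteUrls(attrs, baseUrl) {
  const urls = [];
  for (const attr of attrs) {
    try {
      // the built-in URL class resolves relative links against baseUrl
      urls.push(new URL(attr.value, baseUrl).href);
    } catch {
      // skip values that cannot be parsed as a URL
    }
  }
  // de-duplicate while keeping first-seen order
  return [...new Set(urls)];
}

// Plain objects stand in for Attr nodes here; assumed shape: { value: string }
const sampleAttrs = [
  { value: "http://scrapfly.io/blog" },
  { value: "/docs" },
  { value: "/docs" },
];

const links = hrefsToAbsoluteUrls(sampleAttrs, "http://scrapfly.io/");
console.log(links);
```

In a real scraper, the array passed in would be the result of `xpath.select('//a/@href', tree)`, and `baseUrl` would be the address of the page being scraped.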

May 10, 2024

Provided by Scrapfly
