XPath Knowledgebase

XPath is a powerful language for navigating and querying XML and HTML documents. It allows developers to select nodes in a document using a concise special syntax, making it an essential tool for web scraping and data extraction.

Unlike CSS selectors, XPath provides full HTML tree navigation: an expression can move in any direction, including up to parents and across to siblings. This enables advanced queries and data extraction techniques, such as selecting nodes based on their position in the document, their attributes, or even their text content.

However, XPath can be more complex and difficult to learn, especially since most developers rarely encounter it outside of web scraping and data extraction. But once you get the hang of it, XPath is a very powerful tool for extracting data from HTML documents.

Parsing HTML with XPath

An introduction to XPath in the context of web scraping: how to extract data from HTML documents using XPath, best practices, and available tools.

See below for more on XPath in the context of web scraping and data programming 👇

How to select elements by attribute value in XPath?

To select HTML elements by attribute value, the @ syntax can be used together with the = operator or the contains() function. Here's how.
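A minimal sketch of both forms in Python, assuming the third-party lxml library is installed. The HTML snippet and its class names are made up for illustration.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = """
<div>
  <a class="nav" href="/about">About</a>
  <a class="product" href="/product/1">First</a>
  <a class="product featured" href="/product/2">Second</a>
</div>
"""

tree = html.fromstring(DOC)

# Exact match: the whole attribute value must equal "product".
exact = tree.xpath('//a[@class="product"]/text()')

# Substring match: any class value that contains "product".
partial = tree.xpath('//a[contains(@class, "product")]/text()')

print(exact)    # ['First']
print(partial)  # ['First', 'Second']
```

Note that = compares the entire attribute value, so the link with class "product featured" is only found by the contains() form.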

#xpath

How to count selections in XPath and why?

To count the number of elements selected by an XPath selector, the count() function can be used. Here's how to do it and why it's useful.
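As a rough illustration, the sketch below uses count() in two ways: as a standalone expression and inside a predicate. It assumes the third-party lxml library; the list markup and ids are invented for the example.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = """
<div>
  <ul id="long"><li>one</li><li>two</li><li>three</li></ul>
  <ul id="short"><li>solo</li></ul>
</div>
"""

tree = html.fromstring(DOC)

# count() as the whole expression returns a number (a float in lxml).
total_items = tree.xpath('count(//li)')

# count() inside a predicate filters parents by how many children they have.
long_lists = tree.xpath('//ul[count(li) >= 3]/@id')

print(total_items)  # 4.0
print(long_lists)   # ['long']
```

The predicate form is often the more useful one when scraping, since it can pick out the table or list with the expected number of rows.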

#xpath

How to get the name of an HTML element in XPath?

To find the name of a selected HTML element with XPath, the name() function can be used. Here's how and why this is useful.
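One possible way to try this out in Python, assuming the third-party lxml library. The snippet and its id value are hypothetical.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = '<div><h1 id="title">Hello</h1><p>Some text</p></div>'

tree = html.fromstring(DOC)

# name() with a node-set argument returns the name of its first node.
title_tag = tree.xpath('name(//*[@id="title"])')

# name() without arguments returns the name of the context node,
# so it can be evaluated per element.
child_tags = [el.xpath('name()') for el in tree.xpath('//div/*')]

print(title_tag)   # h1
print(child_tags)  # ['h1', 'p']
```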

#xpath

How to join values using XPath concat?

To join values in XPath, the concat() function can be used to combine several strings into one. Here's how.
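A short sketch of concat() joining two selected values with a literal separator, assuming the third-party lxml library. The span classes are made up for illustration.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = '<div><span class="first">Ada</span><span class="last">Lovelace</span></div>'

tree = html.fromstring(DOC)

# Each node-set argument is converted to the text of its first node,
# then all arguments are joined into a single string.
full_name = tree.xpath(
    'concat(//span[@class="first"], " ", //span[@class="last"])'
)

print(full_name)  # Ada Lovelace
```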

#xpath

How to reverse expressions in XPath?

To reverse (negate) expressions and predicates in XPath, the not() function can be used. Here's how and why it's so useful.
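The sketch below shows not() wrapping two kinds of predicates, assuming the third-party lxml library. The list items and the "ad" class are hypothetical.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = """
<ul>
  <li class="item">A</li>
  <li class="item ad">B</li>
  <li>C</li>
</ul>
"""

tree = html.fromstring(DOC)

# Keep items whose class does NOT contain "ad" (items without a class pass too).
without_ads = tree.xpath('//li[not(contains(@class, "ad"))]/text()')

# Keep items that have no class attribute at all.
no_class = tree.xpath('//li[not(@class)]/text()')

print(without_ads)  # ['A', 'C']
print(no_class)     # ['C']
```

Negation like this is handy for filtering out unwanted nodes, such as ads or hidden elements, before extracting data.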

#xpath

How to select element with one of many names in XPath?

To select an element whose name matches one of several names, the name() function can be combined with the or operator. Here's how.
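As an illustration, the expression can be assembled from a Python list of tag names. This is only a sketch, assuming the third-party lxml library; the `wanted` list and the markup are invented for the example.

```python
from lxml import html

# Hypothetical markup used only for this example.
DOC = "<div><h1>Title</h1><p>Body</p><h2>Subtitle</h2><h3>Detail</h3></div>"

tree = html.fromstring(DOC)

# Build: //*[name()="h1" or name()="h2"]
wanted = ["h1", "h2"]
condition = " or ".join(f'name()="{tag}"' for tag in wanted)
expression = f"//*[{condition}]"

headings = tree.xpath(expression + "/text()")

print(expression)  # //*[name()="h1" or name()="h2"]
print(headings)    # ['Title', 'Subtitle']
```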

#xpath

How to select elements by ID in XPath?

To select elements by their ID attribute in XPath, we can match it directly in a predicate using the = operator or the contains() function. Here's how.
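The exact-match form can be tried with nothing but Python's standard library, as in the sketch below. Keep in mind the assumptions: xml.etree.ElementTree supports only a limited subset of XPath (it has no contains() function) and needs well-formed markup, so the snippet here is deliberately tidy and its ids are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup used only for this example.
DOC = '<html><body><div id="main">Main</div><div id="side">Side</div></body></html>'

root = ET.fromstring(DOC)

# Match any element whose id attribute equals "main".
matches = root.findall('.//*[@id="main"]')
texts = [el.text for el in matches]

print(texts)  # ['Main']
```

For the contains() variant, or for messy real-world HTML, a full XPath 1.0 engine such as the one in lxml would be needed instead.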

#xpath

How to select any element using wildcard in XPath?

To select any element, the wildcard "*" node test can be used. It matches HTML elements of any name within the current context.
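A small standard-library sketch of the wildcard at two depths. It assumes well-formed markup, since xml.etree.ElementTree is an XML parser with limited XPath support; the snippet itself is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup used only for this example.
DOC = "<div><h1>Title</h1><p>Some <b>bold</b> text</p></div>"

root = ET.fromstring(DOC)

# "./*" matches every direct child of the context node, whatever its name.
children = [el.tag for el in root.findall("./*")]

# ".//*" matches every descendant at any depth.
descendants = [el.tag for el in root.findall(".//*")]

print(children)     # ['h1', 'p']
print(descendants)  # ['h1', 'p', 'b']
```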

#xpath