How to select all elements between two elements in XPath?

To select all elements that sit between two known elements in XPath, we have several options. Let's look at two examples:

  1. If we can identify an anchor element, we can restrict our selection using preceding-sibling or following-sibling:

     <article>
       <p>ignore</p>
       <p>ignore</p>
       <h2>anchor</h2>
       <p>select</p>
       <p>select</p>
       <p>select</p>
       <h2>title2</h2>
       <p>ignore</p>
       <p>ignore</p>
     </article>

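For this document, the XPath //p[preceding-sibling::h2[1][text()="anchor"]] selects all <p> elements whose nearest preceding <h2> sibling has the text "anchor". The <p> elements after the second <h2> are skipped because their nearest preceding <h2> is "title2".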

  2. If we know the count of unique preceding or following elements, we can use count():

     <article>
       <p>ignore</p>
       <p>ignore</p>
       <h2>anchor</h2>
       <p>select</p>
       <p>select</p>
       <p>select</p>
       <h2>title2</h2>
       <p>ignore</p>
       <p>ignore</p>
     </article>

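Here, the XPath //p[count(preceding-sibling::h2)=1] selects all <p> elements that have exactly one preceding <h2> sibling. Element counting is less reliable than using an anchor element, because the selection breaks as soon as another <h2> is added earlier in the document, but it is often much easier to implement.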
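The counting approach can be tried out in the same way. The sketch below again assumes the third-party lxml library; the variable names are illustrative only.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

HTML = """
<article>
  <p>ignore</p>
  <p>ignore</p>
  <h2>anchor</h2>
  <p>select</p>
  <p>select</p>
  <p>select</p>
  <h2>title2</h2>
  <p>ignore</p>
  <p>ignore</p>
</article>
"""

tree = etree.fromstring(HTML.strip())

# <p> elements with exactly one <h2> among their preceding siblings.
# The first two <p> have zero and the last two have two, so neither pair matches.
one_h2_before = tree.xpath("//p[count(preceding-sibling::h2)=1]")

counted_texts = [p.text for p in one_h2_before]
print(counted_texts)
```

Note that this expression never looks at the heading text, so it depends entirely on the document keeping the same number of <h2> elements before the target block.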

XPath provides a lot of flexibility in selecting elements, as we can navigate the HTML tree in every direction and match elements by any attribute.
For more on XPath, see our XPath introduction tutorial.

