Is it possible to select preceding siblings using CSS selectors?

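Traditional CSS selectors cannot select preceding siblings: the sibling combinators + and ~ only match elements that follow the reference element. Where the Selectors Level 4 :has() pseudo-class is supported, a selector such as p:has(~ h2) can approximate it, but support varies across scraping libraries.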

However, depending on your scraping stack there are several different ways to achieve this:

  1. Use XPath with the preceding-sibling axis, e.g. (//h2)[2]/preceding-sibling::*

  2. Use BeautifulSoup and Python to select the preceding siblings:

from bs4 import BeautifulSoup

html = """
<div>
  <h2>Heading 1</h2>
  <p>Paragraph 1</p>
  <p>Paragraph 2</p>
  <h2>Heading 2</h2>
  <p>Paragraph 3</p>
  <p>Paragraph 4</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

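# Find the reference element (the second <h2>):
second_h2_element = soup.find_all("h2")[1]
# The .previous_siblings property also yields the whitespace text nodes
# between tags, so use find_previous_siblings() to get only the elements
# (returned nearest-first, i.e. in reverse document order):
preceding_siblings = second_h2_element.find_previous_siblings()
for sibling in preceding_siblings:
    print(sibling.text)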

  3. Use Cheerio and JavaScript to select the preceding siblings:

const cheerio = require("cheerio");

const html = `
<div>
<h2>Heading 1</h2>
<p>Paragraph 1</p>
<p>Paragraph 2</p>
<h2>Heading 2</h2>
<p>Paragraph 3</p>
<p>Paragraph 4</p>
</div>
`;

const $ = cheerio.load(html);

// Get the second h2 element
const second_h2_element = $("h2").eq(1);

// Select the preceding siblings of the h2 element (nearest sibling first)
const preceding_siblings = second_h2_element.prevAll();

// Loop over the preceding siblings and print their text content
preceding_siblings.each(function() {
  console.log($(this).text());
});
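
The XPath option (1) is listed above without an example. Here is a minimal sketch in Python, assuming the third-party lxml package is installed. lxml is not named in the original answer, and other XPath 1.0 engines should accept the same expression. The variable names are illustrative only.

```python
from lxml import html as lxml_html

HTML = """
<div>
  <h2>Heading 1</h2>
  <p>Paragraph 1</p>
  <p>Paragraph 2</p>
  <h2>Heading 2</h2>
  <p>Paragraph 3</p>
  <p>Paragraph 4</p>
</div>
"""

tree = lxml_html.fromstring(HTML)

# (//h2)[2] picks the second <h2> in the document;
# preceding-sibling::* then walks back over its element siblings only
# (text nodes are not matched by the * node test).
siblings = tree.xpath("(//h2)[2]/preceding-sibling::*")

texts = [element.text_content() for element in siblings]
for text in texts:
    print(text)
```

To restrict the result to a single tag, replace the * node test with a tag name, for example preceding-sibling::p to keep only the paragraphs.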