How to select elements by attribute using CSS selectors?

CSS selectors can match elements by the value of any attribute, such as class, id or href. This means we can extract any HTML element based on its attribute values.

While class and id have special notation shortcuts - . and # respectively - any attribute can be selected using the attribute selector ([attribute]) which supports multiple operators:

  1. The = operator matches on exact equality, e.g. [attr=match]. In this and the following examples the attribute is title and the match is product, i.e. [title=product]:
     <div>
       <div title="product">select</div>
       <div title="sold-product">ignore</div>
     </div>
  2. The ~= operator treats the attribute as a list of whitespace-separated values and checks whether any one of them equals the match (this is how the .class shortcut matches class names):
     <div>
       <div title="product">select</div>
       <div title="sold product new">select</div>
       <div title="sold-product">ignore</div>
     </div>
  3. The |= operator matches when the attribute is exactly the match, or starts with the match immediately followed by a hyphen (e.g. a -language suffix):
     <div>
       <div title="product">select</div>
       <div title="product-language-en">select</div>
       <div title="sold-product">ignore</div>
       <!-- the whole value must start with the match, so a hyphenated value in the middle does not match -->
       <div title="new product-language-en disabled">ignore</div>
     </div>
  4. The ^= operator checks whether the attribute starts with the match:
     <div>
       <div title="product1">select</div>
       <div title="product2">select</div>
       <div title="product3">select</div>
       <div title="4product">ignore</div>
     </div>
  5. The $= operator checks whether the attribute ends with the match:
     <div>
       <div title="1product">select</div>
       <div title="2product">select</div>
       <div title="3product">select</div>
       <div title="product4">ignore</div>
     </div>
  6. The *= operator checks whether the attribute contains the match anywhere as a substring:
     <div>
       <div title="1product-en">select</div>
       <div title="2product-de">select</div>
       <div title="prod-duct">ignore</div>
     </div>
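
The operator rules above can be sketched in plain Python to make the differences concrete. This is an illustrative re-implementation using only the standard library, not how a browser or any scraping library evaluates selectors. The names attr_matches, TitleCollector and select_titles are hypothetical helpers, and only the title attribute is inspected, to mirror the examples:

```python
import re
from html.parser import HTMLParser

# CSS whitespace: space, tab, line feed, carriage return, form feed.
CSS_WHITESPACE = re.compile(r"[ \t\n\r\f]+")

# One predicate per attribute selector operator (value, match) -> bool.
OPERATORS = {
    "=": lambda v, m: v == m,                            # exact equality
    "~=": lambda v, m: m in CSS_WHITESPACE.split(v),     # one of the words
    "|=": lambda v, m: v == m or v.startswith(m + "-"),  # exact or "match-..."
    "^=": lambda v, m: v.startswith(m),                  # prefix
    "$=": lambda v, m: v.endswith(m),                    # suffix
    "*=": lambda v, m: m in v,                           # substring
}


def attr_matches(value, op, match):
    """Return True if an attribute value satisfies [attr<op>match]."""
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {op}")
    # An empty match never matches for the word, prefix, suffix
    # and substring operators.
    if match == "" and op in ("~=", "^=", "$=", "*="):
        return False
    return OPERATORS[op](value, match)


class TitleCollector(HTMLParser):
    """Hypothetical helper: collect every title attribute in document order."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "title" and value is not None:
                self.titles.append(value)


def select_titles(html, op, match):
    """Return the title values that [title<op>match] would select."""
    parser = TitleCollector()
    parser.feed(html)
    return [t for t in parser.titles if attr_matches(t, op, match)]


html = (
    '<div title="product">a</div>'
    '<div title="product-language-en">b</div>'
    '<div title="sold-product">c</div>'
)
print(select_titles(html, "|=", "product"))  # ['product', 'product-language-en']
print(select_titles(html, "$=", "product"))  # ['product', 'sold-product']
```

In real scraping code a CSS selector engine does this work; the sketch only shows which values each operator accepts.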

Finally, to make any of these matches case-insensitive, add i before the closing bracket, e.g. [attr=match i].
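
The effect of the i flag can be illustrated with a minimal Python sketch. It assumes that case folding is limited to ASCII letters, which is how the Selectors specification describes case-insensitive attribute matching. The names ascii_fold and exact_match are hypothetical, and only the = operator is shown:

```python
import string

# Translation table that lowercases A-Z only; non-ASCII letters are untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_fold(text):
    """Lowercase ASCII letters and leave every other character as-is."""
    return text.translate(_ASCII_LOWER)


def exact_match(value, match, case_insensitive=False):
    """Sketch of [attr=match] and, with case_insensitive=True, [attr=match i]."""
    if case_insensitive:
        return ascii_fold(value) == ascii_fold(match)
    return value == match


print(exact_match("Product", "product"))        # False
print(exact_match("Product", "product", True))  # True
```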

Provided by Scrapfly

This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.