Web Scraping with Python and BeautifulSoup
BeautifulSoup is one of the most popular libraries in web scraping. In this tutorial, we'll take a hands-on overview of how to use it, what it is good for, and explore a real-life web scraping example.
No, Python's BeautifulSoup doesn't support XPath selectors, even though it can use lxml as a parser backend and lxml itself can perform XPath queries.
To use XPath selectors, either the lxml or the parsel package must be used instead.
parsel is a modern wrapper around lxml that makes XPath selections very easy:
from parsel import Selector
selector = Selector(text='<div class="price">22.85</div>')
print(selector.xpath("//div[@class='price']/text()").get())
22.85
Alternatively, lxml can be used directly:
from lxml import html
tree = html.fromstring('<div class="price">22.85</div>')
print(tree.xpath("//div[@class='price']/text()"))
['22.85']
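If a project already parses pages with BeautifulSoup, one possible workaround is to use BeautifulSoup's CSS selectors for simple lookups and pass the serialized document to lxml only when XPath is needed. The following is a minimal sketch, assuming both the bs4 and lxml packages are installed; the `page` markup and the variable names are illustrative and not part of any library.

```python
from bs4 import BeautifulSoup
from lxml import html

# Illustrative sample markup (an assumption for this sketch).
page = '<html><body><div class="price">22.85</div></body></html>'

# BeautifulSoup has no XPath support, but it does support CSS selectors.
soup = BeautifulSoup(page, "html.parser")
css_price = soup.select_one("div.price").get_text()

# For XPath, serialize the soup back to a string and re-parse it with lxml.
tree = html.fromstring(str(soup))
xpath_price = tree.xpath("//div[@class='price']/text()")[0]

print(css_price, xpath_price)
```

Re-parsing the document costs extra time on large pages, so when most queries are XPath it is probably simpler to parse with lxml or parsel from the start.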