To select elements by text using XPath, we can either match the text() value directly or use it in a contains() function.
For example, to select <a>websites</a> we would use the //a[contains(text(), "website")] selector.
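The difference between an exact text() match and a contains() match can be sketched in Python. This is a minimal illustration that assumes the third-party lxml library is installed; the sample markup and href values are made up for the example and do not come from any real page.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
doc = etree.fromstring(
    '<div>'
    '<a href="/about">about</a>'
    '<a href="/sites">websites</a>'
    '<a href="/site">website</a>'
    '</div>'
)

# Exact match: the text() value must equal the whole string.
exact = doc.xpath('//a[text()="website"]')

# Partial match: the text() value only needs to contain the substring.
partial = doc.xpath('//a[contains(text(), "website")]')

exact_hrefs = [a.get("href") for a in exact]
partial_hrefs = [a.get("href") for a in partial]

print(exact_hrefs)    # ['/site']
print(partial_hrefs)  # ['/sites', '/site']
```

The exact match skips <a>websites</a> because its text is longer than the search string, while contains() picks up both links.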
Note that the contains() function is case sensitive.
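For case-insensitive selections we can use the matches() function from XPath 2.0, which is available as re:test() in libraries that implement the EXSLT regular expression extensions instead.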
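As a rough sketch of the case-insensitive approach, the snippet below uses re:test() through lxml, which exposes it under the EXSLT regular expressions namespace. It assumes lxml is installed; the sample markup and the EXSLT_RE variable name are illustrative choices, not part of the original article.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
doc = etree.fromstring(
    '<div>'
    '<a href="/sites">Websites</a>'
    '<a href="/about">about</a>'
    '</div>'
)

# contains() is case sensitive, so lowercase "website" misses "Websites".
case_sensitive = doc.xpath('//a[contains(text(), "website")]')

# re:test() lives in the EXSLT regular expressions namespace,
# so the "re" prefix has to be registered for the query.
EXSLT_RE = {"re": "http://exslt.org/regular-expressions"}

# The third argument "i" asks for a case-insensitive regex match.
case_insensitive = doc.xpath(
    '//a[re:test(text(), "website", "i")]',
    namespaces=EXSLT_RE,
)
insensitive_hrefs = [a.get("href") for a in case_insensitive]

print(len(case_sensitive))  # 0
print(insensitive_hrefs)    # ['/sites']
```

Since the second argument is treated as a regular expression, any special characters in the search text would need escaping.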
This knowledgebase is provided by Scrapfly, a web scraping API that allows you to scrape any website without getting blocked and implements dozens of other web scraping conveniences.