In this article, we'll explain XML parsing. We'll start by defining XML files and their format, then show how to navigate them for data extraction.
To select HTML elements by attribute value, the @ syntax can be used together with the = operator or the contains() function. Here's how.
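A minimal sketch of both forms, assuming lxml is installed. The markup and the data-type attribute are made up for illustration.

```python
from lxml import html

# Hypothetical sample markup, not taken from any real page.
doc = html.fromstring("""
<div>
  <a href="/product/1" data-type="product">Item one</a>
  <a href="/product/2" data-type="product-featured">Item two</a>
  <a href="/about" data-type="page">About</a>
</div>
""")

# Exact match: the attribute value must equal the string.
exact = doc.xpath('//a[@data-type="product"]/text()')

# Partial match: the attribute value only needs to contain the substring.
partial = doc.xpath('//a[contains(@href, "/product/")]/text()')

print(exact)    # ['Item one']
print(partial)  # ['Item one', 'Item two']
```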
To select sibling elements in XPath, the preceding-sibling and following-sibling axes can be used. Here's how and why it's so useful.
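Sibling axes are handy when a value sits next to its label rather than inside it. The sketch below assumes lxml and a made-up definition list.

```python
from lxml import html

# Hypothetical label/value markup for illustration.
doc = html.fromstring("""
<dl>
  <dt>Price</dt><dd>19.99</dd>
  <dt>Color</dt><dd>Blue</dd>
</dl>
""")

# following-sibling: the first <dd> that comes after the "Price" label.
price = doc.xpath('//dt[text()="Price"]/following-sibling::dd[1]/text()')

# preceding-sibling: the nearest <dt> before the "Blue" value.
# On a reverse axis, [1] means the closest preceding node.
label = doc.xpath('//dd[text()="Blue"]/preceding-sibling::dt[1]/text()')

print(price)  # ['19.99']
print(label)  # ['Color']
```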
To select the last element in XPath, we cannot use negative indexing because a -1 index is not supported. Instead, the last() function can be used. Here's how.
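As a quick illustration, assuming lxml and a made-up list:

```python
from lxml import html

# Hypothetical list markup for illustration.
doc = html.fromstring("<ul><li>first</li><li>second</li><li>third</li></ul>")

# last() evaluates to the size of the current node set,
# so [last()] keeps only the final <li>.
last_item = doc.xpath("//li[last()]/text()")

# Arithmetic on last() reaches items counted from the end.
second_to_last = doc.xpath("//li[last()-1]/text()")

print(last_item)       # ['third']
print(second_to_last)  # ['second']
```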
To select elements at a specific position, the position() function can be used in a selection predicate. Here's how.
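A short sketch of position-based predicates, assuming lxml and a made-up list. XPath positions start at 1, not 0.

```python
from lxml import html

# Hypothetical list markup for illustration.
doc = html.fromstring("<ul><li>a</li><li>b</li><li>c</li><li>d</li></ul>")

# Keep the first two items.
first_two = doc.xpath("//li[position() <= 2]/text()")

# Skip the first item.
after_first = doc.xpath("//li[position() > 1]/text()")

# A bare number is shorthand for [position() = 3].
third = doc.xpath("//li[3]/text()")

print(first_two)    # ['a', 'b']
print(after_first)  # ['b', 'c', 'd']
print(third)        # ['c']
```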
To select any element, the wildcard "*" node test can be used. It matches an HTML element of any name within the current context.
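For example, the wildcard can collect every direct child of a container regardless of tag name. This sketch assumes lxml; the id "box" is made up.

```python
from lxml import html

# Hypothetical container markup for illustration.
doc = html.fromstring(
    '<div id="box"><h2>Title</h2><p>Text</p><span>Note</span></div>'
)

# /* selects all element children of the matched <div>, whatever their name.
children = [el.tag for el in doc.xpath('//div[@id="box"]/*')]

print(children)  # ['h2', 'p', 'span']
```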
To select elements by the ID attribute in XPath, we can match it directly using the = operator in a predicate, or match part of it using the contains() function. Here's how.
To select an element whose name matches one from a list of names, the name() function can be used. Here's how.
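One way to sketch this, assuming lxml and made-up markup, is to chain name() comparisons with "or". The second query shows a compact alternative that checks name() against a delimited string.

```python
from lxml import html

# Hypothetical markup mixing headings with other elements.
doc = html.fromstring(
    "<div><h1>A</h1><p>b</p><h2>C</h2><span>d</span><h3>E</h3></div>"
)

# Explicit comparisons joined with "or".
headings = doc.xpath('//*[name()="h1" or name()="h2" or name()="h3"]/text()')

# Compact form: wrap the name in delimiters and look it up in a list string.
compact = doc.xpath(
    '//*[contains("|h1|h2|h3|", concat("|", name(), "|"))]/text()'
)

print(headings)  # ['A', 'C', 'E']
print(compact)   # ['A', 'C', 'E']
```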
To negate expressions and predicates in XPath, the not() function can be used. Here's how and why it's so useful.
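Negation is useful for filtering out unwanted nodes. The sketch below assumes lxml; the "ad" class is a made-up example of content to exclude.

```python
from lxml import html

# Hypothetical list where some items are marked as ads.
doc = html.fromstring(
    '<ul><li class="ad">Sponsored</li><li>Real one</li>'
    '<li class="ad">Promo</li><li>Real two</li></ul>'
)

# Keep items that have no class attribute at all.
no_class = doc.xpath("//li[not(@class)]/text()")

# Keep items whose class does not contain "ad".
not_ad = doc.xpath('//li[not(contains(@class, "ad"))]/text()')

print(no_class)  # ['Real one', 'Real two']
print(not_ad)    # ['Real one', 'Real two']
```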
To join values in XPath, the concat() function can be used to concatenate several strings into one. Here's how.
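A small sketch, assuming lxml and made-up markup. When concat() receives a node set, it uses the string value of the first node in it.

```python
from lxml import html

# Hypothetical markup with a name split across two elements.
doc = html.fromstring(
    '<div><span class="first">Ada</span><span class="last">Lovelace</span></div>'
)

# concat() returns a single string rather than a list of nodes.
full_name = doc.xpath(
    'concat(//span[@class="first"], " ", //span[@class="last"])'
)

print(full_name)  # Ada Lovelace
```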
To find the name of a selected HTML element with XPath, the name() function can be used. Here's how and why this is useful.
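For instance, name() can reveal which tag carries a known attribute. This sketch assumes lxml; the id "x" is made up.

```python
from lxml import html

# Hypothetical markup where the tag of the target element is unknown.
doc = html.fromstring('<div><p id="x">hi</p></div>')

# name() returns the tag name of the first node in the node set.
tag = doc.xpath('name(//*[@id="x"])')

print(tag)  # p
```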
To count the number of elements selected by an XPath selector, the count() function can be used. Here's how to do it and why it's useful.
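A brief sketch, assuming lxml and made-up markup. Note that lxml returns XPath numbers as Python floats, so the result is converted with int() here.

```python
from lxml import html

# Hypothetical list; the "in-stock" class is made up for illustration.
doc = html.fromstring(
    '<ul><li class="in-stock">A</li><li>B</li><li class="in-stock">C</li></ul>'
)

# count() yields a number instead of a node list.
total = int(doc.xpath("count(//li)"))
in_stock = int(doc.xpath('count(//li[@class="in-stock"])'))

print(total)     # 3
print(in_stock)  # 2
```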
To select all elements between two different elements, the preceding-sibling or following-sibling axis selectors can be used. Here's how.
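One possible approach, sketched with lxml on made-up markup: start from the first boundary, walk its following siblings, and keep only those that still have the second boundary ahead of them. The ids "start" and "end" are hypothetical.

```python
from lxml import html

# Hypothetical flat markup with two boundary headings.
doc = html.fromstring("""
<div>
  <h2 id="start">Section A</h2>
  <p>one</p>
  <p>two</p>
  <h2 id="end">Section B</h2>
  <p>three</p>
</div>
""")

# Siblings after "start" that also have "end" somewhere after them.
between = doc.xpath(
    '//h2[@id="start"]/following-sibling::p'
    '[following-sibling::h2[@id="end"]]/text()'
)

print(between)  # ['one', 'two']
```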
CSS selectors and XPath are both path languages for HTML parsing. XPath is more powerful, but CSS is more approachable. Which one is better?
Python has several options for executing XPath selectors against HTML. The most popular ones are lxml and parsel. Here's how to use them.
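A minimal lxml sketch, assuming the package is installed; the page content is made up.

```python
from lxml import html

# Hypothetical page source; in practice this would come from an HTTP response.
page = "<html><body><h1>Hello</h1><a href='/next'>Next</a></body></html>"

# Parse the HTML string into an element tree.
tree = html.fromstring(page)

# xpath() returns a list of matching text nodes or attribute values.
title = tree.xpath("//h1/text()")
links = tree.xpath("//a/@href")

print(title)  # ['Hello']
print(links)  # ['/next']
```

With parsel, the equivalent would look roughly like Selector(text=page).xpath("//h1/text()").get(), which returns the first match as a string.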
To select HTML elements by class name in XPath, we can use the @ attribute selector together with the contains() function. Here's how to do it.
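A sketch of class matching, assuming lxml and made-up class names. A plain contains() also matches longer class names that include the substring, so a stricter, space-padded variant is shown alongside it.

```python
from lxml import html

# Hypothetical markup; "price" and "price-old" are made-up class names.
doc = html.fromstring(
    '<div><p class="price sale">9.99</p>'
    '<p class="price-old">14.99</p><p class="note">x</p></div>'
)

# Loose match: any class attribute containing the substring "price".
loose = doc.xpath('//p[contains(@class, "price")]/text()')

# Strict match: pad the class list with spaces and look for " price ".
strict = doc.xpath(
    '//p[contains(concat(" ", normalize-space(@class), " "), " price ")]/text()'
)

print(loose)   # ['9.99', '14.99']
print(strict)  # ['9.99']
```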
Ultimate companion for HTML parsing using XPath selectors. This cheatsheet contains all syntax explanations with interactive examples.
Introduction to web scraping with Ruby. How to handle HTTP connections and parse HTML files for data, plus best practices, tips and an example project.
Introduction to XPath in the context of web scraping. How to extract data from HTML documents using XPath, best practices and available tools.
Introduction to web scraping with PHP. How to handle HTTP connections and parse HTML files for data, plus best practices, tips and an example project.
Tutorial on web scraping with Scrapy and Python through a real-world example project. Best practices, extension highlights and common challenges.