JSON Parsing

JSON (JavaScript Object Notation) is a data format derived from JavaScript object syntax, though it's widely used in other programming languages as well. Its key features are simplicity and ease of use. Like HTML, it's a tree structure, but one built from key-value pairs where each key holds exactly one value (which can itself be a nested object or array).

Just like with HTML, there are several powerful tools for working with JSON, query languages in particular. Let's take a look at some.

JSON Parsing Tools

JSON is a very simple data structure, and most programming languages can parse it with built-in or standard-library tools. In web scraping, though, we often need to deal with large and complex JSON datasets, so additional query tools can be very useful.
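As a minimal sketch of native parsing, the snippet below uses Python's standard `json` module to load a JSON string and read nested values. The `raw` document and its field names are made up for illustration and do not come from any real API.

```python
import json

# Hypothetical response body, invented for this example.
raw = '{"product": {"name": "Example Shoe", "price": 59.99, "tags": ["sport", "sale"]}}'

# json.loads turns JSON text into plain Python dicts, lists, strings and numbers.
data = json.loads(raw)

# Nested values are reached by chaining keys and list indexes.
name = data["product"]["name"]
price = data["product"]["price"]
first_tag = data["product"]["tags"][0]

# .get() returns None instead of raising KeyError when an optional field is missing.
rating = data["product"].get("rating")

print(name, price, first_tag, rating)
```

This works well for small documents, but chained lookups like these get verbose and brittle as the dataset grows, which is where query languages come in.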

The two most popular JSON query tools in scraper programming are JMESPath and JSONPath.

Generally speaking, JMESPath is great for reshaping and filtering datasets, while JSONPath is the more feature-rich option for more advanced parsing functionality such as recursive key selection.
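To make the two ideas concrete without depending on either library, here is a plain-Python sketch of what "reshaping and filtering" and "recursive key selection" mean. The `data` document and the `find_key` helper are hypothetical, written only for this illustration. The query expressions in the comments are rough equivalents, not exact translations.

```python
from typing import Any, Iterator

# Hypothetical dataset, invented for this example.
data = {
    "store": {
        "products": [
            {"name": "Shoe", "price": 59.99, "variants": [{"sku": "S-1", "price": 54.99}]},
            {"name": "Sock", "price": 4.99, "variants": []},
        ]
    }
}

# Reshape and filter: keep products over 10 and rename their fields.
# Roughly what a JMESPath expression such as
#   store.products[?price > `10`].{title: name, cost: price}
# expresses in a single line.
expensive = [
    {"title": p["name"], "cost": p["price"]}
    for p in data["store"]["products"]
    if p["price"] > 10
]


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` at any depth, in document order."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending, since the value may contain the key again.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Recursive key selection: every "price" anywhere in the tree.
# Roughly what the JSONPath expression  $..price  selects.
all_prices = list(find_key(data, "price"))

print(expensive)
print(all_prices)
```

Writing these helpers by hand is fine for one-off cases. A query language keeps the same logic in a short expression string, which is easier to maintain when a scraper handles many differently shaped documents.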

Next up - Data Processing

Now that we know how to extract data from HTML and JSON documents, we can move on to the next step: data processing and validation. In the next section, we'll take a look at popular data validation and cleanup techniques.


Summary