How to Web Scrape with HTTPX and Python
Intro to using Python's httpx library for web scraping. Proxy and user agent rotation and common web scraping challenges, tips and tricks.
Modern web scraping often involves a lot of JSON parsing, particularly when scraping hidden web data or backend APIs. There are several ways to parse JSON data in Python.
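As a starting point, here is a minimal sketch of parsing hidden web data using only the standard library. The HTML snippet, the `page-data` script id and the product fields are made up for illustration; real pages use their own ids and structures.

```python
import json
import re

# Made-up HTML for illustration: many pages embed their state as JSON in a <script> tag.
html = """
<html><body>
<script id="page-data" type="application/json">
{"product": {"name": "Example Shoe", "price": 59.99, "sizes": [40, 41, 42]}}
</script>
</body></html>
"""

# Grab the script body. A regex keeps the sketch short; an HTML parser is
# usually the more robust choice on real pages.
match = re.search(r'<script id="page-data"[^>]*>(.*?)</script>', html, re.DOTALL)

# json.loads turns the JSON text into plain Python dicts and lists.
data = json.loads(match.group(1))

print(data["product"]["name"])
print(data["product"]["sizes"])
```

Once the data is a Python dictionary, it can be navigated with plain indexing as above or with a dedicated query language.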
JMESPath is a popular JSON query language with library implementations in many programming languages.
JSONPath is another popular JSON query language with library implementations in many programming languages.
Both of these tools are a great way to parse JSON datasets in Python. As for which one is better: generally, JSONPath is more powerful because it offers recursive selectors (e.g. $..book selects the key book anywhere in the dataset), while JMESPath has a more intuitive syntax and better data reshaping capabilities (e.g. renaming keys and flattening nested data structures).