# Introduction to Parsing JSON with Python JSONPath

 by [Bernardas Alisauskas](https://scrapfly.io/blog/author/bernardas) Apr 10, 2026 10 min read [\#data-parsing](https://scrapfly.io/blog/tag/data-parsing) [\#python](https://scrapfly.io/blog/tag/python) 


JSONPath is a path expression language to parse JSON data. It's used to query data from JSON objects using a similar syntax to the XPath query language used to parse XML documents.

As the name implies, JSONPath syntax is heavily inspired by XPath and offers similar atomic expressions and querying capabilities:

- Array slicing
- Filtering and wildcard matching
- Function calls and custom function extensions
- Recursive data lookup, such as finding specific data points anywhere in the dataset

In this tutorial, we'll explore how to use JSONPath expressions for web scraping. For this, we'll be using a Python implementation, but the same concepts can be applied to JSONPath clients in other languages. Let's get started!

## Key Takeaways

JSONPath lets Python developers extract structured data from JSON APIs using query expressions with array slicing, filtering, and recursive data lookup:

- Implement JSONPath expressions with array slicing and wildcard matching for complex JSON data extraction
- Use recursive data lookup techniques to find specific data points across nested JSON structures
- Configure custom function extensions and filtering expressions for advanced JSON querying
- Parse JSON responses from web APIs using JSONPath for structured data extraction workflows
- Debug JSONPath queries by testing expressions against sample JSON data structures
- Handle complex JSON structures with nested arrays and objects using advanced JSONPath syntax


## JSONPath Setup

JSONPath's query specification lacks a centralized standard, which results in slight differences between implementations across languages and projects. Below are common clients for different languages:

| Language | Implementation |
|---|---|
| Python | [jsonpath-ng](https://github.com/h2non/jsonpath-ng), [jsonpath2](https://github.com/pacifica/python-jsonpath2/), [jsonpath-rw](https://github.com/kennknowles/python-jsonpath-rw) |
| JavaScript | [jsonpath-plus](https://www.npmjs.com/package/jsonpath-plus) |
| Ruby | [jsonpath](https://github.com/joshbuddy/jsonpath) |
| R | [rjsonpath](https://github.com/blmoore/rjsonpath) |
| Go | [ojg](https://github.com/ohler55/ojg), [jsonpath](https://github.com/kubernetes/client-go/tree/master/util/jsonpath) |

For this guide, we'll be using the JSONPath Python implementation [jsonpath-ng](https://github.com/h2non/jsonpath-ng), which provides the `jsonpath_ng` module used in the code examples below. Install it using the following `pip` command:

```shell
pip install jsonpath-ng
```





## Introduction to Python JSONPath

Let's start out with a simple JSONPath Python parser example. We'll query the below JSON object using a few basic path expressions:

```python
import jsonpath_ng.ext as jp

data = {
    "products": [
        {"name": "Apple", "price": 12.88, "tags": ["fruit", "red"]},
        {"name": "Peach", "price": 27.25, "tags": ["fruit", "yellow"]},
        {"name": "Cake", "tags": ["pastry", "sweet"]},
    ]
}

# find all product names:
query = jp.parse("products[*].name")
for match in query.find(data):
    print(match.value)

# find all products with price > 20
query = jp.parse("products[?price>20].name")
for match in query.find(data):
    print(match.value)
```



The above JSONPath finder uses the `products[*].name` expression: starting from the `products` key of the root object, the `[*]` wildcard iterates over each of its array elements, and `.name` then returns the name property of each element.

Since JSONPath supports filtering expressions, we also use the `[?price>20]` filter to return only the elements with a price greater than 20.
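To make the filter's behavior concrete, here's the same query written as a plain Python list comprehension (re-declaring the `data` dictionary so the snippet is self-contained):

```python
data = {
    "products": [
        {"name": "Apple", "price": 12.88, "tags": ["fruit", "red"]},
        {"name": "Peach", "price": 27.25, "tags": ["fruit", "yellow"]},
        {"name": "Cake", "tags": ["pastry", "sweet"]},
    ]
}

# equivalent of the JSONPath query products[?price>20].name -
# note that Cake has no "price" key, so .get() with a 0 default excludes it:
expensive = [p["name"] for p in data["products"] if p.get("price", 0) > 20]
print(expensive)  # ['Peach']
```

The JSONPath version pays off once the data is deeply nested or its exact structure is unknown, which is exactly the situation in web scraping.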

### JSONPath Expressions

The below table illustrates some of the common operators and expressions used with a JSONPath finder:

| operator | function |
|---|---|
| `$` | object root selector |
| `@` or `this` | current object selector |
| `..` | recursive descendant selector |
| `*` | wildcard, selects any key of an object or index of an array |
| `[]` | subscript operator |
| `[start:end:step]` | array slice operator |
| `[?<predicate>]` or `(?<predicate>)` | filter operator where predicate is some evaluation rule like `[?price>20]`, more examples: |
|  | `[?price > 20 & price < 30]` for multiple conditions |
|  | `[?address.city = "Boston"]` for exact matches |
|  | `[?description.text =~ "house"]` for containing values |

The above expression examples open the door for powerful JSON parsing options. Let's explore using them in the context of web scraping!



## Web Scraper Example

Let's put JSONPath to work in a real example scraper and take a look at how it would be used in web scraping with Python.

We'll be scraping real estate property data from [realtor.com](https://www.realtor.com/) which is a popular portal for renting and selling real estate properties.

This website, like many modern websites, uses JavaScript to render its pages, which means we can't just scrape the HTML code. Instead, we'll find the JSON dataset that the frontend uses to render the page. This is called [hidden web data](https://scrapfly.io/blog/posts/how-to-scrape-hidden-web-data#what-is-hidden-web-data).

Hidden web data can often be found in the HTML code and extracted with an HTML parser. However, this data is usually cluttered with keywords, ids, and other non-data fields, which is why we'll be using JSONPath to extract only the useful parts.
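As a minimal sketch of the idea, hidden JSON can be pulled out of a script element with just the standard library (the HTML string and `__NEXT_DATA__` id below are illustrative):

```python
import json
import re

# illustrative page source - real pages embed much larger JSON documents
html = '<html><script id="__NEXT_DATA__">{"property": {"listing_id": "123"}}</script></html>'

# grab the raw JSON text between the script tags
raw = re.search(r'<script id="__NEXT_DATA__">(.+?)</script>', html, re.S).group(1)

# parse it into a Python dictionary ready for querying
data = json.loads(raw)
print(data["property"]["listing_id"])  # 123
```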

Let's take a look at a random example property like [this one](https://www.realtor.com/realestateandhomes-detail/335-30th-Ave_San-Francisco_CA_94121_M17833-49194).

If we take a look at the page source, we can see the JSON dataset hidden in a `<script>` tag:



We can see the entire property dataset hidden in a script element. To scrape this, we'll be using a few Python packages:

- [httpx](https://pypi.org/project/httpx/) - HTTP client library to retrieve the page.
- [parsel](https://pypi.org/project/parsel/) - HTML parsing library to extract `<script>` element data.
- [jsonpath-ng](https://pypi.org/project/jsonpath-ng/) - To parse the JSON data for property data fields.

All of these can be installed using the `pip install` command:

```shell
$ pip install jsonpath-ng httpx parsel
```



So, we'll retrieve the HTML page, find the `<script>` element that contains the hidden web data, and then use JSONPath to extract the most important property data fields:

```python
import json
import httpx
from parsel import Selector
import jsonpath_ng as jp

# establish an HTTP client; set some browser-like headers to avoid being instantly blocked
session = httpx.Client(
    headers={
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
)

# 1. Scrape the page and parse hidden web data
response = session.get(
    "https://www.realtor.com/realestateandhomes-detail/335-30th-Ave_San-Francisco_CA_94121_M17833-49194"
)
assert response.status_code == 200, "response is banned - try ScrapFly? 😉"
selector = Selector(text=response.text)
# find the <script id="__NEXT_DATA__"> node and select its text:
data = selector.css("script#__NEXT_DATA__::text").get()
# load the hidden JSON as python dictionary:
data = json.loads(data)

# here we define our JSONPath helpers: one to select first match and one to select all matches:
jp_first = lambda query, data: jp.parse(query).find(data)[0].value
jp_all = lambda query, data: [match.value for match in jp.parse(query).find(data)]

prop_data = jp_first("$..propertyDetails", data)
result = {
    # for some fields we don't need complex queries:
    "id": prop_data["listing_id"],
    "url": prop_data["href"],
    "status": prop_data["status"],
    "price": prop_data["list_price"],
    "price_per_sqft": prop_data["price_per_sqft"],
    "date": prop_data["list_date"],
    "details": prop_data["description"],
    # to reduce complex datafields we can use jsonpath again:
    # e.g. we can select by key anywhere in the data structure:
    "estimate_high": jp_first("$..estimate_high", prop_data),
    "estimate_low": jp_first("$..estimate_low", prop_data),
    "post_code": jp_first("$..postal_code", prop_data),
    # or iterate through arrays:
    "features": jp_all("$..details[*].text[0]", prop_data),
    "photos": jp_all("$..photos[*].href", prop_data),
    "buyer_emails": jp_all("$..buyers[*].email", prop_data),
    "buyer_phones": jp_all("$..buyers[*].phones[*].number", prop_data),
}
print(result)
```



Example Output:

```json
{
  "id": "2950457253",
  "url": "https://www.realtor.com/realestateandhomes-detail/335-30th-Ave_San-Francisco_CA_94121_M17833-49194",
  "status": "sold",
  "price": 2995000,
  "price_per_sqft": 982,
  "date": "2022-12-04T23:43:42Z",
  "details": {
    "baths": 4,
    "baths_3qtr": null,
    "baths_full": 3,
    "baths_full_calc": 3,
    "baths_half": 1,
    "baths_max": null,
    "baths_min": null,
    "baths_partial_calc": 1,
    "baths_total": null,
    "beds": 4,
    "beds_max": null,
    "beds_min": null,
    "construction": null,
    "cooling": null,
    "exterior": null,
    "fireplace": null,
    "garage": null,
    "garage_max": null,
    "garage_min": null,
    "garage_type": null,
    "heating": null,
    "logo": null,
    "lot_sqft": 3000,
    "name": null,
    "pool": null,
    "roofing": null,
    "rooms": null,
    "sqft": 3066,
    "sqft_max": null,
    "sqft_min": null,
    "stories": null,
    "styles": [
      "craftsman_bungalow"
    ],
    "sub_type": null,
    "text": "With four bedrooms, three and one-half baths, and over 3, 000 square feet of living space, 335 30th avenue offers a fantastic modern floor plan with classic finishes in the best family-friendly neighborhood in San Francisco. Originally constructed in 1908, the house underwent a total gut renovation and expansion in 2014, with an upgraded foundation, all new plumbing and electrical, double-pane windows and all new energy efficient appliances. Interior walls were removed on the main level to create a large flowing space. The home is detached on three sides (East, South, and West) and enjoys an abundance of natural light. The top floor includes the primary bedroom with two gorgeous skylights and an en-suite bath; two kids bedrooms and a shared hall bath. The main floor offers soaring ten foot ceilings and a modern, open floor plan perfect for entertaining. The combined family room - kitchen space is the heart of the home and keeps everyone together in one space. Just outside the breakfast den, the back deck overlooks the spacious yard and offers indoor/outdoor living. The ground floor encompasses the garage, a laundry room, and a suite of rooms that could serve as work-from-home space, AirBnB, or in-law unit.",
    "type": "single_family",
    "units": null,
    "year_built": 1908,
    "year_renovated": null,
    "zoning": null,
    "__typename": "HomeDescription"
  },
  "estimate_high": 3253200,
  "estimate_low": 2824400,
  "post_code": "94111",
  "features": [
    "Bedrooms: 4",
    "Total Rooms: 11",
    "Total Bathrooms: 4",
    "Built-In Gas Oven",
    "Breakfast Area",
    "Fireplace Features: Brick, Family Room, Wood Burning",
    "Interior Amenities: Dining Room, Family Room, Guest Quarters, Kitchen, Laundry, Living Room, Primary Bathroom, Primary Bedroom, Office, Workshop",
    "Balcony",
    "Lot Description: Adjacent to Golf Course, Landscape Back, Landscape Front, Low Maintenance, Manual Sprinkler Rear, Zero Lot Line",
    "Driveway: Gated, Paved Sidewalk, Sidewalk/Curb/Gutter",
    "View: Bay, Bridges, City, San Francisco",
    "Association: No",
    "Source Listing Status: Closed",
    "Total Square Feet Living: 3066",
    "Sewer: Public Sewer, Septic Connected"
  ],
  "photos": [
    "http://ap.rdcpix.com/f707c59fa49468fde4999bbd9e2d433bl-m872089375s.jpg",
    "http://ap.rdcpix.com/f707c59fa49468fde4999bbd9e2d433bl-m872089375s.jpg",
    "http://ap.rdcpix.com/f707c59fa49468fde4999bbd9e2d433bl-m872089375s.jpg"
  ],
  "buyer_emails": [
    "REDACTED_FOR_BLOG@REDACTED_FOR_BLOG.com"
  ],
  "buyer_phones": [
    "415296XXXX",
    "415901XXXX",
    "415901XXXX"
  ]
}
```



Just as we'd use XPath to parse HTML datasets, we can use JSONPath to parse JSON datasets. JSONPath is a powerful yet simple language that works especially well when working with hidden web data.



## FAQ

**What's the difference between JMESPath and JSONPath?**

JMESPath is another popular JSON query language that is available in more programming languages. The main difference is that JSONPath follows XPath-like syntax, allowing recursive selectors and easy extendability, while JMESPath allows easier dataset mutation and filtering. We recommend JSONPath for extracting nested data, while JMESPath is better suited for processing more complex but predictable datasets.







**Is JSONPath slow?**

Since JSON data is translated into native objects, JSONPath can be very fast depending on the implementation and the algorithms used. Because JSONPath is a query specification rather than a single project, speed varies between implementations, but generally it should be as fast as XPath for XML, or even faster.









## JSONPath in Web Scraping Summary

In this introduction tutorial, we've taken a look at the JSONPath query language for JSON in Python. This path language is heavily inspired by XPath and allows us to extract nested data from JSON datasets, which means it fits the web scraping stack perfectly: we can use two similar technologies for extracting data from HTML and JSON.

Finally, we've taken a look at a real-life example of using the `jsonpath-ng` library for parsing hidden web data from realtor.com, where we extracted the main property listing data fields in just a few lines of code.



 
