How to Scrape StockX e-commerce Data with Python

StockX is an online marketplace for buying and selling authentic sneakers, streetwear, watches, and designer handbags. The most interesting part about StockX is that it treats apparel items as a commodity and tracks their value over time. This makes StockX a prime target for web scraping as tracking market movement and product data is a great way to build a data-driven business.

In this web scraping tutorial, we'll take a look at how to scrape StockX using Python. We'll be scraping StockX's hidden web data, which is an incredibly easy way to scrape e-commerce websites with just a few lines of code.

We'll start with a quick Python environment setup and tool overview, then scrape some products and product search pages. Let's dive in!

Latest Stockx.com Scraper Code

https://github.com/scrapfly/scrapfly-scrapers/

Why scrape StockX?

Just like most e-commerce targets, StockX's public data provides an important overview of the market. So, scraping StockX is a great way to gain a competitive advantage through data-driven decision-making.

Additionally, since StockX treats its products as commodities, scraping the data lets us perform various data analysis tasks to stay on top of market trends, outbidding and outmaneuvering our competitors.

StockX Dataset Preview

Since we'll be using hidden web data scraping, we'll be extracting whole product datasets, which contain fields like:

  • Product data like descriptions, images, sizes, ids, traits and everything visible on the page.
  • Product market performance like sale numbers, prices, asks and bids.
Example Product Dataset
{
    "id": "7cfe0c22-7e77-4e54-89ca-c03007ecbfd1",
    "listingType": "STANDARD",
    "deleted": false,
    "merchandising": {
        "title": "StockX Verified Sneakers",
        "subtitle": "We Verify Every Item. Every Time.",
        "image": {
            "alt": null,
            "url": "https://images-cs.stockx.com/v3/assets/blt818b0c67cf450811/bltc3258254704231c0/62a8faa88f6a4950536d049f/Merchandising_Modules_EN_-_Image_02.jpg"
        },
        "body": "",
        "trackingEvent": "06-08-22 Verified Authentic Sneakers",
        "link": {
            "title": "StockX Verified Sneakers",
            "url": "https://stockx.com/about/verification/",
            "urlType": "EXTERNAL"
        }
    },
    "productCategory": "sneakers",
    "urlKey": "nike-air-max-90-se-running-club",
    "market": {
        "bidAskData": {
            "lowestAsk": 114,
            "numberOfAsks": 130,
            "highestBid": 128,
            "numberOfBids": 79
        },
        "statistics": {
            "lastSale": {
                "amount": 204,
                "changePercentage": 0.211539,
                "changeValue": 36,
                "sameFees": false
            }
        },
        "salesInformation": {
            "lastSale": 204,
            "salesLast72Hours": 3
        }
    },
    "variants": [
        {
            "id": "1f07d59f-988e-48ac-8c44-24efa4543118",
            "market": {
                "bidAskData": {
                    "lowestAsk": 237,
                    "numberOfAsks": 1,
                    "highestBid": 70,
                    "numberOfBids": 2
                },
                "statistics": {
                    "lastSale": {
                        "amount": 278,
                        "changePercentage": 0.456539,
                        "changeValue": 88,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 278,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "6"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087411"
                }
            ],
            "sizeChart": {
                "baseSize": "6",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 6",
                        "type": "us m"
                    },
                    {
                        "size": "UK 5.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 24 (US M 6)",
                        "type": "jp"
                    },
                    {
                        "size": "KR 240 (US M 6)",
                        "type": "kr"
                    },
                    {
                        "size": "EU 38.5",
                        "type": "eu"
                    },
                    {
                        "size": "US W 7.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "80701fe2-0488-4c57-9d38-9ac15988b87b",
            "market": {
                "bidAskData": {
                    "lowestAsk": 173,
                    "numberOfAsks": 4,
                    "highestBid": 57,
                    "numberOfBids": 2
                },
                "statistics": {
                    "lastSale": {
                        "amount": 189,
                        "changePercentage": 0.35,
                        "changeValue": 49,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 189,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "6.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087428"
                }
            ],
            "sizeChart": {
                "baseSize": "6.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 6.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 6 (EU 39)",
                        "type": "uk"
                    },
                    {
                        "size": "JP 24.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 245",
                        "type": "kr"
                    },
                    {
                        "size": "EU 39",
                        "type": "eu"
                    },
                    {
                        "size": "US W 8",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "72f405a3-2788-470d-9258-bad6cd56ad7e",
            "market": {
                "bidAskData": {
                    "lowestAsk": 164,
                    "numberOfAsks": 3,
                    "highestBid": 20,
                    "numberOfBids": 1
                },
                "statistics": {
                    "lastSale": {
                        "amount": 204,
                        "changePercentage": 0.505296,
                        "changeValue": 69,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 204,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "7"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087435"
                }
            ],
            "sizeChart": {
                "baseSize": "7",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 7",
                        "type": "us m"
                    },
                    {
                        "size": "UK 6 (EU 40)",
                        "type": "uk"
                    },
                    {
                        "size": "JP 25",
                        "type": "jp"
                    },
                    {
                        "size": "KR 250",
                        "type": "kr"
                    },
                    {
                        "size": "EU 40",
                        "type": "eu"
                    },
                    {
                        "size": "US W 8.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "7fa2dc2e-de81-4c3b-95c2-14fd020c5377",
            "market": {
                "bidAskData": {
                    "lowestAsk": 126,
                    "numberOfAsks": 8,
                    "highestBid": 40,
                    "numberOfBids": 3
                },
                "statistics": {
                    "lastSale": {
                        "amount": 120,
                        "changePercentage": 0.00175,
                        "changeValue": 1,
                        "sameFees": true
                    }
                },
                "salesInformation": {
                    "lastSale": 120,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "7.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087442"
                }
            ],
            "sizeChart": {
                "baseSize": "7.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 7.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 6.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 25.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 255",
                        "type": "kr"
                    },
                    {
                        "size": "EU 40.5",
                        "type": "eu"
                    },
                    {
                        "size": "US W 9",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "bed7428d-31f9-46b3-9a5b-18e5e4fbe6d1",
            "market": {
                "bidAskData": {
                    "lowestAsk": 117,
                    "numberOfAsks": 10,
                    "highestBid": 63,
                    "numberOfBids": 10
                },
                "statistics": {
                    "lastSale": {
                        "amount": 168,
                        "changePercentage": 0.084122,
                        "changeValue": 14,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 168,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "8"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087459"
                }
            ],
            "sizeChart": {
                "baseSize": "8",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 8",
                        "type": "us m"
                    },
                    {
                        "size": "UK 7",
                        "type": "uk"
                    },
                    {
                        "size": "JP 26",
                        "type": "jp"
                    },
                    {
                        "size": "KR 260",
                        "type": "kr"
                    },
                    {
                        "size": "EU 41",
                        "type": "eu"
                    },
                    {
                        "size": "US W 9.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "674e5c93-a8f6-41f7-a9a0-beabb7788c5a",
            "market": {
                "bidAskData": {
                    "lowestAsk": 120,
                    "numberOfAsks": 6,
                    "highestBid": 79,
                    "numberOfBids": 12
                },
                "statistics": {
                    "lastSale": {
                        "amount": 118,
                        "changePercentage": -0.077334,
                        "changeValue": -9,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 118,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "8.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087466"
                },
                {
                    "type": "EAN-13",
                    "identifier": "2460002040899"
                }
            ],
            "sizeChart": {
                "baseSize": "8.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 8.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 7.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 26.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 265",
                        "type": "kr"
                    },
                    {
                        "size": "EU 42",
                        "type": "eu"
                    },
                    {
                        "size": "US W 10",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "fe862238-749b-47ff-8913-b8f299beb9c4",
            "market": {
                "bidAskData": {
                    "lowestAsk": 114,
                    "numberOfAsks": 15,
                    "highestBid": 79,
                    "numberOfBids": 4
                },
                "statistics": {
                    "lastSale": {
                        "amount": 160,
                        "changePercentage": 0.873659,
                        "changeValue": 75,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 160,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "9"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087473"
                }
            ],
            "sizeChart": {
                "baseSize": "9",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 9",
                        "type": "us m"
                    },
                    {
                        "size": "UK 8",
                        "type": "uk"
                    },
                    {
                        "size": "JP 27",
                        "type": "jp"
                    },
                    {
                        "size": "KR 270",
                        "type": "kr"
                    },
                    {
                        "size": "EU 42.5",
                        "type": "eu"
                    },
                    {
                        "size": "US W 10.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "9f02d4df-bd2f-4c60-a79b-df087d597bb4",
            "market": {
                "bidAskData": {
                    "lowestAsk": 119,
                    "numberOfAsks": 13,
                    "highestBid": 78,
                    "numberOfBids": 6
                },
                "statistics": {
                    "lastSale": {
                        "amount": 168,
                        "changePercentage": 0.430113,
                        "changeValue": 51,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 168,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "9.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087480"
                }
            ],
            "sizeChart": {
                "baseSize": "9.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 9.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 8.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 27.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 275",
                        "type": "kr"
                    },
                    {
                        "size": "EU 43",
                        "type": "eu"
                    },
                    {
                        "size": "US W 11",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "79baa849-a2b7-487c-8468-6fa3314028ca",
            "market": {
                "bidAskData": {
                    "lowestAsk": 117,
                    "numberOfAsks": 11,
                    "highestBid": 92,
                    "numberOfBids": 6
                },
                "statistics": {
                    "lastSale": {
                        "amount": 156,
                        "changePercentage": -0.035684,
                        "changeValue": -5,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 156,
                    "salesLast72Hours": 1
                }
            },
            "hidden": false,
            "traits": {
                "size": "10"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087497"
                },
                {
                    "type": "EAN-13",
                    "identifier": "2460002040929"
                }
            ],
            "sizeChart": {
                "baseSize": "10",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 10",
                        "type": "us m"
                    },
                    {
                        "size": "UK 9",
                        "type": "uk"
                    },
                    {
                        "size": "JP 28",
                        "type": "jp"
                    },
                    {
                        "size": "KR 280",
                        "type": "kr"
                    },
                    {
                        "size": "EU 44",
                        "type": "eu"
                    },
                    {
                        "size": "US W 11.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "dbdbfc90-07fa-4574-be5c-8a53502fedd5",
            "market": {
                "bidAskData": {
                    "lowestAsk": 125,
                    "numberOfAsks": 12,
                    "highestBid": 45,
                    "numberOfBids": 3
                },
                "statistics": {
                    "lastSale": {
                        "amount": 124,
                        "changePercentage": 0.24,
                        "changeValue": 24,
                        "sameFees": true
                    }
                },
                "salesInformation": {
                    "lastSale": 124,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "10.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087503"
                }
            ],
            "sizeChart": {
                "baseSize": "10.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 10.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 9.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 28.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 285",
                        "type": "kr"
                    },
                    {
                        "size": "EU 44.5",
                        "type": "eu"
                    },
                    {
                        "size": "US W 12",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "6f9daafc-3309-42a0-8cdb-4ffb0a60a8ba",
            "market": {
                "bidAskData": {
                    "lowestAsk": 149,
                    "numberOfAsks": 7,
                    "highestBid": 90,
                    "numberOfBids": 5
                },
                "statistics": {
                    "lastSale": {
                        "amount": 218,
                        "changePercentage": 0.439591,
                        "changeValue": 67,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 218,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "11"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087510"
                }
            ],
            "sizeChart": {
                "baseSize": "11",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 11",
                        "type": "us m"
                    },
                    {
                        "size": "UK 10",
                        "type": "uk"
                    },
                    {
                        "size": "JP 29",
                        "type": "jp"
                    },
                    {
                        "size": "KR 290",
                        "type": "kr"
                    },
                    {
                        "size": "EU 45",
                        "type": "eu"
                    },
                    {
                        "size": "US W 12.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "40f04b53-43aa-4dd2-983b-64f1e2f14642",
            "market": {
                "bidAskData": {
                    "lowestAsk": 146,
                    "numberOfAsks": 9,
                    "highestBid": 99,
                    "numberOfBids": 5
                },
                "statistics": {
                    "lastSale": {
                        "amount": 163,
                        "changePercentage": 0,
                        "changeValue": 0,
                        "sameFees": true
                    }
                },
                "salesInformation": {
                    "lastSale": 163,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "11.5"
            },
            "gtins": [
                {
                    "type": "EAN-13",
                    "identifier": "2460002040950"
                },
                {
                    "type": "UPC",
                    "identifier": "195242087527"
                }
            ],
            "sizeChart": {
                "baseSize": "11.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 11.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 10.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 29.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 295",
                        "type": "kr"
                    },
                    {
                        "size": "EU 45.5",
                        "type": "eu"
                    },
                    {
                        "size": "US W 13",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "c9a34e69-96c5-4880-aa14-93d2d7cd1b11",
            "market": {
                "bidAskData": {
                    "lowestAsk": 169,
                    "numberOfAsks": 12,
                    "highestBid": 51,
                    "numberOfBids": 3
                },
                "statistics": {
                    "lastSale": {
                        "amount": 190,
                        "changePercentage": 0.292037,
                        "changeValue": 43,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 190,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "12"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087534"
                },
                {
                    "type": "EAN-13",
                    "identifier": "2000216738184"
                }
            ],
            "sizeChart": {
                "baseSize": "12",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 12",
                        "type": "us m"
                    },
                    {
                        "size": "UK 11",
                        "type": "uk"
                    },
                    {
                        "size": "JP 30",
                        "type": "jp"
                    },
                    {
                        "size": "KR 300",
                        "type": "kr"
                    },
                    {
                        "size": "EU 46",
                        "type": "eu"
                    },
                    {
                        "size": "US W 13.5",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "2a080f4a-8d04-434a-9c20-46f015256bfa",
            "market": {
                "bidAskData": {
                    "lowestAsk": 214,
                    "numberOfAsks": 4,
                    "highestBid": 128,
                    "numberOfBids": 5
                },
                "statistics": {
                    "lastSale": {
                        "amount": 187,
                        "changePercentage": 0.680219,
                        "changeValue": 76,
                        "sameFees": false
                    }
                },
                "salesInformation": {
                    "lastSale": 187,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "12.5"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087541"
                }
            ],
            "sizeChart": {
                "baseSize": "12.5",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 12.5",
                        "type": "us m"
                    },
                    {
                        "size": "UK 11.5",
                        "type": "uk"
                    },
                    {
                        "size": "JP 30.5",
                        "type": "jp"
                    },
                    {
                        "size": "KR 305",
                        "type": "kr"
                    },
                    {
                        "size": "EU 47",
                        "type": "eu"
                    },
                    {
                        "size": "US W 14",
                        "type": "us w"
                    }
                ]
            },
            "group": null
        },
        {
            "id": "6286188c-47ff-4ac5-bad6-898ea136fdeb",
            "market": {
                "bidAskData": {
                    "lowestAsk": 164,
                    "numberOfAsks": 10,
                    "highestBid": 95,
                    "numberOfBids": 6
                },
                "statistics": {
                    "lastSale": {
                        "amount": 160,
                        "changePercentage": 0.006289,
                        "changeValue": 1,
                        "sameFees": true
                    }
                },
                "salesInformation": {
                    "lastSale": 160,
                    "salesLast72Hours": 0
                }
            },
            "hidden": false,
            "traits": {
                "size": "13"
            },
            "gtins": [
                {
                    "type": "UPC",
                    "identifier": "195242087558"
                }
            ],
            "sizeChart": {
                "baseSize": "13",
                "baseType": "us m",
                "displayOptions": [
                    {
                        "size": "US M 13",
                        "type": "us m"
                    },
                    {
                        "size": "UK 12",
                        "type": "uk"
                    },
                    {
                        "size": "JP 31",

To reduce these JSON datasets to something smaller, see our JMESPath and JSONPath tool introductions.
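
As a rough illustration of that kind of reduction, the sketch below flattens a single variant entry into a handful of fields using plain Python dictionaries. The field names are taken from the example dataset above; the `summarize_variant` helper and the trimmed `variant` sample are hypothetical, and real entries may lack some keys, hence the defensive `.get()` calls.

```python
def summarize_variant(variant: dict) -> dict:
    """Reduce one variant entry to a few flat fields (hypothetical helper)."""
    market = variant.get("market") or {}
    bid_ask = market.get("bidAskData") or {}
    sales = market.get("salesInformation") or {}
    return {
        "id": variant.get("id"),
        "size": (variant.get("traits") or {}).get("size"),
        "lowest_ask": bid_ask.get("lowestAsk"),
        "highest_bid": bid_ask.get("highestBid"),
        "last_sale": sales.get("lastSale"),
    }


# trimmed copy of a variant entry from the example dataset above
variant = {
    "id": "6286188c-47ff-4ac5-bad6-898ea136fdeb",
    "market": {
        "bidAskData": {"lowestAsk": 164, "numberOfAsks": 10, "highestBid": 95, "numberOfBids": 6},
        "salesInformation": {"lastSale": 160, "salesLast72Hours": 0},
    },
    "traits": {"size": "13"},
}
summary = summarize_variant(variant)
print(summary)
```

A query language such as JMESPath can express the same selection more compactly, but the plain dictionary version has no extra dependencies.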

Project Setup

In this web scraping tutorial, we'll be using Python with three popular libraries:

  • httpx - HTTP client library which we'll use to retrieve StockX's web pages.
  • parsel - HTML parsing library which we'll use to find <script> elements containing hidden web data.
  • nested_lookup - Extracts nested values from a dictionary by key. Since StockX's datasets are huge and deeply nested, this library will help us find the product data quickly and reliably.

These packages can be easily installed via the pip install command:

$ pip install httpx parsel nested_lookup

Alternatively, feel free to swap httpx out for any other HTTP client package such as requests, as we'll only need basic HTTP functions, which are almost interchangeable between libraries. As for parsel, another great alternative is the beautifulsoup package.

Next, let's start by taking a look at how to scrape StockX's single product data.

Scraping StockX Product Data

To scrape single product data, we'll be using the hidden web data technique.

StockX is powered by React and Next.js technologies so we'll be looking for hidden data in the <script> elements. In particular, hidden web data is usually available in one of these two places:

<script id="__NEXT_DATA__" type="application/json">{...}</script>
<!-- or -->
<script data-name="query">window.__REACT_QUERY_STATE__ = {...};</script>

To parse this HTML for these hidden datasets we can use XPath or CSS Selectors:

import json
from parsel import Selector

def parse_hidden_data(html: str) -> dict:
    """extract nextjs cache from page"""
    selector = Selector(html)
    data = selector.css("script#__NEXT_DATA__::text").get()
    if not data:
        data = selector.css("script[data-name=query]::text").get()
        data = data.split("=", 1)[-1].strip().strip(';')
    data = json.loads(data)
    return data

Here we're building a parsel.Selector and looking up <script> elements based on CSS selectors.
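
To make the second branch of that function more concrete, here is a small standard-library sketch of the string handling applied to the `window.__REACT_QUERY_STATE__` form. The `script_text` value and the structure inside it are made-up stand-ins for what a real page would contain.

```python
import json

# hypothetical contents of the <script data-name="query"> element
script_text = 'window.__REACT_QUERY_STATE__ = {"queries": [{"state": {"data": {"product": {"urlKey": "example-product"}}}}]};'

# split on the first "=" only, so any "=" inside the JSON payload is left untouched,
# then trim whitespace and the trailing semicolon before decoding
payload = script_text.split("=", 1)[-1].strip().strip(";")
state = json.loads(payload)
print(state["queries"][0]["state"]["data"]["product"]["urlKey"])
```

The `__NEXT_DATA__` form needs none of this trimming because that script element contains bare JSON.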

Next, let's add HTTP capabilities to complete our product scraper and let's take it for a spin:

Python (httpx) version:
import asyncio
import json
import httpx

from nested_lookup import nested_lookup
from parsel import Selector

# create HTTPX client with headers that resemble a web browser
client = httpx.AsyncClient(
    http2=True,
    follow_redirects=True,
    headers={
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9",
        "Cache-Control": "no-cache",
    },
)


def parse_nextjs(html: str) -> dict:
    """extract nextjs cache from page"""
    selector = Selector(html)
    data = selector.css("script#__NEXT_DATA__::text").get()
    if not data:
        data = selector.css("script[data-name=query]::text").get()
        data = data.split("=", 1)[-1].strip().strip(";")
    data = json.loads(data)
    return data


async def scrape_product(url: str) -> dict:
    """scrape a single stockx product page for product data"""
    response = await client.get(url)
    assert response.status_code == 200
    data = parse_nextjs(response.text)
    # extract all products datasets from page cache
    products = nested_lookup("product", data)
    # find the current product dataset
    try:
        product = next(p for p in products if p.get("urlKey") in str(response.url))
    except StopIteration:
        raise ValueError("Could not find product dataset in page cache", response)
    return product


# example use:
url = "https://stockx.com/nike-air-max-90-se-running-club"
print(asyncio.run(scrape_product(url)))

Scrapfly SDK version:

import asyncio
import json

from nested_lookup import nested_lookup
from scrapfly import ScrapeApiResponse, ScrapeConfig, ScrapflyClient

scrapfly = ScrapflyClient(key="YOUR SCRAPFLY API KEY", max_concurrency=10)


def parse_nextjs(result: ScrapeApiResponse) -> dict:
    """extract nextjs cache from page"""
    data = result.selector.css("script#__NEXT_DATA__::text").get()
    if not data:
        data = result.selector.css("script[data-name=query]::text").get()
        data = data.split("=", 1)[-1].strip().strip(";")
    data = json.loads(data)
    return data


async def scrape_product(url: str) -> dict:
    """scrape a single stockx product page for product data"""
    result = await scrapfly.async_scrape(
        ScrapeConfig(
            url=url,
            country="US",
            asp=True,
        )
    )
    data = parse_nextjs(result)
    # extract all products datasets from page cache
    products = nested_lookup("product", data)
    # find the current product dataset
    try:
        product = next(p for p in products if p.get("urlKey") in result.context["url"])
    except StopIteration:
        raise ValueError("Could not find product dataset in page cache", result.context)
    return product


# example use:
url = "https://stockx.com/nike-air-max-90-se-running-club"
print(asyncio.run(scrape_product(url)))

Above, in just a few lines of code, we've scraped the entire product's dataset available on StockX's website.
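
To show what the key lookup step is doing, here is a simplified stand-in written with the standard library only. It is not the nested_lookup implementation, just a sketch of the same idea: collect every value stored under a given key at any depth, then pick the dataset whose `urlKey` appears in the page URL. The `find_key` helper and the `cache` sample are hypothetical.

```python
from typing import Any, List


def find_key(key: str, document: Any) -> List[Any]:
    """Collect every value stored under `key`, at any depth (simplified stand-in)."""
    found = []
    if isinstance(document, dict):
        for k, v in document.items():
            if k == key:
                found.append(v)
            found.extend(find_key(key, v))
    elif isinstance(document, list):
        for item in document:
            found.extend(find_key(key, item))
    return found


# made-up page cache holding two product datasets
cache = {
    "queries": [
        {"state": {"data": {"product": {"urlKey": "related-item", "title": "Related"}}}},
        {"state": {"data": {"product": {"urlKey": "nike-air-max-90-se-running-club", "title": "Main"}}}},
    ]
}
page_url = "https://stockx.com/nike-air-max-90-se-running-club"
products = find_key("product", cache)
# keep only the dataset that belongs to the page we requested
product = next(p for p in products if p.get("urlKey") in page_url)
print(product["title"])
```

Matching on `urlKey` matters because a product page cache can hold more than one product dataset, for example related items.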

Now that we can scrape a single item, let's take a look at how to find products on StockX to scrape all data or just select categories.

To discover products we have two choices: sitemaps and search.

Sitemaps are ideal for discovering all products, and they can usually be found by inspecting the /robots.txt URL. For example, StockX's robots.txt lists these:

Sitemap: https://stockx.com/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/it-it/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/de-de/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/fr-fr/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/ja-jp/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/zh-cn/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/en-gb/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/ko-kr/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/es-es/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/es-mx/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/es-us/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/zh-tw/sitemap/sitemap-index.xml
Sitemap: https://stockx.com/fr-ca/sitemap/sitemap-index.xml

So we could scrape /sitemap/sitemap-index.xml, where every product URL is located. However, if we want to narrow down our scope and scrape specific items, we can scrape StockX's search pages instead. Let's take a look at how we can do that.
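
For the sitemap route, a minimal sketch of the parsing step might look like the following. It assumes the files follow the standard sitemaps.org XML format; the `sitemap_index` sample and the file names in it are made up, and a real index would first be downloaded from one of the Sitemap URLs listed above.

```python
import xml.etree.ElementTree as ET

# namespace defined by the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# made-up sitemap index for illustration
sitemap_index = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://stockx.com/sitemap/example-1.xml</loc></sitemap>
  <sitemap><loc>https://stockx.com/sitemap/example-2.xml</loc></sitemap>
</sitemapindex>"""


def parse_locations(xml_text: str) -> list:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_text.encode())
    return [loc.text.strip() for loc in root.iterfind(".//sm:loc", NS)]


print(parse_locations(sitemap_index))
```

The same function works on the child sitemaps, since both index and URL set files store their addresses in `<loc>` elements.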

To start, we can see that StockX's search is capable of searching by product category and query:

screencapture of StockX search page

Each of these search pages can be further refined and sorted which results in a unique URL.
For this example, let's take the top-selling apparel items that match the query "indigo":

screencapture of StockX search page with filters

This takes us to the final URL: stockx.com/search/apparel?s=indigo
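
Such URLs can also be assembled in code. The sketch below follows the pattern of the search URLs used in this guide (an optional category in the path, the query in the `s` parameter, and the `page` parameter that the search scraper appends). The `build_search_url` helper is hypothetical, and the pattern is inferred from these examples rather than from any documented API.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(query: str, category: Optional[str] = None, page: int = 1) -> str:
    """Build a search URL following the pattern of this guide's examples (hypothetical helper)."""
    path = "/search" + (f"/{category}" if category else "")
    params = {"s": query}
    if page > 1:
        params["page"] = page
    # urlencode takes care of escaping spaces and special characters in the query
    return f"https://stockx.com{path}?{urlencode(params)}"


print(build_search_url("indigo", category="apparel"))
print(build_search_url("nike", page=2))
```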

To scrape this, we'll be using the same hidden web data approach as before. The hidden data is located in the same place, so we can reuse our parse_nextjs() function, though this time it only contains product preview data rather than the whole datasets.

Python (httpx) version:
import asyncio
import json
import math
from typing import Dict, List
import httpx

from nested_lookup import nested_lookup
from parsel import Selector

# create HTTPX client with headers that resemble a web browser
client = httpx.AsyncClient(
    http2=True,
    follow_redirects=True,
    limits=httpx.Limits(max_connections=3),  # keep this low to avoid being blocked
    headers={
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9",
        "Cache-Control": "no-cache",
    },
)

# From previous chapter:
def parse_nextjs(html: str) -> dict:
    """extract nextjs cache from page"""
    selector = Selector(html)
    data = selector.css("script#__NEXT_DATA__::text").get()
    if not data:
        data = selector.css("script[data-name=query]::text").get()
        data = data.split("=", 1)[-1].strip().strip(";")
    data = json.loads(data)
    return data


async def scrape_search(url: str, max_pages: int = 25) -> List[Dict]:
    """Scrape StockX search"""
    print(f"scraping first search page: {url}")
    first_page = await client.get(url)
    assert first_page.status_code == 200, "scrape was blocked"  # this should be retried, handled etc.

    # parse first page for product search data and total amount of pages:
    data = parse_nextjs(first_page.text)
    _first_page_results = nested_lookup("results", data)[0]
    _paging_info = _first_page_results["pageInfo"]
    total_pages = _paging_info.get("pageCount") or math.ceil(_paging_info["total"] / _paging_info["limit"])  # note: pageCount can be missing but we can calculate it ourselves
    if max_pages < total_pages:
        total_pages = max_pages

    product_previews = [edge["node"] for edge in _first_page_results["edges"]]

    # then scrape other pages concurrently:
    print(f"  scraping remaining {total_pages - 1} search pages")
    _other_pages = [  # create GET task for each page url
        asyncio.create_task(client.get(f"{first_page.url}&page={page}"))
        for page in range(2, total_pages + 1)
    ]
    for response in asyncio.as_completed(_other_pages):  # run all tasks concurrently
        response = await response
        data = parse_nextjs(response.text)
        _page_results = nested_lookup("results", data)[0]
        product_previews.extend([edge["node"] for edge in _page_results["edges"]])
    return product_previews


# example run
result = asyncio.run(scrape_search("https://stockx.com/search?s=nike", max_pages=2))
print(json.dumps(result, indent=2))

Scrapfly SDK version:

import asyncio
import json
import math
from typing import Dict, List

from nested_lookup import nested_lookup
from scrapfly import ScrapeApiResponse, ScrapeConfig, ScrapflyClient

scrapfly = ScrapflyClient(key="YOUR SCRAPFLY KEY", max_concurrency=10)


def parse_nextjs(result: ScrapeApiResponse) -> dict:
    """extract nextjs cache from page"""
    data = result.selector.css("script#__NEXT_DATA__::text").get()
    if not data:
        data = result.selector.css("script[data-name=query]::text").get()
        data = data.split("=", 1)[-1].strip().strip(";")
    data = json.loads(data)
    return data


async def scrape_search(url: str, max_pages: int = 25) -> List[Dict]:
    """Scrape StockX search"""
    print(f"scraping first search page: {url}")
    first_page = await scrapfly.async_scrape(
        ScrapeConfig(
            url=url,
            country="US",
            asp=True,
        )
    )
    # parse first page for product search data and total amount of pages:
    data = parse_nextjs(first_page)
    _first_page_results = nested_lookup("results", data)[0]
    _paging_info = _first_page_results["pageInfo"]
    total_pages = _paging_info.get("pageCount") or math.ceil(_paging_info["total"] / _paging_info["limit"])  # pageCount can be missing
    if max_pages < total_pages:
        total_pages = max_pages

    product_previews = [edge["node"] for edge in _first_page_results["edges"]]

    # then scrape other pages concurrently:
    print(f"  scraping remaining {total_pages - 1} search pages")
    _other_pages = [
        ScrapeConfig(
            url=f"{first_page.context['url']}&page={page}",
            country="US",
            asp=True,
        )
        for page in range(2, total_pages + 1)
    ]
    async for result in scrapfly.concurrent_scrape(_other_pages):
        data = parse_nextjs(result)
        _page_results = nested_lookup("results", data)[0]
        product_previews.extend([edge["node"] for edge in _page_results["edges"]])
    return product_previews


# example run
result = asyncio.run(scrape_search("https://stockx.com/search?s=nike"))
print(json.dumps(result, indent=2))
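
The page count logic in both scrapers can be isolated into a small function, which makes the fallback easier to see. This is a sketch using the same `pageCount`, `total` and `limit` fields as the code above; the `count_pages` helper and the sample values are made up for illustration.

```python
import math


def count_pages(page_info: dict, max_pages: int = 25) -> int:
    """Work out how many search pages to scrape from a pageInfo dictionary."""
    # fall back to total / limit when pageCount is absent or zero
    pages = page_info.get("pageCount") or math.ceil(page_info["total"] / page_info["limit"])
    return min(pages, max_pages)


# made-up pageInfo values
print(count_pages({"total": 95, "limit": 40}))                     # 3
print(count_pages({"pageCount": 30, "total": 1200, "limit": 40}))  # 25
```

With 95 results at 40 per page the last page is only partly filled, which is why the division is rounded up rather than down.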

While this product preview data offers a lot of detail, we might want to scrape the entire product dataset using the product scraper we wrote in the previous chapter. The urlKey field of each preview points to the full product page.
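
A possible way to connect the two scrapers is sketched below. It assumes that each preview carries a top-level `urlKey` and that appending it to `https://stockx.com/` gives the product page address, which matches the example product URL used earlier but is not confirmed for every listing. The `product_urls` helper and the `previews` sample are hypothetical.

```python
from typing import Dict, List


def product_urls(previews: List[Dict], base: str = "https://stockx.com/") -> List[str]:
    """Turn search previews into product page URLs, skipping entries without a urlKey."""
    urls = []
    for preview in previews:
        url_key = preview.get("urlKey")
        if url_key and base + url_key not in urls:  # also drop duplicates
            urls.append(base + url_key)
    return urls


# made-up previews in the shape returned by scrape_search()
previews = [
    {"urlKey": "nike-air-max-90-se-running-club"},
    {"title": "entry without a key"},
    {"urlKey": "nike-air-max-90-se-running-club"},
]
print(product_urls(previews))
```

Each resulting URL can then be passed to `scrape_product()` to retrieve the full dataset.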

Bypass StockX Blocking with Scrapfly

StockX is a popular website and it's not uncommon for them to block scraping attempts. To scale up our scraper and bypass blocking we can use Scrapfly's web scraping API which fortifies scrapers against blocking and much more!
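
As the comment in the search scraper notes, a blocked request should at least be retried. A minimal sketch of that idea, using only asyncio, is shown below. The `retry` helper and the fake response objects are hypothetical; with httpx the `fetch` argument could be something like `lambda: client.get(url)`. Retrying alone will not get past persistent blocking.

```python
import asyncio
from typing import Awaitable, Callable


async def retry(fetch: Callable[[], Awaitable], attempts: int = 3, delay: float = 1.0):
    """Call fetch() until it returns a 200 response or the attempts run out."""
    for attempt in range(1, attempts + 1):
        response = await fetch()
        if response.status_code == 200:
            return response
        if attempt < attempts:
            # wait longer after each failure: delay, 2 * delay, 4 * delay, ...
            await asyncio.sleep(delay * 2 ** (attempt - 1))
    raise RuntimeError(f"still blocked after {attempts} attempts")


class FakeResponse:
    """Stand-in for an HTTP response, used only for this demonstration."""

    def __init__(self, status_code: int):
        self.status_code = status_code


statuses = iter([403, 403, 200])  # two blocked responses, then a success


async def fake_fetch() -> FakeResponse:
    return FakeResponse(next(statuses))


response = asyncio.run(retry(fake_fetch, delay=0.01))
print(response.status_code)
```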

scrapfly middleware
Scrapfly service does the heavy lifting for you!

Scrapfly API is a middleware service that sits between your scraper and the target website. It handles all the heavy lifting of scraping and proxies so you can focus on building your scraper. Scrapfly also offers other convenient features such as anti-scraping protection bypass, proxy country selection, cloud headless browsers and screenshot capture.

Using the Python SDK, we can easily integrate Scrapfly into our Python scrapers:

from scrapfly import ScrapflyClient, ScrapeConfig
scrapfly = ScrapflyClient(key="YOUR SCRAPFLY KEY")

result = scrapfly.scrape(ScrapeConfig(
    "https://stockx.com/search/apparel/top-selling?s=indigo",
    # anti scraping protection bypass
    asp=True, 
    # proxy country selection
    country="US",
    # we can enable features like:
    # cloud headless browser use
    render_js=True,  
    # screenshot taking
    screenshots={"all": "fullpage"},
))

# full result data
print(result.content)  # html body
print(result.selector.css("h1"))  # CSS selector and XPath parser built-in

For more, see the complete StockX scraper code using Scrapfly in our GitHub repository:

Latest Stockx.com Scraper Code
https://github.com/scrapfly/scrapfly-scrapers/

FAQ

To wrap up this guide, let's take a look at some frequently asked questions about scraping StockX:

Is it legal to scrape StockX.com?

Yes, it is legal to scrape StockX.com. StockX e-commerce data is publicly available, and as long as the scraper doesn't damage the website it's perfectly legal to scrape.

Can StockX.com be crawled?

Yes, StockX.com can be crawled. Crawling is an alternative web scraping approach where the scraper is capable of discovering pages on its own. StockX offers sitemaps and recommended product areas that can be used to develop crawling logic. For more see our Crawling With Python introduction.

StockX Scraping Summary

In this guide, we've learned how to scrape StockX.com using Python and a few community packages.
For this, we've used the hidden web data scraping technique where, instead of traditional HTML parsing, we retrieved the product HTML pages and extracted the JavaScript cache data.

With just a few lines of Python code, we've extracted the entire product dataset from StockX.com.

We've also taken a look at how to discover StockX product pages using sitemaps or search pages.

To scale up our scraper we've also taken a look at Scrapfly API which fortifies scrapers against blocking and much more - try it out for free!
