How to Scrape Ebay using Python

In this web scraping tutorial, we'll take a look at how to scrape Ebay - the biggest peer-to-peer e-commerce marketplace in the world.

We'll be scraping product details like pricing, variant information, features and descriptions.

To scrape Ebay data with Python we'll be using a few popular community packages and some clever parsing techniques.

We'll also take a look at how to scrape Ebay's search system to discover new item listings, so we can be the first to know when a new deal appears.

Why Scrape Ebay?

Ebay is one of the biggest product marketplaces in the world, especially for more niche and rare items. This makes Ebay a great target for e-commerce data analytics.

Scraping Ebay data (like seller reviews) can also empower Ebay sellers by enabling easy market and competitor analysis.

Available Ebay Data Fields

In this Ebay web scraping tutorial we'll be scraping common product data like pricing, stock, features and performance metadata. For more, see this example output:

Example Product Dataset
{
  "url": "https://www.ebay.com/itm/393531906094",
  "id": "393531906094",
  "price": "C $579.00",
  "price_converted": "US $427.32",
  "name": "Apple iPhone 11 Pro Max - Unlocked - 64GB / 256GB / 512GB - CA - Grade A",
  "seller_name": "device_care",
  "seller_url": "https://www.ebay.com/str/devicecare",
  "photos": [
    "https://i.ebayimg.com/images/g/93cAAOSwvEJgbLW8/s-l64.jpg",
    "https://i.ebayimg.com/images/g/l0sAAOSwextgbLYj/s-l64.jpg",
    "https://i.ebayimg.com/images/g/qxEAAOSwP~BgbLa5/s-l64.jpg",
    "https://i.ebayimg.com/images/g/7usAAOSwRbZgbLbE/s-l64.jpg",
    "https://i.ebayimg.com/images/g/ffMAAOSwhAxgbLbO/s-l64.jpg",
    "https://i.ebayimg.com/images/g/93cAAOSwvEJgbLW8/s-l500.jpg"
  ],
  "description_url": "https://vi.vipr.ebaydesc.com/ws/eBayISAPI.dll?ViewItemDescV4&item=393531906094&t=1631237959000&category=9355&seller=device_care&excSoj=1&excTrk=1&lsite=2&ittenable=true&domain=ebay.com&descgauge=1&cspheader=1&oneClk=2&secureDesc=1",
  "features": {
    "Condition": "Excellent - Refurbished: The item is in like-new condition, backed by a one year warranty. It has ... Read moreExcellent - Refurbished: The item is in like-new condition, backed by a one year warranty. It has been professionally refurbished, inspected and cleaned to excellent condition by qualified sellers. The item includes original or new accessories and will come in new generic packaging. See the seller's listing for full details. See all condition definitions",
    "Camera Resolution": "12.0 MP",
    "Operating System": "iOS",
    "Contract": "Without Contract",
    "Connectivity": "5G, Bluetooth, GPS, Lightning",
    "Features": "4K Video Recording, Accelerometer, Bluetooth Enabled, Camera, Facial Recognition",
    "Model Number": "A2161 (CDMA + GSM)",
    "RAM": "4 GB",
    "Lock Status": "Factory Unlocked",
    "Network": "1&1, Unlocked",
    "SIM Card Slot": "Dual SIM (SIM + eSIM)",
    "Brand": "Apple",
    "Processor": "Hexa Core",
    "Screen Size": "6.5 in"
  },
  "variants": {
    "Apple iPhone 11 Pro Max 512 GB Midnight Green": {
      "id": "662315637180",
      "price": "C $779.00",
      "price_converted": "US $574.93",
      "vat_price": null,
      "quantity": 5,
      "in_stock": false,
      "sold": 5,
      "available": 0,
      "watch_count": 27,
      "epid": "9034209121",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "512 GB",
        "Color": "Midnight Green"
      }
    },
    "Apple iPhone 11 Pro Max 512 GB Gold": {
      "id": "662315637181",
      "price": "C $779.00",
      "price_converted": "US $574.93",
      "vat_price": null,
      "quantity": 5,
      "in_stock": false,
      "sold": 5,
      "available": 0,
      "watch_count": 10,
      "epid": "9034209182",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "512 GB",
        "Color": "Gold"
      }
    },
    "Apple iPhone 11 Pro Max 512 GB Space Gray": {
      "id": "662315637182",
      "price": "C $779.00",
      "price_converted": "US $574.93",
      "vat_price": null,
      "quantity": 9,
      "in_stock": true,
      "sold": 4,
      "available": 5,
      "watch_count": 29,
      "epid": "19034211488",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "512 GB",
        "Color": "Space Gray"
      }
    },
    "Apple iPhone 11 Pro Max 256 GB    Midnight Green": {
      "id": "662315637176",
      "price": "C $639.00",
      "price_converted": "US $471.60",
      "vat_price": null,
      "quantity": 134,
      "in_stock": false,
      "sold": 134,
      "available": 0,
      "watch_count": 165,
      "epid": "11037566785",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "256 GB   ",
        "Color": "Midnight Green"
      }
    },
    "Apple iPhone 11 Pro Max 256 GB    Gold": {
      "id": "662315637177",
      "price": "C $639.00",
      "price_converted": "US $471.60",
      "vat_price": null,
      "quantity": 77,
      "in_stock": false,
      "sold": 77,
      "available": 0,
      "watch_count": 104,
      "epid": "27041453299",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "256 GB   ",
        "Color": "Gold"
      }
    },
    "Apple iPhone 11 Pro Max 256 GB    Space Gray": {
      "id": "662315637178",
      "price": "C $639.00",
      "price_converted": "US $471.60",
      "vat_price": null,
      "quantity": 161,
      "in_stock": false,
      "sold": 161,
      "available": 0,
      "watch_count": 169,
      "epid": "10057225571",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "256 GB   ",
        "Color": "Space Gray"
      }
    },
    "Apple iPhone 11 Pro Max 512 GB Silver": {
      "id": "662315637179",
      "price": "C $779.00",
      "price_converted": "US $574.93",
      "vat_price": null,
      "quantity": 4,
      "in_stock": true,
      "sold": 3,
      "available": 1,
      "watch_count": 10,
      "epid": "9034209212",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "512 GB",
        "Color": "Silver"
      }
    },
    "Apple iPhone 11 Pro Max 64 GB Midnight Green": {
      "id": "662315637172",
      "price": "C $579.00",
      "price_converted": "US $427.32",
      "vat_price": null,
      "quantity": 236,
      "in_stock": true,
      "sold": 199,
      "available": 37,
      "watch_count": 183,
      "epid": "19042851646",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "64 GB",
        "Color": "Midnight Green"
      }
    },
    "Apple iPhone 11 Pro Max 64 GB Gold": {
      "id": "662315637173",
      "price": "C $579.00",
      "price_converted": "US $427.32",
      "vat_price": null,
      "quantity": 257,
      "in_stock": true,
      "sold": 211,
      "available": 46,
      "watch_count": 161,
      "epid": "21042400312",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "64 GB",
        "Color": "Gold"
      }
    },
    "Apple iPhone 11 Pro Max 64 GB Space Gray": {
      "id": "662315637174",
      "price": "C $579.00",
      "price_converted": "US $427.32",
      "vat_price": null,
      "quantity": 279,
      "in_stock": true,
      "sold": 221,
      "available": 58,
      "watch_count": 226,
      "epid": "7034220649",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "64 GB",
        "Color": "Space Gray"
      }
    },
    "Apple iPhone 11 Pro Max 256 GB    Silver": {
      "id": "662315637175",
      "price": "C $639.00",
      "price_converted": "US $471.60",
      "vat_price": null,
      "quantity": 19,
      "in_stock": false,
      "sold": 19,
      "available": 0,
      "watch_count": 35,
      "epid": "23034220736",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "256 GB   ",
        "Color": "Silver"
      }
    },
    "Apple iPhone 11 Pro Max 64 GB Silver": {
      "id": "662315637171",
      "price": "C $579.00",
      "price_converted": "US $427.32",
      "vat_price": null,
      "quantity": 72,
      "in_stock": true,
      "sold": 53,
      "available": 19,
      "watch_count": 55,
      "epid": "9034209203",
      "top_product": false,
      "traits": {
        "Model": "Apple iPhone 11 Pro Max",
        "Storage Capacity": "64 GB",
        "Color": "Silver"
      }
    }
  }
}

We'll also be scraping Ebay's search, which provides product preview datasets. See this example output:

Example Search Dataset
[
  {
    "url": "https://www.ebay.com/itm/394406593931",
    "title": "iPhone 11 Pro Max 256GB Space Gray (Unlocked) CRACKED FRONT BACK CLEAN ESN",
    "price": "$289.94",
    "shipping": "Free shipping",
    "list_date": "Jan-3 20:32",
    "subtitles": [
      "Apple iPhone 11 Pro Max",
      "256 GB",
      "Unlocked"
    ],
    "condition": "Parts Only",
    "photo": "https://i.ebayimg.com/thumbs/images/g/74AAAOSwRahjtQBg/s-l225.webp",
    "rating": "4.5 out of 5 stars.",
    "rating_count": "80 product ratings"
  },
...
]

Ebay's pages contain a lot of data. In our example Ebay web scraper we'll stick to the most important data fields, but the techniques covered in this guide can be applied to scrape any other part of Ebay.

Setup

In this tutorial, we'll be using Python with two important community libraries:

  • httpx - HTTP client library which will let us communicate with ebay.com's servers and retrieve raw page data.
  • parsel - HTML parsing library which will help us parse our scraped HTML data using CSS selectors or XPath.

These packages can be easily installed via the pip install command:

$ pip install httpx parsel

Alternatively, feel free to swap httpx out for any other HTTP client package such as requests, as we'll only need basic HTTP functions which are almost interchangeable between libraries. As for parsel, another great alternative is the beautifulsoup package.
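
For example, here's a minimal sketch of the same fetch-and-parse flow using requests and beautifulsoup4 instead - purely illustrative, as the rest of this tutorial sticks with httpx and parsel:

import requests
from bs4 import BeautifulSoup

# the same basic flow with the alternative packages (pip install requests beautifulsoup4)
response = requests.get("https://www.ebay.com/itm/332562282948")
soup = BeautifulSoup(response.text, "html.parser")
# BeautifulSoup supports CSS selectors too, via .select() and .select_one()
title = soup.select_one("h1 span")
print(title.get_text(strip=True) if title else "no title found")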

Scraping Ebay Listings

Let's start with parsing a single Ebay listing page.
For this, we'll be using httpx to retrieve the product's HTML page and parsel to parse it using CSS selectors.

To start, we can separate the Ebay listings into two types:

  • listings with multiple variants - like tech devices, clothes, shoes. Things that have multiple options.
  • listings with a single variant - usually simple products that have no options. Like toys or second hand items.

Let's begin with single variant listings as they are much simpler. For this example, let's use this product: ebay.com/itm/332562282948

markup for fields that'll be scraped from Ebay.com
We'll capture the most important fields: pricing, description and product and seller details

In the image above, we marked our fields. To build CSS selectors for these fields, we can use the browser's developer tools (F12 key or right click -> Inspect).
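
Once we have a candidate selector from the developer tools, it's worth testing it quickly in a Python shell before wiring it into the scraper. A minimal check, using the price selector we'll rely on below, might look like this:

import httpx
from parsel import Selector

# fetch the example listing and test a selector found through the browser's dev tools
response = httpx.get("https://www.ebay.com/itm/332562282948")
sel = Selector(response.text)
print(sel.css('span[itemprop="price"] .ux-textspans ::text').get())
# should print something like "US $13.94"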

With this, our single listing Ebay scraper in Python would look something like this:

from parsel import Selector
import httpx

def parse_item(sel: Selector):
    # parsing shortcuts to avoid repetition:
    css_join = lambda css: "".join(sel.css(css).getall()).strip()  # join all selected elements
    css = lambda css: sel.css(css).get("").strip()  # take first selected element and strip leading/trailing spaces

    item = {}
    item["url"] = css('link[rel="canonical"]::attr(href)')
    item["id"] = item["url"].split("/itm/")[1].split("?")[0]  # we can take ID from the URL
    item["price"] = css('span[itemprop="price"] .ux-textspans ::text')
    item["price_converted"] = css("span.x-price-approx__price ::text")  # ebay automatically converts price for some regions

    item["name"] = css_join("h1 span::text")
    item["seller_name"] = css_join("div[data-testid=str-title] a ::text")
    item["seller_url"] = css("div[data-testid=str-title] a::attr(href)").split("?")[0]
    item["photos"] = sel.css('.ux-image-filmstrip-carousel-item.image img::attr("src")').getall()  # carousel images
    item["photos"].extend(sel.css('.ux-image-carousel-item.image img::attr("src")').getall())  # main image
    # description is an iframe (independent page). We can keep it as a URL or scrape it later.
    item["description_url"] = css("div.d-item-description iframe::attr(src)")
    if not item["description_url"]:  # description can be in 2 locations - check both
        item["description_url"] = css("div#desc_div iframe::attr(src)")
    # feature details from the description table:
    feature_table = sel.css("div.ux-layout-section__item--table-view")
    features = {}
    for ft_label in feature_table.css(".ux-labels-values__labels"):
        # iterate through each label of the table and select first sibling for value:
        label = "".join(ft_label.css(".ux-textspans::text").getall()).strip(":\n ")
        ft_value = ft_label.xpath("following-sibling::div[1]")
        value = "".join(ft_value.css(".ux-textspans::text").getall()).strip()
        features[label] = value
    item["features"] = features
    return item

response = httpx.get("https://www.ebay.com/itm/332562282948")
selector = Selector(response.text)
item = parse_item(selector)
print(item)
Example Output
{
  "url": "https://www.ebay.com/itm/332562282948",
  "id": "332562282948",
  "price": "US $13.94",
  "price_converted": "",
  "name": "Sanei Kirby 5.5\" Plush Stuffed Doll (KP01) - Kirby Adventure All Star Collection",
  "seller_name": "ToysCollections",
  "seller_url": "https://www.ebay.com/str/huskylover228",
  "photos": [
    "https://i.ebayimg.com/images/g/ITEAAOSw9p9ajK16/s-l500.jpg"
  ],
  "description_url": "https://vi.vipr.ebaydesc.com/ws/eBayISAPI.dll?ViewItemDescV4&item=332562282948&t=1653362457000&category=69528&seller=the_northeshop&excSoj=1&excTrk=1&lsite=0&ittenable=true&domain=ebay.com&descgauge=1&cspheader=1&oneClk=2&secureDesc=1",
  "features": {
    "Condition": "New: A brand-new, unused, unopened, undamaged item (including handmade items). See the seller's ... Read moreNew: A brand-new, unused, unopened, undamaged item (including handmade items). See the seller's listing for full details. See all condition definitions",
    "Brand": "unbranded",
    "Type": "Plush",
    "UPC": "4905330122810",
    "Featured Refinements": "Kirby Plush",
    "Recommended Age Range": "4+",
    "Gender": "Boys & Girls",
    "Character Family": "Kirby Adventure"
  }
}

In the example above, we did some basic CSS selector-based HTML parsing to extract item details such as the price, name, features and photos.
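
Note that we only captured the description as a URL, since it lives in a separate iframe page. If we want the description text itself, we can fetch that URL in a follow-up request - a minimal sketch, assuming the description_url field extracted above is present:

import httpx
from parsel import Selector

# follow the description iframe URL we stored in item["description_url"]
if item["description_url"]:
    description_page = httpx.get(item["description_url"])
    description_sel = Selector(description_page.text)
    # join all visible text nodes of the description document
    item["description"] = " ".join(description_sel.css("body ::text").getall()).strip()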

Next, for products with variants we'll have to go a bit further and extract hidden web data - let's take a look at how to do that.

Ebay Variant Data

Ebay's listings can contain multiple products through a feature called variants.

For example, let's take this iPhone listing: ebay.com/itm/393531906094

markup for ebay.com variant options
Listings with variants have multiple selection options

We can see several variant options: Model, Storage Capacity and Color. The price is updated every time we choose a different option. How can we scrape this?

Ebay uses javascript to update the page with a different price every time we choose a different option. That means the price data is available somewhere in a javascript variable, and all we have to do is extract this variable to scrape the variant dataset.

We're not going to go in-depth on how to capture javascript variable data here - for that, refer to our article:

How to Scrape Hidden Web Data

So, to capture variants we'll extract the hidden variant data and pair it together with our HTML scraper from before:

import json
from collections import defaultdict

import httpx
from parsel import Selector


def find_json_objects(text: str, decoder=json.JSONDecoder()):
    """Find JSON objects in text, and generate decoded JSON data"""
    pos = 0
    while True:
        match = text.find("{", pos)
        if match == -1:
            break
        try:
            result, index = decoder.raw_decode(text[match:])
            yield result
            pos = match + index
        except ValueError:
            pos = match + 1


def parse_variants(sel: Selector) -> dict:
    # find script that contains itemVariationsMaps variable:
    script = sel.xpath('//script[contains(., "itemVariationsMap")]/text()').get()
    if not script:
        return {}

    # find all JSON objects in the script text
    all_data = list(find_json_objects(script))
    # find the JSON object that contains the itemVariationsMap variable:
    variants = next(d for d in all_data if "itemVariationsMap" in str(d))["itemVariationsMap"]

    # extract option values for mapping variant trait ids to human labels
    selections = defaultdict(dict)
    for selection in sel.css(".x-msku__box-cont select"):
        name = selection.xpath("@selectboxlabel").get()
        for option in selection.xpath("option"):
            value = int(option.xpath("@value").get())
            if value == -1:  # that's the placeholder
                continue
            label = option.xpath("text()").get().strip()
            label = label.split("(Out ")[0]
            selections[name][value] = label

    # map variant trait ids to human labels
    for variant_id, variant in variants.items():
        for trait, trait_id in variant["traitValuesMap"].items():
            variant["traitValuesMap"][trait] = selections[trait][trait_id]

    # parse variants to something more usable
    parsed_variants = {}
    for variant_id, variant in variants.items():
        label = " ".join(variant["traitValuesMap"].values())
        parsed_variants[label] = {
            "id": variant_id,
            "price": variant["price"],
            "price_converted": variant["convertedPrice"],
            "vat_price": variant["vatPrice"],
            "quantity": variant["quantity"],
            "in_stock": variant["inStock"],
            "sold": variant["quantitySold"],
            "available": variant["quantityAvailable"],
            "watch_count": variant["watchCount"],
            "epid": variant["epid"],
            "top_product": variant["topProduct"],
            "traits": variant["traitValuesMap"],
        }
    return parsed_variants
Run Code & Example Output
response = httpx.get("https://www.ebay.com/itm/393531906094")
selector = Selector(response.text)
item = parse_item(selector)
item['variants'] = parse_variants(selector)
print(item)

For an example output dataset, see the Available Data Fields section.

In this example Ebay scraper, we used the hidden web data scraping technique to extract the javascript variable containing the listing's variant data.
We then extended it with option names and cleaned it up to be more presentable using basic Python data types.

Next, let's take a look at how we can find listings on Ebay using the search system.

Scraping Ebay Search

To start scraping Ebay's search, let's first take a look at how it works.

When we input a search keyword, we can see that Ebay redirects us to a different URL where the search results are located. For example, if we search for the term iphone we'll be taken to a URL similar to ebay.com/sch/i.html?_nkw=iphone&_sacat=0.

This page is using several URL parameters to define the search query:

  • _nkw is the search keyword.
  • _sacat is the category restriction.
  • _sop is the sorting type.
  • _pgn is the page number.
  • _ipg is the number of listings per page (default is 60).

We can find more arguments by clicking around and exploring the search, though for this example let's stick with these 5 parameters, illustrated below.
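
As a quick illustration, here's how these parameters translate to a search URL using Python's urlencode (the keyword and values here are just an example):

from urllib.parse import urlencode

params = {
    "_nkw": "iphone",  # search keyword
    "_sacat": 0,       # category (0 means all categories)
    "_sop": 10,        # sorting (10 is "newly listed")
    "_pgn": 1,         # page number
    "_ipg": 240,       # listings per page
}
print("https://www.ebay.com/sch/i.html?" + urlencode(params))
# https://www.ebay.com/sch/i.html?_nkw=iphone&_sacat=0&_sop=10&_pgn=1&_ipg=240

With the URL format covered, here's the full search scraper: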

import asyncio
import math
import httpx
from typing import TypedDict, List, Literal
from urllib.parse import urlencode

from parsel import Selector


session = httpx.AsyncClient(follow_redirects=True)


class ProductPreviewResult(TypedDict):
    """type hint for search scrape results for product preview data"""

    url: str  # url to full product page
    title: str
    price: str
    shipping: str
    list_date: str
    subtitles: List[str]
    condition: str
    photo: str  # image url
    rating: str
    rating_count: str


def parse_search(sel: Selector) -> List[ProductPreviewResult]:
    """parse ebay's search page for listing preview details"""
    previews = []
    # each listing has its own HTML box where all of the data is contained
    listing_boxes = sel.css(".srp-results li.s-item")
    for box in listing_boxes:
        # quick helpers to extract first element and all elements
        css = lambda css: box.css(css).get("").strip()
        css_all = lambda css: box.css(css).getall()
        previews.append(
            {
                "url": css("a.s-item__link::attr(href)").split("?")[0],
                "title": css(".s-item__title>span::text"),
                "price": css(".s-item__price::text"),
                "shipping": css(".s-item__shipping::text"),
                "list_date": css(".s-item__listingDate span::text"),
                "subtitles": css_all(".s-item__subtitle::text"),
                "condition": css(".s-item__subtitle .SECONDARY_INFO::text"),
                "photo": css("img.s-item__image-img::attr(src)"),
                "rating": css(".s-item__reviews .clipped::text"),
                "rating_count": css(".s-item__reviews-count span::text"),
            }
        )
    return previews


SORTING_MAP = {
    "best_match": 12,
    "ending_soonest": 1,
    "newly_listed": 10,
}


async def scrape_search(
    query,
    max_pages=1,
    category=0,
    items_per_page=240,
    sort: Literal["best_match", "ending_soonest", "newly_listed"] = "newly_listed",
) -> List[ProductPreviewResult]:
    """Scrape Ebay's search for product preview data for given"""

    def make_request(page):
        return "https://www.ebay.com/sch/i.html?" + urlencode(
            {
                "_nkw": query,
                "_sacat": category,
                "_ipg": items_per_page,
                "_sop": SORTING_MAP[sort],
                "_pgn": page,
            }
        )

    first_page = await session.get(make_request(page=1))
    first_page_sel = Selector(first_page.text)
    results = parse_search(first_page_sel)
    if max_pages == 1:
        return results
    # find total amount of results for concurrent pagination
    total_results = first_page_sel.css(".srp-controls__count-heading>span::text").get()
    total_results = int(total_results.replace(",", ""))
    total_pages = math.ceil(total_results / items_per_page)
    if total_pages > max_pages:
        total_pages = max_pages
    other_pages = [session.get(make_request(page=i)) for i in range(2, total_pages + 1)]
    for response in asyncio.as_completed(other_pages):
        response = await response
        try:
            results.extend(parse_search(Selector(response.text)))
        except Exception:
            print(f"failed to scrape search page {response.url}")
    return results
Run Code & Example Output
session = httpx.AsyncClient(follow_redirects=True)


async def run():
    search_results = await scrape_search("iphone", items_per_page=60, max_pages=2)
    print(search_results)


if __name__ == "__main__":
    asyncio.run(run())

Which will result in a dataset similar to:

[
    {
        "url": "https://www.ebay.com/itm/354493525522",
        "title": "Apple iPhone 11 - 128GB - Black (Unlocked) A2111 (CDMA + GSM)",
        "price": "$1,200.99",
        "shipping": "+$25.00 shipping",
        "list_date": "Jan-3 04:32",
        "subtitles": [
            "Apple iPhone 11",
            "128 GB",
            "Unlocked"
        ],
        "condition": "Pre-Owned",
        "photo": "https://i.ebayimg.com/thumbs/images/g/m5QAAOSwrsxjtB~R/s-l225.webp",
        "rating": "4.5 out of 5 stars.",
        "rating_count": "68 product ratings"
    },
    ...  # truncated for the blog
]

In the example above, we wrote a small scraper for Ebay's search. We built a search URL using Python's urlencode function to turn dictionary parameters into URL parameters.

Then, we parsed the scraped data using CSS selectors: we selected all of the listing box containers and iterated through them to safely extract each listing's details.

We could further use our listing scraper from the previous section to extract full listing details if we'd like to expand this search dataset.
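
For instance, here's a rough sketch of feeding search preview URLs into the listing scraper - it assumes the parse_item() and parse_variants() functions from the previous sections are in scope:

import asyncio
import httpx
from parsel import Selector

async def scrape_full_listings(urls):
    """scrape full product details for a list of listing URLs found through search"""
    async with httpx.AsyncClient(follow_redirects=True) as client:
        responses = await asyncio.gather(*[client.get(url) for url in urls])
    results = []
    for response in responses:
        sel = Selector(response.text)
        item = parse_item(sel)  # listing parser from the "Scraping Ebay Listings" section
        item["variants"] = parse_variants(sel)  # variant parser from the "Ebay Variant Data" section
        results.append(item)
    return results

# usage sketch:
# search_results = await scrape_search("iphone", max_pages=1)
# full_items = await scrape_full_listings([preview["url"] for preview in search_results])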

There are a lot of listings on Ebay, and when we scale our scraper up to thousands of listings we might start encountering blocking. Next, let's take a look at how to avoid being blocked by Ebay.

Avoiding Ebay Blocking

Web scraping Ebay is not too difficult; however, when scaling our scraper beyond a few listing scrapes, we might start to run into captchas and scraper blocking.

To scale up our Ebay scraper, let's take advantage of the ScrapFly API, which offers several powerful features that can help us scale our web scrapers and avoid Ebay's blocking.

For this, we'll be using the scrapfly-sdk python package and the Anti Scraping Protection Bypass feature. To start, let's install scrapfly-sdk using pip:

$ pip install scrapfly-sdk

To take advantage of ScrapFly's API in our Ebay web scraper, all we need to do is replace our httpx session code with scrapfly-sdk client requests:

import httpx

response = httpx.get("some ebay.com url")
# in ScrapFly SDK becomes
from scrapfly import ScrapflyClient, ScrapeConfig
client = ScrapflyClient("YOUR SCRAPFLY KEY")
result = client.scrape(ScrapeConfig(
    # some ebay listing URL
    "https://www.ebay.com/itm/393531906094",
    # we can select specific proxy country
    country="US",
    # and enable anti scraping protection bypass:
    asp=True,
))

For more on how to scrape Ebay.com using ScrapFly, see the Full Scraper Code section.

FAQ

To wrap this guide up, let's take a look at some frequently asked questions about scraping data from Ebay:

Is it legal to scrape Ebay?

Yes. Ebay's data is publicly available, and scraping Ebay at slow, respectful rates falls under the ethical scraping definition.
That being said, be aware of GDPR compliance in the EU when storing personal data such as sellers' details like names or locations. For more, see our Is Web Scraping Legal? article.

How to crawl Ebay.com?

To web crawl Ebay we can adapt the scraping techniques covered in this article. Every Ebay listing contains related products which we can extract and feed into our scraping loop, turning our scraper into a crawler capable of discovering new listings to scrape.
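
As a rough sketch, such a crawl loop could look like the snippet below. Note the CSS selector for related listing links is an assumption for illustration and may need adjusting against the live page:

import httpx
from parsel import Selector

def find_related_listings(sel: Selector) -> list:
    """collect related /itm/ listing URLs from a listing page (selector is a rough assumption)"""
    urls = sel.css('a[href*="/itm/"]::attr(href)').getall()
    return list({url.split("?")[0] for url in urls})

def crawl_listings(start_url: str, max_items: int = 20) -> list:
    """breadth-first crawl of Ebay listings starting from a single listing URL"""
    seen, queue, results = set(), [start_url], []
    while queue and len(results) < max_items:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        response = httpx.get(url, follow_redirects=True)
        sel = Selector(response.text)
        results.append(parse_item(sel))  # listing parser from earlier in this guide
        queue.extend(find_related_listings(sel))
    return results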

Is there an Ebay API?

No. While Ebay does have a private catalog API, it contains only metadata fields like product IDs. For product prices and other details, the only way is to scrape Ebay as described in this guide.

Ebay Scraping Summary

In this guide, we wrote a Python Ebay scraper for product listing data using nothing but Python and a few community packages: httpx for retrieving the content and parsel for parsing it.

We've discovered two types of product listings: single variant and multiple variant ones. For the former, we used CSS selectors to parse listing data from the HTML. For the latter, however, we had to employ hidden web data scraping techniques to extract variation data from hidden javascript variables.

To find listings on Ebay, we've taken a look at how the search system works and how we can scrape it by replicating its behavior.

Finally, to avoid being blocked we used ScrapFly's API which smartly configures every web scraper connection to avoid being blocked. For more about ScrapFly, see our documentation and try it out for FREE!

Full Ebay Scraper Code

import asyncio
import json
import math
from collections import defaultdict
from pathlib import Path
from typing import List, Literal, Optional, TypedDict
from urllib.parse import urlencode

from scrapfly import ScrapeApiResponse, ScrapeConfig, ScrapflyClient

scrapfly = ScrapflyClient(key="YOUR SCRAPFLY KEY", max_concurrency=10)

# -------------------------------------------------------------------------
# UTILITY
# -------------------------------------------------------------------------


def find_json_objects(text: str, decoder=json.JSONDecoder()):
    """Find JSON objects in text, and generate decoded JSON data"""
    pos = 0
    while True:
        match = text.find("{", pos)
        if match == -1:
            break
        try:
            result, index = decoder.raw_decode(text[match:])
            yield result
            pos = match + index
        except ValueError:
            pos = match + 1


# -------------------------------------------------------------------------
# PRODUCT scraping
# -------------------------------------------------------------------------


def parse_variants(result: ScrapeApiResponse) -> dict:
    """Parse variant data from ebay's listing page of a product with variants. This data is located in a js variable"""
    sel = result.selector
    script = sel.xpath('//script[contains(., "itemVariationsMap")]/text()').get()
    if not script:
        return {}

    all_data = list(find_json_objects(script))
    variants = next(d for d in all_data if "itemVariationsMap" in str(d))["itemVariationsMap"]

    # extract option values for mapping variant trait ids to human labels
    selections = defaultdict(dict)
    for selection in sel.css(".x-msku__box-cont select"):
        name = selection.xpath("@selectboxlabel").get()
        for option in selection.xpath("option"):
            value = int(option.xpath("@value").get())
            if value == -1:  # that's the placeholder
                continue
            label = option.xpath("text()").get().strip()
            label = label.split("(Out ")[0]
            selections[name][value] = label

    # map variant trait ids to human labels
    for variant_id, variant in variants.items():
        for trait, trait_id in variant["traitValuesMap"].items():
            variant["traitValuesMap"][trait] = selections[trait][trait_id]

    # parse variants to something more usable
    parsed_variants = {}
    for variant_id, variant in variants.items():
        label = " ".join(variant["traitValuesMap"].values())
        parsed_variants[label] = {
            "id": variant_id,
            "price": variant["price"],
            "price_converted": variant["convertedPrice"],
            "vat_price": variant["vatPrice"],
            "quantity": variant["quantity"],
            "in_stock": variant["inStock"],
            "sold": variant["quantitySold"],
            "available": variant["quantityAvailable"],
            "watch_count": variant["watchCount"],
            "epid": variant["epid"],
            "top_product": variant["topProduct"],
            "traits": variant["traitValuesMap"],
        }
    return parsed_variants


def parse_product(result: ScrapeApiResponse):
    """Parse Ebay's product listing page for core product data"""
    sel = result.selector
    css_join = lambda css: "".join(sel.css(css).getall()).strip()  # join all selected elements
    css = lambda css: sel.css(css).get("").strip()  # take first selected element and strip leading/trailing spaces

    item = {}
    item["url"] = css('link[rel="canonical"]::attr(href)')
    item["id"] = item["url"].split("/itm/")[1].split("?")[0]  # we can take ID from the URL
    item["price"] = css('span[itemprop="price"] .ux-textspans ::text')
    item["price_converted"] = css(
        "span.x-price-approx__price ::text"
    )  # ebay automatically converts price for some regions

    item["name"] = css_join("h1 span::text")
    item["seller_name"] = css_join("div[data-testid=str-title] a ::text")
    item["seller_url"] = css("div[data-testid=str-title] a::attr(href)").split("?")[0]
    item["photos"] = sel.css('.ux-image-filmstrip-carousel-item.image img::attr("src")').getall()  # carousel images
    item["photos"].extend(sel.css('.ux-image-carousel-item.image img::attr("src")').getall())  # main image
    # description is an iframe (independent page). We can keep it as a URL or scrape it later.
    item["description_url"] = css("div.d-item-description iframe::attr(src)")
    if not item["description_url"]:
        item["description_url"] = css("div#desc_div iframe::attr(src)")
    # feature details from the description table:
    feature_table = sel.css("div.ux-layout-section__item--table-view")
    features = {}
    for ft_label in feature_table.css(".ux-labels-values__labels"):
        # iterate through each label of the table and select first sibling for value:
        label = "".join(ft_label.css(".ux-textspans::text").getall()).strip(":\n ")
        ft_value = ft_label.xpath("following-sibling::div[1]")
        value = "".join(ft_value.css(".ux-textspans::text").getall()).strip()
        features[label] = value
    item["features"] = features
    return item


async def scrape_product(url):
    page = await scrapfly.async_scrape(ScrapeConfig(url=url, asp=True, country="US"))
    product = parse_product(page)
    product["variants"] = parse_variants(page)
    return product


# -------------------------------------------------------------------------
# SEARCH scraping
# -------------------------------------------------------------------------


def parse_search(result: ScrapeApiResponse):
    previews = []
    for box in result.selector.css(".srp-results li.s-item"):
        css = lambda css: box.css(css).get("").strip()
        css_all = lambda css: box.css(css).getall()
        previews.append(
            {
                "url": css("a.s-item__link::attr(href)").split("?")[0],
                "title": css(".s-item__title>span::text"),
                "price": css(".s-item__price::text"),
                "shipping": css(".s-item__shipping::text"),
                "list_date": css(".s-item__listingDate span::text"),
                "subtitles": css_all(".s-item__subtitle::text"),
                "condition": css(".s-item__subtitle .SECONDARY_INFO::text"),
                "photo": css("img.s-item__image-img::attr(src)"),
                "rating": css(".s-item__reviews .clipped::text"),
                "rating_count": css(".s-item__reviews-count span::text"),
            }
        )
    return previews


SORTING_MAP = {
    "best_match": 12,
    "ending_soonest": 1,
    "newly_listed": 10,
}


class ProductPreviewResult(TypedDict):
    """type hint for search scrape results for product preview data"""

    url: str  # url to full product page
    title: str
    price: str
    shipping: str
    list_date: str
    subtitles: List[str]
    condition: str
    photo: str  # image url
    rating: str
    rating_count: str


async def scrape_search(
    query,
    max_pages=1,
    category=0,
    items_per_page=240,
    sort: Literal["best_match", "ending_soonest", "newly_listed"] = "newly_listed",
) -> List[ProductPreviewResult]:
    """Scrape Ebay's search for product preview data for given"""

    def make_request(page):
        url = "https://www.ebay.com/sch/i.html?" + urlencode(
            {
                "_nkw": query,
                "_sacat": category,
                "_ipg": items_per_page,
                "_sop": SORTING_MAP[sort],
                "_pgn": page,
            }
        )
        return ScrapeConfig(url=url, asp=True, country="US")

    first_page = await scrapfly.async_scrape(make_request(page=1))
    results = parse_search(first_page)
    if max_pages == 1:
        return results
    # find total amount of results for concurrent pagination
    total_results = first_page.selector.css(".srp-controls__count-heading>span::text").get()
    total_results = int(total_results.replace(",", ""))
    total_pages = math.ceil(total_results / items_per_page)
    if total_pages > max_pages:
        total_pages = max_pages
    other_pages = [make_request(page=i) for i in range(2, total_pages + 1)]
    async for result in scrapfly.concurrent_scrape(other_pages):
        try:
            results.extend(parse_search(result))
        except Exception as e:
            print(f"failed to scrape search page {result.context['url']}")
    return results


async def run():
    # this example run will scrape search and 2 different products (with variants and without)
    save_dir = Path(__file__).parent.joinpath("results")
    save_dir.mkdir(exist_ok=True)

    search_results = await scrape_search("iphone", items_per_page=60, max_pages=2)
    save_dir.joinpath("search.json").write_text(json.dumps(search_results, indent=2))

    product_result = await scrape_product("https://www.ebay.com/itm/332562282948")
    save_dir.joinpath("product.json").write_text(json.dumps(product_result, indent=2))

    product_with_variants_result = await scrape_product("https://www.ebay.com/itm/393531906094")
    save_dir.joinpath("product-with-variants.json").write_text(json.dumps(product_with_variants_result, indent=2))


if __name__ == "__main__":
    asyncio.run(run())
