How to Scrape Ebay using Python


In this web scraping tutorial, we'll be taking a look at how to scrape Ebay search and listing data. Ebay is the biggest peer-to-peer e-commerce marketplace in the world, which makes it an attractive target for public data collection.

We'll be scraping listing details like pricing, variant information, features and descriptions.

To scrape Ebay data using Python we'll be using a few popular community packages and some clever parsing techniques.

We'll also take a look at how to scrape Ebay's search system to discover new item listings to be the first to know when a new deal is available.

Latest Ebay.com Scraper Code

https://github.com/scrapfly/scrapfly-scrapers/

Why Scrape Ebay?

Ebay is one of the biggest product marketplaces in the world, especially for more niche and rare items. This makes Ebay a great target for e-commerce data analytics.

Scraping Ebay data (like seller reviews) can also help Ebay sellers by enabling easy market and competitor analysis.

Available Ebay Data Fields

In this Ebay web scraping tutorial we'll be scraping common product data like pricing, stock, features and performance metadata. For more, see this example output:

Example Product Dataset
{
  "url": "https://www.ebay.com/itm/393531906094",
  "id": "393531906094",
  "price_original": "C $469.00",
  "price_converted": "US $341.55",
  "name": "Apple iPhone 11 Pro Max - Unlocked - 64GB / 256GB / 512GB - CA - Excellent",
  "seller_name": "MobileKlinik",
  "seller_url": "https://www.ebay.com/str/devicecare",
  "photos": [
    "https://i.ebayimg.com/images/g/93cAAOSwvEJgbLW8/s-l64.jpg",
    "https://i.ebayimg.com/images/g/l0sAAOSwextgbLYj/s-l64.jpg",
    "https://i.ebayimg.com/images/g/qxEAAOSwP~BgbLa5/s-l64.jpg",
    "https://i.ebayimg.com/images/g/7usAAOSwRbZgbLbE/s-l64.jpg",
    "https://i.ebayimg.com/images/g/ffMAAOSwhAxgbLbO/s-l64.jpg",
    "https://i.ebayimg.com/images/g/93cAAOSwvEJgbLW8/s-l500.jpg"
  ],
  "description_url": "https://vi.vipr.ebaydesc.com/ws/eBayISAPI.dll?ViewItemDescV4&item=393531906094&t=1631237959000&category=9355&seller=mobileklinik&excSoj=1&excTrk=1&lsite=2&ittenable=true&domain=ebay.com&descgauge=1&cspheader=1&oneClk=2&secureDesc=1",
  "features": {
    "Condition": "Excellent - Refurbished: The item is in like-new condition, backed by a one year warranty. It has ... Read moreExcellent - Refurbished: The item is in like-new condition, backed by a one year warranty. It has been professionally refurbished, inspected and cleaned to excellent condition by qualified sellers. The item includes original or new accessories and will come in new generic packaging. See the seller's listing for full details. See all condition definitions",
    "Processor": "Hexa Core",
    "Screen Size": "6.5 in",
    "Model Number": "A2161 (CDMA + GSM)",
    "Lock Status": "Factory Unlocked",
    "SIM Card Slot": "Dual SIM (SIM + eSIM)",
    "Brand": "Apple",
    "Network": "1&1, Unlocked",
    "Connectivity": "5G, Bluetooth, GPS, Lightning",
    "Operating System": "iOS",
    "Features": "4K Video Recording, Accelerometer, Bluetooth Enabled, Camera, Facial Recognition",
    "Contract": "Without Contract",
    "Camera Resolution": "12.0 MP",
    "RAM": "4 GB"
  },
  "variants": [
    {
      "id": "662315637180",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "512 GB",
      "Color": "Midnight Green",
      "price_original": 679,
      "price_original_currency": "CAD",
      "price_converted": 494.48,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637181",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "512 GB",
      "Color": "Gold",
      "price_original": 679,
      "price_original_currency": "CAD",
      "price_converted": 494.48,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637182",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "512 GB",
      "Color": "Space Gray",
      "price_original": 679,
      "price_original_currency": "CAD",
      "price_converted": 494.48,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637176",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "256 GB",
      "Color": "Midnight Green",
      "price_original": 549,
      "price_original_currency": "CAD",
      "price_converted": 399.81,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637177",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "256 GB",
      "Color": "Gold",
      "price_original": 549,
      "price_original_currency": "CAD",
      "price_converted": 399.81,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637178",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "256 GB",
      "Color": "Space Gray",
      "price_original": 549,
      "price_original_currency": "CAD",
      "price_converted": 399.81,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637179",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "512 GB",
      "Color": "Silver",
      "price_original": 679,
      "price_original_currency": "CAD",
      "price_converted": 494.48,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637172",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "64 GB",
      "Color": "Midnight Green",
      "price_original": 469,
      "price_original_currency": "CAD",
      "price_converted": 341.55,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637173",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "64 GB",
      "Color": "Gold",
      "price_original": 469,
      "price_original_currency": "CAD",
      "price_converted": 341.55,
      "price_converted_currency": "USD",
      "out_of_stock": false
    },
    {
      "id": "662315637174",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "64 GB",
      "Color": "Space Gray",
      "price_original": 469,
      "price_original_currency": "CAD",
      "price_converted": 341.55,
      "price_converted_currency": "USD",
      "out_of_stock": false
    },
    {
      "id": "662315637175",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "256 GB",
      "Color": "Silver",
      "price_original": 549,
      "price_original_currency": "CAD",
      "price_converted": 399.81,
      "price_converted_currency": "USD",
      "out_of_stock": true
    },
    {
      "id": "662315637171",
      "Model": "Apple iPhone 11 Pro Max",
      "Storage Capacity": "64 GB",
      "Color": "Silver",
      "price_original": 469,
      "price_original_currency": "CAD",
      "price_converted": 341.55,
      "price_converted_currency": "USD",
      "out_of_stock": true
    }
  ]
}

We'll also be scraping Ebay's search, which provides product preview datasets. See this example output:

Example Search Dataset
[
  {
    "url": "https://www.ebay.com/itm/394406593931",
    "title": "iPhone 11 Pro Max 256GB Space Gray (Unlocked) CRACKED FRONT BACK CLEAN ESN",
    "price": "$289.94",
    "shipping": "Free shipping",
    "list_date": "Jan-3 20:32",
    "subtitles": [
      "Apple iPhone 11 Pro Max",
      "256 GB",
      "Unlocked"
    ],
    "condition": "Parts Only",
    "photo": "https://i.ebayimg.com/thumbs/images/g/74AAAOSwRahjtQBg/s-l225.webp",
    "rating": "4.5 out of 5 stars.",
    "rating_count": "80 product ratings"
  },
...
]

Ebay's pages contain a lot of data. In our example Ebay web scraper we'll stick to the most important data fields, but the techniques covered in this guide can be applied to scrape any other part of Ebay.
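To give a feel for how the product dataset can be used once scraped, here is a minimal sketch that filters the variants list for options still in stock and picks the cheapest one. The three entries are copied from the example product dataset above (with the Model field left out for brevity); the helper name cheapest_in_stock is only illustrative and not part of the scraper.

```python
from typing import List, Optional

# A few variant entries taken from the example product dataset above
variants = [
    {"id": "662315637171", "Storage Capacity": "64 GB", "Color": "Silver",
     "price_converted": 341.55, "price_converted_currency": "USD", "out_of_stock": True},
    {"id": "662315637173", "Storage Capacity": "64 GB", "Color": "Gold",
     "price_converted": 341.55, "price_converted_currency": "USD", "out_of_stock": False},
    {"id": "662315637176", "Storage Capacity": "256 GB", "Color": "Midnight Green",
     "price_converted": 399.81, "price_converted_currency": "USD", "out_of_stock": True},
]


def cheapest_in_stock(variants: List[dict]) -> Optional[dict]:
    """Return the lowest priced variant that is not out of stock, or None."""
    in_stock = [v for v in variants if not v["out_of_stock"]]
    return min(in_stock, key=lambda v: v["price_converted"], default=None)


best = cheapest_in_stock(variants)
print(best["id"], best["Color"], best["price_converted"])
```

Because the variant prices are already numbers, comparisons like this need no extra parsing.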

Setup

In this tutorial, we'll be using Python with three important community libraries:

  • httpx - HTTP client library which will let us communicate with ebay.com's servers and retrieve raw page data. We'll use it with optional HTTP2 support to reduce the chance of being blocked.
  • parsel - HTML parsing library which will help us parse the scraped raw HTML data using CSS selectors or XPath.
  • nested_lookup - lets us find any key in deeply nested JSON datasets. We'll be using this to find Ebay product variant data.

These packages can be easily installed via the pip install command:

$ pip install "httpx[http2]" parsel nested_lookup

Note that other popular HTTP clients like requests, which does not support HTTP2, can get you blocked as eBay checks for HTTP2 capabilities. As for parsel, another great alternative is the beautifulsoup package.
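To make the role of nested_lookup more concrete, here is a rough standard-library sketch of the idea behind it: walk a nested structure of dictionaries and lists and collect every value stored under a given key. This is not the library's implementation, only an illustration of the concept; find_key and the sample data are made up for this example.

```python
from typing import Any, Iterator


def find_key(data: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth of dicts and lists."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            # keep searching inside the value as well
            yield from find_key(v, key)
    elif isinstance(data, list):
        for item in data:
            yield from find_key(item, key)


# Made-up nested data, loosely resembling a hidden page dataset
page_data = {"a": [{"MSKU": {"variationsMap": {"1": {}}}}], "b": {"x": 1}}
found = list(find_key(page_data, "variationsMap"))
print(found)
```

In the scraper itself we'll simply call the nested_lookup package, which covers the same use case.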

Scraping Ebay Listings

Let's start with parsing a single Ebay listing page.
For this, we'll be using httpx to retrieve the product's HTML page and parsel to parse it using CSS selectors.

To start, we can separate the Ebay listings into two types:

  • listings with multiple variants - like tech devices, clothes, shoes. Things that have multiple options.
  • listings with a single variant - usually simple products that have no options. Like toys or second hand items.

Let's begin with single variant listings as they are much simpler. For this example, let's use this product: ebay.com/itm/332562282948

Figure: the fields we'll scrape from an Ebay.com listing page: pricing, description, and product and seller details.

In the image above, we marked the fields we want. To build CSS selectors for these fields, we can use the browser's Developer Tools (F12 key or right click -> Inspect).

Though before we begin parsing let's take a look at how we'll configure our httpx connection to prevent being blocked by Ebay:

import httpx

session = httpx.Client(
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
    http2=True,
    follow_redirects=True
)

For this, we'll be using httpx.Client, which gives us a session for all the scrape requests we'll make, and we set some important default parameters on it. To start, we set our headers with the User-Agent string of a real web browser as well as the Accept- family of headers. We also enable HTTP2 protocol support and follow_redirects to automatically manage page redirects. These configuration details should significantly reduce the probability of Ebay blocking our scraper, as it appears more like a real user.

With this we can start scraping and our single listing Ebay scraper in Python would look something like this:

from parsel import Selector
import httpx

def parse_product(response: httpx.Response) -> dict:
    """Parse Ebay's product listing page for core product data"""
    sel = Selector(response.text)
    # define helper functions that chain the extraction process
    css_join = lambda css: "".join(sel.css(css).getall()).strip()  # join all CSS selected elements
    css = lambda css: sel.css(css).get("").strip()  # take first CSS selected element and strip of leading/trailing spaces

    item = {}
    item["url"] = css('link[rel="canonical"]::attr(href)')
    item["id"] = item["url"].split("/itm/")[1].split("?")[0]  # we can take ID from the URL
    item["price"] = css('.x-price-primary>span::text')
    item["name"] = css_join("h1 span::text")
    item["seller_name"] = css_join("[data-testid=str-title] a ::text")
    item["seller_url"] = css("[data-testid=str-title] a::attr(href)").split("?")[0]
    item["photos"] = sel.css('.ux-image-filmstrip-carousel-item.image img::attr("src")').getall()  # carousel images
    item["photos"].extend(sel.css('.ux-image-carousel-item.image img::attr("src")').getall())  # main image
    # description is an iframe (independent page). We can keep it as a URL or scrape it later.
    item["description_url"] = css("div.d-item-description iframe::attr(src)")
    if not item["description_url"]:
        item["description_url"] = css("div#desc_div iframe::attr(src)")
    # feature details from the description table:
    feature_table = sel.css("div.ux-layout-section--features")
    features = {}
    for ft_label in feature_table.css(".ux-labels-values__labels"):
        # iterate through each label of the table and select first sibling for value:
        label = "".join(ft_label.css(".ux-textspans::text").getall()).strip(":\n ")
        ft_value = ft_label.xpath("following-sibling::div[1]")
        value = "".join(ft_value.css(".ux-textspans::text").getall()).strip()
        features[label] = value
    item["features"] = features
    return item

# establish our HTTP2 client with browser-like headers
session = httpx.Client(
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
    http2=True,
    follow_redirects=True
)
# example use: scrape this item and parse the data
response = session.get("https://www.ebay.com/itm/332562282948")
item = parse_product(response)
import json
print(json.dumps(item, indent=2))
Example Output
{
  "url": "https://www.ebay.com/itm/332562282948",
  "id": "332562282948",
  "price_original": "US $12.45",
  "price_converted": "CAD 16.93",
  "name": "Sanei Kirby 5.5\" Plush Stuffed Doll (KP01) - Kirby Adventure All Star Collection",
  "seller_name": "ToysCollections",
  "seller_url": "https://www.ebay.com/str/huskylover228",
  "photos": [
    "https://i.ebayimg.com/images/g/ITEAAOSw9p9ajK16/s-l500.jpg"
  ],
  "description_url": "https://vi.vipr.ebaydesc.com/ws/eBayISAPI.dll?ViewItemDescV4&item=332562282948&t=1678153940000&category=69528&seller=the_northeshop&excSoj=1&excTrk=1&lsite=0&ittenable=false&domain=ebay.com&descgauge=1&cspheader=1&oneClk=2&secureDesc=1",
  "features": {
    "Condition": "New: A brand-new, unused, unopened, undamaged item (including handmade items). See the seller's ... Read moreNew: A brand-new, unused, unopened, undamaged item (including handmade items). See the seller's listing for full details. See all condition definitions",
    "Brand": "unbranded",
    "Type": "Plush",
    "UPC": "4905330122810",
    "Featured Refinements": "Kirby Plush",
    "Recommended Age Range": "4+",
    "Gender": "Boys & Girls",
    "Character Family": "Kirby Adventure"
  }
}

We used our fortified httpx client to scrape the HTML of Ebay's product page. Then, we loaded it up as a parsel.Selector and ran a series of CSS selectors on it to extract the product details.
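The scraped values are raw display strings, so some light post-processing is often useful. The sketch below shows two optional helpers: one splits a price string such as "US $12.45" into its prefix and a number, and one removes the collapsed "Read more" preview that is duplicated in the Condition feature. Both helper names are hypothetical, and the patterns are assumptions based only on the example outputs in this article; other listings may format these values differently.

```python
import re
from typing import Optional, Tuple

# optional letter prefix (e.g. "US", "C", "CAD"), optional "$", then the amount
PRICE_PATTERN = re.compile(r"^\s*([A-Za-z]*)\s*\$?\s*(\d[\d,]*(?:\.\d+)?)")


def parse_price(text: str) -> Optional[Tuple[str, float]]:
    """Split a price string into (prefix, amount); None if no amount is found.

    The prefix is returned as displayed (e.g. "C"), not as an ISO currency code.
    """
    match = PRICE_PATTERN.search(text)
    if not match:
        return None
    prefix, amount = match.groups()
    return prefix, float(amount.replace(",", ""))


def clean_condition(text: str) -> str:
    """Keep only the full condition text that follows the 'Read more' preview."""
    if "Read more" in text:
        text = text.split("Read more", 1)[1]
    suffix = "See all condition definitions"
    if text.endswith(suffix):
        text = text[: -len(suffix)]
    return text.strip()


print(parse_price("US $12.45"))
print(parse_price("C $469.00"))
print(clean_condition("New: A brand-new ... Read moreNew: A brand-new, unused item. See all condition definitions"))
```

Keeping this cleanup separate from parse_product means the raw values stay available if the display format changes.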

Next, for products with variants we'll have to go a bit further and extract hidden web data. This is a more complex scraping process so stick around and we'll try to cover it in as much detail as possible.

Scraping Ebay Listing Variant Data

Ebay's listings can contain multiple products through a feature called variants.

For example, let's take this iPhone listing: ebay.com/itm/393531906094

Figure: an Ebay.com listing with variants, which has multiple selection options.

We can see several variant options: Model, Storage Capacity and Color. The price is updated every time we choose a different option. How can we scrape this?

Ebay is using javascript to update the page with a different price every time we choose a different option. That means the price data is available somewhere in a javascript variable. All we have to do is extract this variable to scrape the variant dataset.

We're not going to go in-depth on how to capture JavaScript variable data; for a full introduction, refer to our article:

How to Scrape Hidden Web Data


So, to capture all product data including variant information we'll extract the hidden variant data and pair it together with our HTML scraper from before.

To start, we'll be using this utility function that can find all JSON objects in any text string. This is a great tool for finding hidden datasets in HTML pages:

import json 

def find_json_objects(text: str, decoder=json.JSONDecoder()):
    """Find JSON objects in text, and generate decoded JSON data"""
    pos = 0
    while True:
        match = text.find("{", pos)
        if match == -1:
            break
        try:
            result, index = decoder.raw_decode(text[match:])
            yield result
            pos = match + index
        except ValueError:
            pos = match + 1
# example use
text = '''
one json {"foo": "bar"} and another {"price": {"currency": "usd", "price": 85.41}} example
'''
for obj in find_json_objects(text):
    print(obj)
# {'foo': 'bar'}
# {'price': {'currency': 'usd', 'price': 85.41}}

We'll use this method to find the variant JSON dataset from ebay's product page HTML. It's hiding in a key called MSKU. Let's take a look at the code:

import json
from collections import defaultdict

import httpx
from nested_lookup import nested_lookup
from parsel import Selector


def find_json_objects(text: str, decoder=json.JSONDecoder()):
    """Find JSON objects in text, and generate decoded JSON data"""
    pos = 0
    while True:
        match = text.find("{", pos)
        if match == -1:
            break
        try:
            result, index = decoder.raw_decode(text[match:])
            yield result
            pos = match + index
        except ValueError:
            pos = match + 1

def parse_variants(response: httpx.Response) -> list:
    """
    Parse variant data from Ebay's listing page of a product with variants.
    This data is located in a js variable MSKU hidden in a <script> element.
    """
    selector = Selector(response.text)
    script = selector.xpath('//script[contains(., "MSKU")]/text()').get()
    if not script:
        return []  # no MSKU data found: the listing has no variants
    all_data = list(find_json_objects(script))
    data = nested_lookup("MSKU", all_data)[0]
    # First retrieve names for all selection options (e.g. Model, Color)
    selection_names = {}
    for menu in data["selectMenus"]:
        for id_ in menu["menuItemValueIds"]:
            selection_names[id_] = menu["displayLabel"]
    # example selection name entry:
    # {0: 'Model', 1: 'Color', ...}

    # Then, find all selection combinations:
    selections = []
    for v in data["menuItemMap"].values():
        selections.append(
            {
                "name": v["valueName"],
                "variants": v["matchingVariationIds"],
                "label": selection_names[v["valueId"]],
            }
        )
    # example selection entry:
    # {'name': 'Gold', 'variants': [662315637181, 662315637177, 662315637173], 'label': 'Color'}

    # Finally, extract variants and apply selection details to each
    results = []
    variant_data = nested_lookup("variationsMap", data)[0]
    for id_, variant in variant_data.items():
        result = defaultdict(list)
        result["id"] = id_
        for selection in selections:
            if int(id_) in selection["variants"]:
                result[selection["label"]] = selection["name"]
        result["price_original"] = variant["binModel"]["price"]["value"]["convertedFromValue"]
        result["price_original_currency"] = variant["binModel"]["price"]["value"]["convertedFromCurrency"]
        result["price_converted"] = variant["binModel"]["price"]["value"]["value"]
        result["price_converted_currency"] = variant["binModel"]["price"]["value"]["currency"]
        result["out_of_stock"] = variant["quantity"]["outOfStock"]
        results.append(dict(result))
    # example variant entry:
    # {
    #     'id': '662315637173',
    #     'Model': 'Apple iPhone 11 Pro Max',
    #     'Storage Capacity': '64 GB',
    #     'Color': 'Gold',
    #     'price_original': 469,
    #     'price_original_currency': 'CAD',
    #     'price_converted': 341.55,
    #     'price_converted_currency': 'USD',
    #     'out_of_stock': False
    # }
    return results

# Example use:
session = httpx.Client(
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
    http2=True,
    follow_redirects=True,
)
response = session.get("https://www.ebay.com/itm/393531906094")
item = parse_product(response)  # parse_product() is defined in the previous section
item["variants"] = parse_variants(response)
print(json.dumps(item, indent=2))

For an example output dataset, see the Available Ebay Data Fields section.

In this example Ebay scraper, we used a hidden web data parsing technique to extract the JavaScript variable MSKU, which contains the listing's variant data. This dataset is meant for Ebay's own website rather than for us, so we had to do a bit of parsing and data field joining to produce a clean variant dataset.
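The joining step can be easier to follow on a tiny dataset. The sketch below uses made-up data that only mimics the keys read by parse_variants (selectMenus, menuItemMap, variationsMap); the real MSKU object is much larger and its exact shape is an assumption inferred from the code above. The helper name join_variant_options is hypothetical.

```python
from collections import defaultdict

# Toy data shaped like the keys used by parse_variants(); all values are made up
msku = {
    "selectMenus": [
        {"displayLabel": "Storage Capacity", "menuItemValueIds": [0, 1]},
        {"displayLabel": "Color", "menuItemValueIds": [2, 3]},
    ],
    "menuItemMap": {
        "0": {"valueId": 0, "valueName": "64 GB", "matchingVariationIds": [101, 102]},
        "1": {"valueId": 1, "valueName": "256 GB", "matchingVariationIds": [103]},
        "2": {"valueId": 2, "valueName": "Gold", "matchingVariationIds": [101, 103]},
        "3": {"valueId": 3, "valueName": "Silver", "matchingVariationIds": [102]},
    },
    "variationsMap": {"101": {}, "102": {}, "103": {}},
}


def join_variant_options(data: dict) -> dict:
    """Map each variation id to its {option label: option name} pairs."""
    # 1. which menu (label) does each option value belong to?
    label_by_value_id = {
        value_id: menu["displayLabel"]
        for menu in data["selectMenus"]
        for value_id in menu["menuItemValueIds"]
    }
    # 2. attach every option value to the variations it matches
    options = defaultdict(dict)
    for value in data["menuItemMap"].values():
        label = label_by_value_id[value["valueId"]]
        for variation_id in value["matchingVariationIds"]:
            options[str(variation_id)][label] = value["valueName"]
    # 3. keep only ids that exist in the variations map
    return {vid: options[vid] for vid in data["variationsMap"]}


variants = join_variant_options(msku)
print(variants["101"])
```

Each variation ends up with one value per selection menu, which is the same shape the full scraper produces before the price and stock fields are added.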

Next, let's take a look at how we can find listings on Ebay using the search system.

Scraping Ebay Search

To start scraping Ebay's search, let's first take a look at the way it works.

When we input a search keyword, we can see that Ebay redirects us to a different URL where the search results are located. For example, if we search for the term iphone we'll be taken to a URL similar to ebay.com/sch/i.html?_nkw=iphone&_sacat=0.

This page is using several URL parameters to define the search query:

  • _nkw is for search keyword.
  • _sacat is the category restriction.
  • _sop is sorting type.
  • _pgn is page number.
  • _ipg is listings per page (default is 60).

We can find more arguments by clicking around and exploring the search, though for this example let's stick with these 5 parameters.
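As a quick illustration of how these parameters fit together, the sketch below builds a search URL with the standard library and parses it back. The helper name build_search_url is hypothetical, and the sort value 10 is the "newly listed" code used by the scraper code in this section.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(keyword: str, category: int = 0, sort: int = 10,
                     page: int = 1, per_page: int = 60) -> str:
    """Build an Ebay search URL from the five parameters described above."""
    params = {
        "_nkw": keyword,     # search keyword
        "_sacat": category,  # category restriction (0 = all categories)
        "_sop": sort,        # sorting type
        "_pgn": page,        # page number
        "_ipg": per_page,    # listings per page
    }
    return "https://www.ebay.com/sch/i.html?" + urlencode(params)


url = build_search_url("iphone 11", page=2)
print(url)
# parse_qs reverses the encoding, which is handy for checking a URL copied from the browser
print(parse_qs(urlsplit(url).query))
```

Note that urlencode takes care of escaping spaces and special characters in the keyword.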

# last update: 2023-10-10
import asyncio
import math
import httpx
from typing import TypedDict, List, Literal
from urllib.parse import urlencode

from parsel import Selector


session = httpx.AsyncClient(
    # for our HTTP headers we want to use a real browser's default headers to prevent being blocked
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
    },
    # Enable HTTP2 version of the protocol to prevent being blocked
    http2=True,
    # enable automatic follow of redirects
    follow_redirects=True
)

# this is scrape result we'll receive
class ProductPreviewResult(TypedDict):
    """type hint for search scrape results for product preview data"""

    url: str  # url to full product page
    title: str
    price: str
    shipping: str
    list_date: str
    subtitles: List[str]
    condition: str
    photo: str  # image url
    rating: str
    rating_count: str


def parse_search(response: httpx.Response) -> List[ProductPreviewResult]:
    """parse ebay's search page for listing preview details"""
    previews = []
    # each listing has its own HTML box where all of the data is contained
    sel = Selector(response.text)
    listing_boxes = sel.css(".srp-results li.s-item")
    for box in listing_boxes:
        # quick helpers to extract first element and all elements
        css = lambda css: box.css(css).get("").strip()
        css_all = lambda css: box.css(css).getall()
        previews.append(
            {
                "url": css("a.s-item__link::attr(href)").split("?")[0],
                "title": css(".s-item__title>span::text"),
                "price": css(".s-item__price::text"),
                "shipping": css(".s-item__shipping::text"),
                "list_date": css(".s-item__listingDate span::text"),
                "subtitles": css_all(".s-item__subtitle::text"),
                "condition": css(".s-item__subtitle .SECONDARY_INFO::text"),
                "photo": css(".s-item__image img::attr(src)"),
                "rating": css(".s-item__reviews .clipped::text"),
                "rating_count": css(".s-item__reviews-count span::text"),
            }
        )
    return previews


SORTING_MAP = {
    "best_match": 12,
    "ending_soonest": 1,
    "newly_listed": 10,
}


async def scrape_search(
    query,
    max_pages=1,
    category=0,
    items_per_page=240,
    sort: Literal["best_match", "ending_soonest", "newly_listed"] = "newly_listed",
) -> List[ProductPreviewResult]:
    """Scrape Ebay's search for product preview data for a given query"""

    def make_request(page):
        return "https://www.ebay.com/sch/i.html?" + urlencode(
            {
                "_nkw": query,
                "_sacat": category,
                "_ipg": items_per_page,
                "_sop": SORTING_MAP[sort],
                "_pgn": page,
            }
        )

    first_page = await session.get(make_request(page=1))
    results = parse_search(first_page)
    if max_pages == 1:
        return results
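    # find total amount of results for concurrent pagination
    # (httpx responses have no .selector attribute, so we build a parsel Selector here)
    total_results = Selector(first_page.text).css(".srp-controls__count-heading>span::text").get()
    if not total_results:
        return results  # result count not found, so return the first page only
    total_results = int(total_results.replace(",", ""))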
    total_pages = math.ceil(total_results / items_per_page)
    if total_pages > max_pages:
        total_pages = max_pages
    other_pages = [session.get(make_request(page=i)) for i in range(2, total_pages + 1)]
    for response in asyncio.as_completed(other_pages):
        response = await response
        try:
            results.extend(parse_search(response))
        except Exception as e:
            print(f"failed to scrape search page {response.url}: {e}")
    return results

# Example run:
if __name__ == "__main__":
    import json

    results = asyncio.run(scrape_search("iphone 14 pro max"))
    print(json.dumps(results, indent=2))
Example Output
[
    {
        "url": "https://www.ebay.com/itm/354493525522",
        "title": "Apple iPhone 11 - 128GB - Black (Unlocked) A2111 (CDMA + GSM)",
        "price": "$1,200.99",
        "shipping": "+$25.00 shipping",
        "list_date": "Jan-3 04:32",
        "subtitles": [
            "Apple iPhone 11",
            "128 GB",
            "Unlocked"
        ],
        "condition": "Pre-Owned",
        "photo": "https://i.ebayimg.com/thumbs/images/g/m5QAAOSwrsxjtB~R/s-l225.webp",
        "rating": "4.5 out of 5 stars.",
        "rating_count": "68 product ratings"
    },
    ...  # truncated for the blog
]

In the example above, we wrote a small scraper for Ebay's search. We built a search URL using Python's urlencode function to turn dictionary parameters into URL parameters.

Then, we parsed the scraped data using CSS selectors. First, we've selected all of the listing box containers and iterated through them to safely extract each listing's details.
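The pagination step in scrape_search is plain arithmetic: the total result count divided by the page size gives the number of pages, which is then capped by max_pages. A small standalone sketch of that calculation (the function name pages_to_scrape is hypothetical) looks like this:

```python
import math


def pages_to_scrape(total_results: int, items_per_page: int, max_pages: int) -> range:
    """Page numbers still to request after the first page has been scraped."""
    total_pages = math.ceil(total_results / items_per_page)
    last_page = min(total_pages, max_pages)
    return range(2, last_page + 1)


# 1000 results at 240 per page need 5 pages, so pages 2 to 5 remain
print(list(pages_to_scrape(1000, 240, max_pages=10)))  # [2, 3, 4, 5]
# capping at 3 pages leaves only pages 2 and 3
print(list(pages_to_scrape(1000, 240, max_pages=3)))   # [2, 3]
```

Knowing the page count up front is what allows the remaining pages to be requested concurrently instead of one after another.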

We could further use our listing scraper from the previous section to extract full listing details if we'd like to expand this search dataset.
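One way to connect the two scrapers is to track which listing IDs have already been seen, so that only new listings are passed on to the listing scraper. The sketch below is a minimal illustration using two URLs from the example outputs in this article; listing_id and new_listings are hypothetical helper names, and the ID extraction assumes URLs of the form /itm/<id> as shown in those outputs.

```python
from typing import Iterable, List, Set


def listing_id(url: str) -> str:
    """Take the listing ID from an /itm/<id> URL, ignoring any query string."""
    return url.split("/itm/")[-1].split("?")[0].strip("/")


def new_listings(previews: Iterable[dict], seen_ids: Set[str]) -> List[dict]:
    """Return previews whose ID is not in seen_ids, and remember them."""
    fresh = []
    for preview in previews:
        id_ = listing_id(preview["url"])
        if id_ not in seen_ids:
            seen_ids.add(id_)
            fresh.append(preview)
    return fresh


seen = {"394406593931"}  # IDs collected on a previous run
previews = [
    {"url": "https://www.ebay.com/itm/394406593931"},
    {"url": "https://www.ebay.com/itm/354493525522"},
]
fresh = new_listings(previews, seen)
print([listing_id(p["url"]) for p in fresh])
```

Running the search scraper periodically with the "newly_listed" sort and a persisted set of seen IDs would then surface only listings that appeared since the last run.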

There are a lot of listings on Ebay and when we scale our scraper up to thousands of listing scrapes we might start encountering blocking. Next, let's take a look at how to avoid being blocked by eBay.

Avoiding Ebay Blocking

Web scraping Ebay is not too difficult, however when scaling up our scraper beyond a few listing scrapes we might start to run into captchas and scraper blocking.
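Independently of any scraping API, a common first step when scaling up is to cap how many requests run at once and to pause between them. The sketch below is a general asyncio pattern rather than anything specific to Ebay, and it does not guarantee that the scraper won't be blocked. The names fetch_all_throttled and fake_fetch are hypothetical; in a real scraper, fetch would wrap the session.get call from the earlier sections.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fetch_all_throttled(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = 2,
    delay: float = 1.0,
) -> List[str]:
    """Run fetch(url) for every URL with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> str:
        async with semaphore:
            result = await fetch(url)
            await asyncio.sleep(delay)  # pause before freeing the slot
            return result

    # gather() returns results in the same order as the input URLs
    return await asyncio.gather(*(fetch_one(url) for url in urls))


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request (hypothetical)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


urls = [f"https://www.ebay.com/itm/{i}" for i in (1, 2, 3)]
pages = asyncio.run(fetch_all_throttled(urls, fake_fetch, max_concurrency=2, delay=0.01))
print(len(pages))
```

The concurrency limit and delay values here are arbitrary; suitable values depend on how the target responds.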

To scale up our Ebay crawler, let's take advantage of the ScrapFly API, which offers several powerful features that can help us scale our web scrapers and avoid Ebay's blocking.

For this, we'll be using the scrapfly-sdk python package and the Anti Scraping Protection Bypass feature. To start, let's install scrapfly-sdk using pip:

$ pip install scrapfly-sdk

To take advantage of ScrapFly's API in our Ebay web scraper, all we need to do is replace our httpx session code with scrapfly-sdk client requests:

import httpx

response = httpx.get("some ebay.com url")
# in ScrapFly SDK becomes
from scrapfly import ScrapflyClient, ScrapeConfig
client = ScrapflyClient("YOUR SCRAPFLY KEY")
result = client.scrape(ScrapeConfig(
    # some ebay URL
    "https://www.ebay.com/itm/393531906094",
    # we can select specific proxy country
    country="US",
    # and enable anti scraping protection bypass:
    asp=True,
))

For more on how to scrape Ebay.com using ScrapFly, see the Full Scraper Code section.

FAQ

To wrap this guide up, let's take a look at some frequently asked questions regarding how to scrape data from ebay:

Is it legal to scrape Ebay.com?

Yes. Ebay's data is publicly available, and scraping Ebay at slow, respectful rates would fall under the ethical scraping definition.
That being said, be aware of GDPR compliance in the EU when storing personal data, such as sellers' names or locations. For more, see our Is Web Scraping Legal? article.

How to crawl Ebay.com?

To web crawl Ebay we can adapt the scraping techniques covered in this article. Every ebay listing contains related products which we can extract and feed into our scraping loop turning our scraper into a crawler that is capable of finding new details to crawl.
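The crawl loop described here can be sketched with a queue of URLs to visit and a set of URLs already seen. In the example below, get_related is a hypothetical callback standing in for "scrape a listing and extract its related product URLs", and the link data is made up so the sketch runs without any network access.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(start_urls: Iterable[str],
          get_related: Callable[[str], List[str]],
          max_items: int = 100) -> List[str]:
    """Breadth-first crawl: visit each URL once and queue its related URLs."""
    queue = deque(start_urls)
    seen = set(queue)
    visited = []
    while queue and len(visited) < max_items:
        url = queue.popleft()
        visited.append(url)  # a real crawler would scrape the listing here
        for related in get_related(url):
            if related not in seen:
                seen.add(related)
                queue.append(related)
    return visited


# Made-up "related products" links standing in for scraped data
fake_related = {
    "https://www.ebay.com/itm/1": ["https://www.ebay.com/itm/2", "https://www.ebay.com/itm/3"],
    "https://www.ebay.com/itm/2": ["https://www.ebay.com/itm/1", "https://www.ebay.com/itm/3"],
    "https://www.ebay.com/itm/3": [],
}
order = crawl(["https://www.ebay.com/itm/1"], lambda url: fake_related.get(url, []))
print(order)
```

The seen set prevents the crawler from looping between listings that link to each other, and max_items keeps the crawl bounded.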

Is there an Ebay API?

No. While Ebay does have a private catalog API it contains only metadata fields like product ids. For product prices and other details, the only way is to scrape Ebay as described in this guide.


Ebay Scraping Summary

In this guide, we wrote a Python Ebay scraper for product listing data using nothing but Python and a few community packages: httpx for retrieving the content and parsel for parsing it.

We've discovered two types of product listings: single variant and multiple variant ones. For the former, we used CSS selectors to parse listing data from the HTML. For the latter, however, we had to employ hidden web data scraping techniques to extract variation data from hidden javascript variables.

To find listings on Ebay, we've taken a look at how the search system works and how we can scrape it by replicating its behavior.

Finally, to avoid being blocked we used ScrapFly's API which smartly configures every web scraper connection to avoid being blocked. For more about ScrapFly, see our documentation and try it out for FREE!
