Ebay is the world's biggest peer-to-peer e-commerce web market, making it an attractive target for public data collection!
In this guide, we'll explain how to scrape Ebay search and listing pages for various details, including pricing, variant information, features, and descriptions.
We'll use Python, a few community packages, and some clever parsing techniques. Let's get started!
Key Takeaways
Master ebay scraper development using Python with httpx and parsel, extracting product data from hidden web data and handling anti-bot measures for comprehensive e-commerce data collection.
- Reverse engineer eBay's search API endpoints by intercepting browser network requests and analyzing JSON response structures
- Parse dynamic JSON data embedded in HTML using XPath selectors for product details and variants
- Bypass eBay's anti-scraping measures with realistic headers, user agents, and request spacing
- Extract structured product data including titles, prices, descriptions, and seller information
- Implement exponential backoff retry logic with 403 status code detection for rate limiting
- Handle multi-variant products and dynamic pricing through advanced JSON parsing and data extraction techniques
Latest Ebay.com Scraper Code
Why Scrape Ebay?
Ebay is one of the world's biggest product marketplaces, especially for more niche and rare items. This makes Ebay a great target for e-commerce data analytics.
Scraping Ebay data empowers various use cases, including:
- Competitor analysis by gathering data on competitors' sales and reviews.
- Market research by tracking product prices for hot deals or trends.
- Empowered navigation through automated search patterns and custom alerts.
For further details, refer to our introduction on web scraping use cases.
Setup
In this tutorial, we'll be scraping Ebay using Python with three community packages:
- scrapfly-sdk: A Python SDK for ScrapFly, a web scraping API that bypasses web scraping blocking
- jmespath: For refining and parsing JSON datasets
- nested-lookup: To find nested keys in the Ebay JSON datasets
The above packages can be installed using the below pip command:
$ pip install scrapfly-sdk jmespath nested-lookup
Scraping Ebay Listings
Let's get started by scraping single Ebay listing pages. Ebay listings come in two types:
- Single variant listings with fixed selections
- Multiple variant listings with different selections, like tech devices
We'll start with single variant listings since they are more straightforward to extract. Let's take this product as an example and extract data from the below fields:
In the image above, we marked the fields to extract. To build CSS selectors for these fields, we can use the browser developer tools (F12 key or right click -> Inspect option).
To scrape the above Ebay listing data, we'll be using CSS and XPath selectors:
Let's break down the above Ebay scraping code. We start by defining a new Scrapfly client and define two functions:
- parse_product: To parse the product HTML pages using CSS and XPath selectors.
- scrape_product: To request Ebay product pages using Scrapfly to bypass its antibot and retrieve the HTML.
Below is an example output of the retrieved Ebay data:
Next, for products with variants we'll have to go a bit further and extract the page's hidden web data. It might seem like a complex process, though we'll cover it step-by-step!
Scraping Ebay Listing Variant Data
Ebay's listings can contain multiple products through a feature called variants. For example, let's take this iPhone listing:
We can see several variant options: model, storage capacity, and color. These options are updated using JavaScript each time we select one.
Since Ebay uses JavaScript to update the page with a different price every time we choose a different option, the variant data must exist in a JavaScript variable. Data embedded this way is commonly known as hidden web data.
We'll briefly mention the hidden web data extraction in this guide. For the full details, refer to our dedicated tutorial.
To scrape the product variant data, we'll extract it as JSON from hidden script tags:
In the above Ebay scraper, we extract the variant listing data using the below steps:
- Selecting the script tag containing the MSKU variable.
- Extracting the JSON datasets using the find_json_objects utility.
- Iterating over the various options and selecting the useful fields.
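The extraction steps above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the scraper's actual code: the regex stands in for an XPath script selector, find_key is a hypothetical stand-in for the nested-lookup package, and SAMPLE_HTML with its variationsMap, price, and options keys is a simplified, made-up payload rather than Ebay's real MSKU structure.

```python
import json
import re


def find_json_objects(text):
    """Yield every JSON object that can be decoded from arbitrary text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON at this brace, try the next one
            continue
        yield obj
        pos = end  # continue scanning after the decoded object


def find_key(data, key):
    """Recursively yield every value stored under `key` in nested dicts/lists."""
    if isinstance(data, dict):
        for name, value in data.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(data, list):
        for item in data:
            yield from find_key(item, key)


def parse_variants(html):
    """Return one flat record per variant found in the MSKU script tag."""
    # Step 1: select the script tag that mentions MSKU
    scripts = re.findall(r"<script[^>]*>(.*?)</script>", html, flags=re.DOTALL)
    script = next((s for s in scripts if "MSKU" in s), None)
    if script is None:
        return []
    # Step 2: decode all JSON objects in the script and locate the MSKU dataset
    msku = next(find_key(list(find_json_objects(script)), "MSKU"), None)
    if msku is None:
        return []
    # Step 3: iterate over the variants and keep the useful fields
    # (hypothetical key names, matching the sample payload below)
    return [
        {"id": variant_id, "price": variant["price"], **variant["options"]}
        for variant_id, variant in msku["variationsMap"].items()
    ]


# Simplified, made-up page fragment used only to exercise the sketch
SAMPLE_HTML = """
<script>var tracking = {"page": "item"};</script>
<script>
init({"model": {"MSKU": {"variationsMap": {
  "101": {"price": "499.00", "options": {"Storage": "128GB", "Color": "Black"}},
  "102": {"price": "599.00", "options": {"Storage": "256GB", "Color": "Blue"}}
}}}});
</script>
"""

if __name__ == "__main__":
    for variant in parse_variants(SAMPLE_HTML):
        print(variant)
```

The brace-scanning approach avoids writing a regular expression for the whole JavaScript assignment: any text around the JSON is skipped, and only fragments that decode cleanly are kept.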
Here's what the retrieved Ebay scraping results should look like:
Next, let's see how to scrape Ebay search.
Scraping Ebay Search
To start scraping Ebay search results, let's reverse engineer the search. When a search query is submitted, Ebay redirects us to a search results page. For example, searching for the keyword iphone leads to a URL similar to ebay.com/sch/i.html?_nkw=iphone&_sacat=0.
The page of the above URL uses several URL parameters to define the search query:
- _nkw for search keyword.
- _sacat for the category restriction.
- _sop for sorting type.
- _pgn for page number.
- _ipg for listings per page (default is 60).
We can find more arguments by clicking around and exploring the search. To keep our Ebay web scraper short, let's stick with the below five parameters:
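As a rough sketch of how these five parameters combine into a search URL, the helper below builds paginated search URLs with the standard library. The function names make_search_url and make_page_urls are hypothetical, and the _sop value is passed through untouched because the numeric sorting codes are not listed here.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.ebay.com/sch/i.html"


def make_search_url(query, category=0, sort=None, page=1, items_per_page=60):
    """Build an Ebay search URL from the five parameters described above."""
    params = {
        "_nkw": query,            # search keyword
        "_sacat": category,       # category restriction (0 appears in the example URL)
        "_pgn": page,             # page number
        "_ipg": items_per_page,   # listings per page
    }
    if sort is not None:
        params["_sop"] = sort     # sorting type, caller supplies the raw code
    return f"{SEARCH_URL}?{urlencode(params)}"


def make_page_urls(query, pages, **kwargs):
    """Build URLs for the first `pages` result pages of a query."""
    return [make_search_url(query, page=page, **kwargs) for page in range(1, pages + 1)]


if __name__ == "__main__":
    for url in make_page_urls("iphone", pages=3):
        print(url)
```

Using urlencode rather than string concatenation means keywords with spaces or special characters are escaped correctly.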
Here's what the extracted Ebay data looks like:
Avoiding Ebay Scraping Blocking
Creating an Ebay scraper seems straightforward. However, attempting to scale it is the tricky part! Ebay can identify our requests as automated and respond with CAPTCHA challenges or even block the scraping process entirely!
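One simple mitigation is to retry blocked requests with exponential backoff. The sketch below is a minimal illustration, assuming a 403 status code signals a block; fetch is a hypothetical callable that returns a (status_code, body) pair, so any HTTP client can be plugged in.

```python
import time


def fetch_with_backoff(fetch, url, max_retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns a non-403 status or retries run out.

    The wait doubles after every blocked attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status != 403:
            return status, body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)
    # still blocked after all retries, return the last response as-is
    return status, body


if __name__ == "__main__":
    # Fake client for demonstration: blocked twice, then succeeds
    responses = iter([(403, ""), (403, ""), (200, "<html>ok</html>")])
    result = fetch_with_backoff(lambda url: next(responses), "https://www.ebay.com/", base_delay=0.1)
    print(result)
```

Passing sleep as a parameter keeps the function easy to test without real delays. Backoff only slows the scraper down politely; it does not solve CAPTCHA challenges or fingerprint-based blocking.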
ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.
- Anti-bot protection bypass - scrape web pages without blocking!
- Rotating residential proxies - prevent IP address and geographic blocks.
- JavaScript rendering - scrape dynamic web pages through cloud browsers.
- Full browser automation - control browsers to scroll, input and click on objects.
- Format conversion - scrape as HTML, JSON, Text, or Markdown.
- Python and Typescript SDKs, as well as Scrapy and no-code tool integrations.
To avoid Ebay web scraping blocking, we'll be using scrapfly-sdk with the anti-scraping protection bypass feature. Start by installing it using pip:
$ pip install scrapfly-sdk
To take advantage of ScrapFly's API in our Ebay scraper, all we have to do is replace the httpx client with the scrapfly-sdk client:
import httpx
response = httpx.get("some ebay.com url")
# in ScrapFly SDK becomes 👇
from scrapfly import ScrapflyClient, ScrapeConfig
client = ScrapflyClient("YOUR SCRAPFLY KEY")
result = client.scrape(ScrapeConfig(
    # some ebay URL
    "https://www.ebay.com/itm/393531906094",
    # we can select specific proxy country
    country="US",
    # and enable anti scraping protection bypass:
    asp=True,
    # enable JavaScript rendering if required
    render_js=True,
))
For more on how to scrape Ebay.com using ScrapFly, see the Full Scraper Code section.
FAQ
To wrap this guide up, let's take a look at some frequently asked questions regarding how to scrape data from Ebay:
Is it legal to scrape ebay.com?
Yes. Ebay's data is publicly available, and scraping Ebay at slow, respectful rates would fall under the ethical scraping definition.
That being said, be aware of GDPR compliance in the EU when storing personal data such as sellers' names or locations. For more, see our Is Web Scraping Legal? article.
How to crawl Ebay.com?
To crawl Ebay, we can adapt the scraping techniques covered in this article. Every Ebay listing contains related products, which we can extract and feed back into our scraping loop. This turns our scraper into a crawler that is capable of discovering new listings to scrape.
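The crawling loop described here can be sketched as a breadth-first queue with a set of already seen URLs. This is a minimal sketch: scrape_listing is a hypothetical callable that returns the listing data together with the related listing URLs found on the page.

```python
from collections import deque


def crawl(start_urls, scrape_listing, max_listings=100):
    """Scrape each listing once and queue the related listings it links to."""
    queue = deque(start_urls)
    seen = set(start_urls)
    results = []
    while queue and len(results) < max_listings:
        url = queue.popleft()
        data, related_urls = scrape_listing(url)
        results.append(data)
        for related in related_urls:
            if related not in seen:  # skip listings that are already queued or scraped
                seen.add(related)
                queue.append(related)
    return results


if __name__ == "__main__":
    # Fake link graph standing in for real listing pages
    graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["d"], "d": []}
    print(crawl(["a"], lambda url: ({"url": url}, graph[url])))
```

The max_listings cap keeps the crawl bounded, since related product links can otherwise lead the crawler across the whole marketplace.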
Is there an Ebay API?
No. While Ebay does have a private catalog API, it contains only metadata fields like product IDs. For the full product details, the only way is to scrape Ebay as described in this guide.
Ebay Scraping Summary
In this guide, we wrote an Ebay scraper for product listing data using Python.
We've scraped data from three parts of the Ebay domain:
- Single variant products - using basic CSS selector parsing logic.
- Multiple variant products - using hidden web data extraction.
- Search pages - using search parameters and basic crawling rules.
Finally, to avoid Ebay scraping blocking, we used ScrapFly's API to automatically configure the HTTP connection. For more about ScrapFly, see our documentation and try it out for FREE!
Legal Disclaimer and Precautions
This tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect. Here's a good summary of what not to do:
- Do not scrape at rates that could damage the website.
- Do not scrape data that's not available publicly.
- Do not store PII of EU citizens who are protected by GDPR.
- Do not repurpose entire public datasets, which can be illegal in some countries.