# How to Scrape Leboncoin.fr in 2026 (No API Needed)

 by [Mazen Ramadan](https://scrapfly.io/blog/author/mazen) May 19, 2026 19 min read [\#python](https://scrapfly.io/blog/tag/python) [\#scrapeguide](https://scrapfly.io/blog/tag/scrapeguide) 


 

 

Leboncoin.fr draws roughly 28 million unique monthly visitors and offers no public API, and it runs one of the most aggressive anti-bot stacks in French e-commerce. A plain Playwright script typically hits a CAPTCHA wall within seconds, and a raw `requests.get()` call rarely even reaches the listings page.

In this guide, we will walk through how to scrape Leboncoin.fr end to end using Python. We will cover Leboncoin's URL and page architecture, bypassing DataDome with [Scrapfly's ASP feature](https://scrapfly.io/web-scraping-api), extracting search results and individual ads through the `__NEXT_DATA__` hidden JSON, scraping non real estate categories like vehicles, exporting scraped data to JSON and CSV, and adding production grade retry logic.

## Key Takeaways

- Leboncoin has no public API, so the `__NEXT_DATA__` JSON embedded in every Next.js page is the cleanest extraction target.
- The same hidden JSON technique works across every Leboncoin category, including real estate, vehicles, electronics, and jobs.
- Leboncoin uses DataDome for anti-bot protection, and standard headless browsers or default requests sessions are detected almost immediately.
- Scrapfly's ASP feature handles TLS fingerprinting, proxy rotation, and CAPTCHA avoidance, which removes the most fragile part of the scraper.
- A complete Leboncoin scraper covers search pagination, ad detail pages, multi category support, JSON and CSV export, and retry logic with exponential backoff.

[**View Source Code:** github.com/scrapfly/scrapfly-scrapers/tree/main/leboncoin-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/leboncoin-scraper)








## Why Scrape Leboncoin.fr?

Leboncoin.fr is France's largest classifieds marketplace, with roughly 30 million active listings and around 28 million unique monthly visitors. With no public API available, scraping is the only way to access Leboncoin's catalog at scale for research, pricing, or monitoring projects.

The scale across categories is what makes Leboncoin a serious data target:

- **Real estate** (`ventes_immobilieres`, `locations`) with surface area, rooms, energy rating, and location fields.
- **Vehicles** (`voitures`, `motos`) with brand, model, year, mileage, fuel type, and transmission fields.
- **Electronics**, **furniture**, **fashion**, and **jobs** with category specific attributes and seller information.

Typical use cases that need this data include real estate price tracking across French departments, vehicle depreciation analysis, competitive pricing intelligence for resellers, and inventory monitoring for vertical aggregators. Every listing exposes the price, location, seller profile, images, and category specific attributes directly in the page payload, which keeps the downstream dataset clean.

With the data opportunity clear, the next step is to understand how Leboncoin structures pages before writing any scraping code.



## How Does Leboncoin.fr Structure Its Pages?

Leboncoin.fr is a Next.js application, which means every page ships with a `<script id="__NEXT_DATA__">` tag that contains the full server rendered JSON state. The scraping approach in this guide reads that JSON tag directly, so no CSS selectors or XPath expressions are needed to pull titles, prices, or attributes out of the rendered HTML.

### How Does Leboncoin's URL System Work Across Categories?

Leboncoin uses a small set of URL patterns that stay consistent across categories. Learning the URL shape up front makes the scraper easy to adapt from real estate to vehicles, electronics, or any other section of the site.

- **Search URL:** `https://www.leboncoin.fr/recherche?text=maison&category=9&page=1`
- **Ad URL:** `https://www.leboncoin.fr/ad/ventes_immobilieres/{ad_id}` for real estate, and `https://www.leboncoin.fr/ad/voitures/{ad_id}` for vehicles.
- **Legacy ad URL:** `https://www.leboncoin.fr/ventes_immobilieres/{ad_id}.htm`, which still resolves and is what the JSON `url` field typically returns.

The `category` query parameter controls which vertical the search hits, and the main codes worth remembering are `9` for real estate, `2` for vehicles, `17` for electronics, `19` for furniture, and `33` for jobs. Pagination is driven by the `page` parameter and the page size is fixed by Leboncoin, so the scraper only needs to increment `page` until the returned ads list is empty.
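The URL pattern above can be assembled programmatically. The sketch below is a minimal illustration: the `build_search_url` helper and the `CATEGORY_CODES` mapping are hypothetical names introduced here, and the codes are simply the ones listed in this guide, so they should be checked against the live site before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

# Category codes as listed in this guide; verify them against the live site.
CATEGORY_CODES = {
    "real_estate": 9,
    "vehicles": 2,
    "electronics": 17,
    "furniture": 19,
    "jobs": 33,
}

def build_search_url(text: str, category: Optional[str] = None, page: int = 1) -> str:
    """Build a Leboncoin search URL with properly encoded query parameters."""
    params = {"text": text}
    if category is not None:
        params["category"] = CATEGORY_CODES[category]
    params["page"] = page
    # urlencode escapes spaces and accented characters in the search text.
    return "https://www.leboncoin.fr/recherche?" + urlencode(params)

print(build_search_url("maison", "real_estate"))
# https://www.leboncoin.fr/recherche?text=maison&category=9&page=1
print(build_search_url("peugeot 208", "vehicles", page=2))
# https://www.leboncoin.fr/recherche?text=peugeot+208&category=2&page=2
```

Building the query string with `urlencode` rather than string concatenation keeps multi-word keywords valid and avoids appending a second `page` parameter by accident.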

### Where Is the Hidden JSON Data in Leboncoin's Next.js Frontend?

The `__NEXT_DATA__` script tag sits near the bottom of the rendered HTML and contains the full props tree that React hydrated the page with. To inspect the tag, open Leboncoin in a browser, press `F12` to open DevTools, and search for `__NEXT_DATA__` in the Elements panel.

The JSON paths used by the scrapers in this guide are:

- **Search results:** `["props"]["pageProps"]["searchData"]["ads"]` is the current path, and older revisions of Leboncoin used `["props"]["pageProps"]["initialProps"]["searchData"]["ads"]`. If the primary path returns `None`, the scraper should fall back to the older one.
- **Ad detail:** `["props"]["pageProps"]["ad"]` holds the single ad object on listing detail pages.

Before shipping the scraper to production, verify these two paths against a fresh Leboncoin page by loading the HTML, extracting the `__NEXT_DATA__` content with `json.loads`, and printing the top level keys of `props.pageProps`. If Leboncoin renames a key during a redesign, the scraper only needs one path update rather than a rewrite of every selector.
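The verification step described above can be sketched with the standard library alone. The `SAMPLE_HTML` string is a made-up stand-in for a freshly fetched page, and `page_props_keys` is a hypothetical helper name; on a real page the printed keys show whether `searchData` or `ad` still sits where the scraper expects it.

```python
import json
import re

# Made-up stand-in for the HTML of a freshly fetched Leboncoin page.
SAMPLE_HTML = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"searchData": {"ads": []}}}}
</script>
</body></html>
"""

# Capture everything between the opening and closing __NEXT_DATA__ script tags.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)

def page_props_keys(html: str) -> list:
    """Return the sorted top-level keys of props.pageProps."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("__NEXT_DATA__ script tag not found (possible block page)")
    data = json.loads(match.group(1))
    return sorted(data["props"]["pageProps"].keys())

print(page_props_keys(SAMPLE_HTML))  # ['searchData']
```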

For a deeper look at why JSON in HTML beats DOM scraping as a general pattern, the hidden web data guide below covers the full rationale.

[How to Scrape Hidden Web Data](https://scrapfly.io/blog/posts/how-to-scrape-hidden-web-data): The visible HTML doesn't always represent the whole dataset available on the page. This article looks at scraping hidden web data: what it is and how to scrape it using Python.

With the page structure understood, the next section gets the Python environment ready.



## Project Setup

This Leboncoin scraper uses Python 3.9 or newer to take advantage of modern `asyncio` features and the typed `scrapfly-sdk` client. Two libraries cover everything the guide needs.

```shell
pip install scrapfly-sdk parsel
```



The `pip install` command above installs `scrapfly-sdk`, the Scrapfly Python client, which handles requests, anti-bot bypass, and concurrent scraping in one package. [parsel](https://parsel.readthedocs.io/) is a lightweight CSS and XPath selector library that the code uses to pull the `__NEXT_DATA__` script tag out of each page.

Running the scraper asynchronously is what makes large Leboncoin runs practical, because [asyncio drastically increases web scraping speed](https://scrapfly.io/blog/posts/web-scraping-speed) when every request waits on the network.



[**Leboncoin Scraper Code:** github.com/scrapfly/scrapfly-scrapers/tree/main/leboncoin-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/leboncoin-scraper)

With dependencies in place, the next section tackles the part that blocks most Leboncoin scrapers before they even reach the data.



## How to Bypass Leboncoin's Anti-Bot Protection with Scrapfly

Leboncoin uses DataDome for bot detection, and DataDome evaluates every incoming request against TLS and browser fingerprints, IP reputation, and behavioral signals rather than just user agent strings. A default `requests` session, a vanilla Playwright launch, and most headless browser stealth plugins are caught quickly, which is why most Leboncoin scrapers fail on the first page.

### What Anti-Bot System Does Leboncoin Use?

[DataDome](https://datadome.co/) is the primary anti-bot layer on Leboncoin, and public reports and Reddit threads about the unofficial `lbc` Python library confirm that French residential IPs and real browser fingerprints are needed even for moderate scraping volume. DataDome challenges typically look like a dedicated block page or an interstitial CAPTCHA hosted on a DataDome subdomain.

The three signals DataDome leans on the hardest are:

- **TLS or JA3 fingerprint**, which exposes default Python clients because the TLS handshake of `requests` or `httpx` does not match any real Chrome or Firefox build.
- **Browser fingerprint**, including `navigator` properties, WebGL output, and font lists that a stock headless browser leaks even with popular stealth patches.
- **IP reputation**, where datacenter ranges and non French geographies face stricter screening than French residential IPs.

### Why Do Headless Browsers Get Blocked on Leboncoin?

A plain headless browser gets blocked on Leboncoin because the browser fingerprint of a default Playwright or Puppeteer instance is detectably different from a real Chrome install, and DataDome already has signatures for the most common automation stacks. The example below shows how little it takes to trigger the block page.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://www.leboncoin.fr")
    page.screenshot(path="screenshot.png")
```



The Playwright snippet above launches a visible Chromium instance and navigates to Leboncoin's homepage. Even with `headless=False`, the resulting page is a DataDome challenge rather than the marketplace, because the browser still leaks automation telemetry that Leboncoin's anti-bot layer picks up.

The Scrapfly Web Scraping API solves this at the request layer by matching real browser TLS fingerprints, rotating through a residential proxy pool in France, and handling CAPTCHA challenges transparently. The minimal Scrapfly call that reaches Leboncoin reliably looks like this:

```python
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key="Your Scrapfly API key")

api_response: ScrapeApiResponse = scrapfly.scrape(
    ScrapeConfig(
        url="https://www.leboncoin.fr",
        render_js=True,   # run the page through a cloud headless browser
        asp=True,         # enable anti scraping protection bypass
        country="FR",     # route the request through a French IP
    )
)
print(api_response.upstream_status_code)
# 200 when the request reaches Leboncoin rather than a DataDome block page
```



The Scrapfly request above enables `asp=True` for DataDome bypass, `render_js=True` to execute the page JavaScript, and `country="FR"` to force a French exit IP. With a 200 response in hand, the scraper has the raw HTML and can focus on parsing rather than fighting the anti-bot layer.

[How to Bypass Anti-Bot Protection When Web Scraping](https://scrapfly.io/blog/posts/how-to-bypass-anti-bot-protection-when-web-scraping): Learn how anti-bot systems detect scrapers and 5 universal bypass techniques including proxy rotation, fingerprinting, and fortified headless browsers.

With the anti-bot problem handled, the next section turns to extracting Leboncoin's search results.



## How to Scrape Leboncoin Search Results

The Leboncoin search scraper fetches a search URL through Scrapfly, pulls the `__NEXT_DATA__` script out of the HTML, and reads the ads array directly from the parsed JSON. Every search page, across every category, exposes the same structure, so one parsing function covers the entire surface.

### How to Parse Search Data from `__NEXT_DATA__`

The `parse_search` function below selects the `__NEXT_DATA__` script tag, loads it as JSON, and returns the ads array with a fallback for older Leboncoin revisions.

```python
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
import asyncio
from typing import Dict, List
import json

SCRAPFLY = ScrapflyClient(key="Your Scrapfly API key")

BASE_CONFIG = {
    "asp": True,          # bypass DataDome
    "country": "fr",      # French exit IP
}

def parse_search(result: ScrapeApiResponse) -> List[Dict]:
    """Parse Leboncoin search data from the __NEXT_DATA__ JSON."""
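    next_data = result.selector.css("script[id='__NEXT_DATA__']::text").get()
    if next_data is None:
        # No hidden JSON usually means a block or challenge page was returned.
        raise ValueError("__NEXT_DATA__ script tag not found in the response")
    data = json.loads(next_data)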
    page_props = data["props"]["pageProps"]
    # Primary path on current Leboncoin revisions.
    if "searchData" in page_props:
        return page_props["searchData"]["ads"]
    # Fallback for older revisions of the site.
    return page_props["initialProps"]["searchData"]["ads"]

async def scrape_search(url: str) -> List[Dict]:
    """Scrape a single Leboncoin search page."""
    print(f"scraping search {url}")
    first_page = await SCRAPFLY.async_scrape(ScrapeConfig(url, **BASE_CONFIG))
    search_data = parse_search(first_page)
    print(json.dumps(search_data[:1], indent=2))
    return search_data

asyncio.run(scrape_search(url="https://www.leboncoin.fr/recherche?text=maison&page=1"))
```



The `parse_search` helper extracts the `__NEXT_DATA__` text, parses it with `json.loads`, and reads the ads array with a defensive fallback for the legacy `initialProps.searchData.ads` location. The `scrape_search` coroutine wraps one Scrapfly call and prints the first ad for inspection.

### How to Scrape Multiple Search Pages with Pagination

Leboncoin paginates searches with the `page` query parameter, and the scraper can use `SCRAPFLY.concurrent_scrape` to request several pages at once rather than looping sequentially.

```python
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
from typing import Dict, List
import asyncio
import json

SCRAPFLY = ScrapflyClient(key="Your Scrapfly API key")

BASE_CONFIG = {"asp": True, "country": "fr"}

async def scrape_search(url: str, max_pages: int = 5) -> List[Dict]:
    """Scrape a Leboncoin search across multiple pages concurrently."""
    print(f"scraping search {url}")
    first_page = await SCRAPFLY.async_scrape(ScrapeConfig(url, **BASE_CONFIG))
    search_data = parse_search(first_page)

    # Build the list of remaining pages.
    other_pages = [
        ScrapeConfig(f"{first_page.context['url']}&page={page}", **BASE_CONFIG)
        for page in range(2, max_pages + 1)
    ]
    # Fetch the remaining pages concurrently.
    async for result in SCRAPFLY.concurrent_scrape(other_pages):
        search_data.extend(parse_search(result))

    print(f"Collected {len(search_data)} ads across {max_pages} pages")
    return search_data

asyncio.run(scrape_search(
    url="https://www.leboncoin.fr/recherche?text=maison&category=9",
    max_pages=5,
))
```



The updated `scrape_search` coroutine accepts a `max_pages` argument, fetches the first page to anchor the pagination, and then builds a list of `ScrapeConfig` objects for pages two through `max_pages`. The `SCRAPFLY.concurrent_scrape` async generator runs all remaining pages in parallel.
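When the total number of pages is not known in advance, the alternative mentioned earlier is to keep incrementing `page` until a page comes back with no ads. The sketch below shows that loop in isolation. `scrape_until_empty` and `fake_fetch_page` are hypothetical names, and the fetcher is a stand-in that returns canned data; in a real run it would wrap one Scrapfly request plus `parse_search`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

async def scrape_until_empty(
    fetch_page: Callable[[int], Awaitable[List[Dict]]],
    max_pages: int = 100,
) -> List[Dict]:
    """Request pages one at a time and stop at the first empty ads list."""
    collected: List[Dict] = []
    for page in range(1, max_pages + 1):
        ads = await fetch_page(page)
        if not ads:
            break  # an empty page means the search results are exhausted
        collected.extend(ads)
    return collected

# Hypothetical stand-in for a real fetcher: three pages of two ads, then nothing.
async def fake_fetch_page(page: int) -> List[Dict]:
    if page > 3:
        return []
    return [{"list_id": page * 10 + i} for i in range(2)]

ads = asyncio.run(scrape_until_empty(fake_fetch_page))
print(len(ads))  # 6
```

This version is sequential, so it trades the speed of `concurrent_scrape` for not having to guess `max_pages`. The `max_pages` cap remains as a safety limit in case a page never comes back empty.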

With search coverage in place, the next section drills into individual ad pages for the richer per listing data.



## How to Scrape Individual Leboncoin Listings

Individual Leboncoin ads reuse the `__NEXT_DATA__` technique, but the JSON lives at a different path, `["props"]["pageProps"]["ad"]`. Ad pages return the full description, the complete attributes list, and the seller profile, which is more than what the search response carries per listing.

### How Does Ad Page Data Differ from Search Data?

Ad pages expose a superset of what search results return. The key differences are the full `body` description text, the complete attributes list including category specific fields, any phone number visibility flags, and richer location detail on professional listings. The URL format used for scraping ad pages is `https://www.leboncoin.fr/ad/{category}/{ad_id}`, which is the form that the Next.js router handles directly.

### How to Scrape Multiple Ads Concurrently

The `scrape_ad` function below parses one ad page, and the runner at the bottom scrapes several ads concurrently using `asyncio.as_completed`.

```python
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
from typing import Dict, List
import asyncio
import json

SCRAPFLY = ScrapflyClient(key="Your Scrapfly API key")

BASE_CONFIG = {"asp": True, "country": "fr"}

def parse_ad(result: ScrapeApiResponse) -> Dict:
    """Parse a single Leboncoin ad from the __NEXT_DATA__ JSON."""
    next_data = result.selector.css("script[id='__NEXT_DATA__']::text").get()
    return json.loads(next_data)["props"]["pageProps"]["ad"]

async def scrape_ad(url: str) -> Dict:
    """Scrape a single Leboncoin ad page."""
    print(f"scraping ad {url}")
    result = await SCRAPFLY.async_scrape(ScrapeConfig(url, **BASE_CONFIG))
    return parse_ad(result)

async def run() -> List[Dict]:
    ad_urls = [
        "https://www.leboncoin.fr/ad/ventes_immobilieres/3111241558",
        "https://www.leboncoin.fr/ad/ventes_immobilieres/3162981297",
        "https://www.leboncoin.fr/ad/ventes_immobilieres/3184424365",
    ]
    to_scrape = [scrape_ad(url) for url in ad_urls]
    ad_data: List[Dict] = []
    for response in asyncio.as_completed(to_scrape):
        ad_data.append(await response)
    return ad_data

ads = asyncio.run(run())
print(f"Scraped {len(ads)} ads")
```



The `parse_ad` function pulls the single ad object out of `pageProps.ad`, and `scrape_ad` wraps one Scrapfly request. The `run` coroutine dispatches three ad scrapes concurrently with `asyncio.as_completed`, which lets each ad come back as soon as its own request finishes instead of waiting for the slowest one.
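Individual requests can still fail transiently, which is where the retry logic with exponential backoff mentioned in this guide comes in. The block below is a minimal, generic sketch rather than the implementation from the source repository: `with_retries` is a hypothetical helper, the `flaky` coroutine only simulates failures, and catching every `Exception` is an assumption that should be narrowed to the errors the client actually raises.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

async def with_retries(
    task: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
) -> T:
    """Run task(), retrying on any exception with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            # The delay doubles on each attempt: base, 2 * base, 4 * base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter keeps concurrent tasks from retrying in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")

# Simulated task that fails twice before succeeding.
calls = {"count": 0}

async def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"

result = asyncio.run(with_retries(flaky, base_delay=0.01))
print(result, calls["count"])  # ok 3
```

In the ad scraper, a call such as `await with_retries(lambda: scrape_ad(url))` would wrap each request, so a single failed ad does not abort the whole batch.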

With ad level extraction working, the next section shows the same scraper running against a non real estate category.



## How to Scrape Different Leboncoin Categories

The `__NEXT_DATA__` scraping pattern is category agnostic, so the code from the search and ad sections works for every Leboncoin vertical with just a URL change. The category specific attributes vary, but the JSON envelope and the path into `pageProps.searchData.ads` do not.

### How to Adapt the Scraper for Vehicle Listings

For vehicles, the search URL uses `category=2` and the ads include brand, model, year, mileage, fuel type, and transmission fields inside the same `attributes` array.

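```python
from urllib.parse import quote_plus

async def scrape_vehicles(keyword: str, max_pages: int = 3) -> List[Dict]:
    """Scrape Leboncoin vehicle listings and keep only vehicle-specific fields."""
    # URL-encode the keyword so multi-word queries like "peugeot 208" stay valid.
    url = f"https://www.leboncoin.fr/recherche?text={quote_plus(keyword)}&category=2"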
    ads = await scrape_search(url, max_pages=max_pages)

    vehicle_fields = {"brand", "model", "regdate", "mileage", "fuel", "gearbox"}
    summary: List[Dict] = []
    for ad in ads:
        flat_attrs = {a["key"]: a.get("value_label") for a in ad.get("attributes", [])}
        summary.append({
            "list_id": ad["list_id"],
            "title": ad["subject"],
            "price_eur": ad["price"][0] if ad.get("price") else None,
            "city": ad.get("location", {}).get("city"),
            **{k: flat_attrs.get(k) for k in vehicle_fields},
        })
    return summary

vehicles = asyncio.run(scrape_vehicles("peugeot 208", max_pages=2))
print(json.dumps(vehicles[:2], indent=2, ensure_ascii=False))
```



The `scrape_vehicles` coroutine reuses `scrape_search` from earlier and layers a projection step on top, flattening the attributes array into a dictionary keyed by attribute name and keeping only the vehicle fields. Sample output looks like this:

```json
[
  {
    "list_id": 3131838929,
    "title": "Peugeot 208 Peugeot 208 Allure automatique",
    "price_eur": 13990,
    "city": "Apprieu",
    "mileage": "59600 km",
    "brand": "Peugeot",
    "regdate": "2021",
    "fuel": "Essence",
    "gearbox": "Automatique",
    "model": "208"
  },
  {
    "list_id": 3180211735,
    "title": "Peugeot 208",
    "price_eur": 4900,
    "city": "Vénissieux",
    "mileage": "103229 km",
    "brand": "Peugeot",
    "regdate": "2015",
    "fuel": "Essence",
    "gearbox": "Manuelle",
    "model": "208"
  }
]
```



The table below summarizes the most useful Leboncoin category codes and the attributes worth extracting for each.

| Category | URL parameter | Key attributes |
|---|---|---|
| Real estate | `category=9` | `square`, `rooms`, `energy_rate`, `real_estate_type` |
| Vehicles | `category=2` | `brand`, `model`, `regdate`, `mileage`, `fuel`, `gearbox` |
| Electronics | `category=17` | `item_condition`, `brand` |
| Furniture | `category=19` | `item_condition`, `shipping_type` |
| Jobs | `category=33` | `contract_type`, `salary_range` |
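The projection step used for vehicles generalizes to any row of the table. The sketch below assumes the same `attributes` shape (a list of objects with `key` and `value_label`) used elsewhere in this guide. `project_ad` is a hypothetical helper and `sample_ad` is made-up data for illustration only.

```python
from typing import Dict, Iterable

def project_ad(ad: Dict, fields: Iterable[str]) -> Dict:
    """Keep the common fields plus the requested category-specific attributes."""
    flat_attrs = {a["key"]: a.get("value_label") for a in ad.get("attributes", [])}
    row = {"list_id": ad.get("list_id"), "title": ad.get("subject")}
    # Attributes missing from the ad are kept as None so every row has the same keys.
    row.update({field: flat_attrs.get(field) for field in fields})
    return row

REAL_ESTATE_FIELDS = ["square", "rooms", "energy_rate", "real_estate_type"]

# Made-up ad for illustration; real ads carry many more fields.
sample_ad = {
    "list_id": 1,
    "subject": "Maison 4 pièces",
    "attributes": [
        {"key": "square", "value_label": "95 m²"},
        {"key": "rooms", "value_label": "4"},
    ],
}

print(project_ad(sample_ad, REAL_ESTATE_FIELDS))
```

Passing the field list as an argument keeps one helper per pipeline instead of one function per category.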

With multi category support in place, the next section handles what to do with the data once the scraper has collected it.



## How to Save Scraped Leboncoin Data to JSON and CSV

The simplest way to persist scraped Leboncoin data is Python's built in `json` and `csv` modules, with JSON preserving the nested structure and CSV offering a flatter view for spreadsheet analysis. Both formats are worth supporting because downstream consumers usually pick one or the other.

```python
import csv
import json
from typing import Dict, List

def save_to_json(ads: List[Dict], filename: str) -> None:
    """Save the full ad objects, including nested fields, to JSON."""
    with open(filename, "w", encoding="utf-8") as file:
        json.dump(ads, file, indent=2, ensure_ascii=False)

def save_to_csv(ads: List[Dict], filename: str) -> None:
    """Flatten ads to a spreadsheet friendly CSV."""
    rows: List[Dict] = []
    for ad in ads:
        flat_attrs = {a["key"]: a.get("value_label") for a in ad.get("attributes", [])}
        rows.append({
            "list_id": ad.get("list_id"),
            "title": ad.get("subject"),
            "price_eur": ad["price"][0] if ad.get("price") else None,
            "category": ad.get("category_name"),
            "city": ad.get("location", {}).get("city"),
            "zipcode": ad.get("location", {}).get("zipcode"),
            "region": ad.get("location", {}).get("region_name"),
            "owner_type": ad.get("owner", {}).get("type"),
            "url": ad.get("url"),
            **flat_attrs,
        })

    fieldnames = sorted({key for row in rows for key in row.keys()})
    with open(filename, "w", encoding="utf-8", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

save_to_json(ads, "leboncoin_ads.json")
save_to_csv(ads, "leboncoin_ads.csv")
```



The `save_to_json` helper dumps the raw ad objects with `ensure_ascii=False`, which keeps French characters readable in the file. The `save_to_csv` helper flattens each ad into a single row and expands the `attributes` array into one column per attribute key, so ads from different categories simply leave the columns they lack empty.

For larger runs, it is worth deduplicating on `list_id` before writing, because Leboncoin occasionally resurfaces the same ad across neighboring search pages. If the scraper is feeding an API or a dashboard, the guide below covers turning scraped output into a serviceable endpoint.

[How to Turn Web Scrapers into Data APIs](https://scrapfly.io/blog/posts/how-to-turn-web-scrapers-into-data-apis): Delivering web scraped data can be a difficult problem. What if we could scrape data on demand? This tutorial builds a data API using FastAPI and Python for real time web scraping.
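The deduplication on `list_id` suggested above takes only a few lines. This is a minimal sketch with a hypothetical `dedupe_ads` helper; it keeps the first occurrence of each ad and, as an assumption, never drops ads that lack a `list_id`.

```python
from typing import Dict, List

def dedupe_ads(ads: List[Dict]) -> List[Dict]:
    """Drop repeated ads, keeping the first occurrence of each list_id."""
    seen = set()
    unique: List[Dict] = []
    for ad in ads:
        list_id = ad.get("list_id")
        if list_id is None:
            unique.append(ad)  # cannot identify the ad, so keep it
            continue
        if list_id in seen:
            continue
        seen.add(list_id)
        unique.append(ad)
    return unique

# Made-up ads: the second page repeats ad 2 from the first page.
scraped = [{"list_id": 1}, {"list_id": 2}, {"list_id": 2}, {"list_id": 3}]
print([ad["list_id"] for ad in dedupe_ads(scraped)])  # [1, 2, 3]
```

Calling `dedupe_ads(ads)` right before `save_to_json` and `save_to_csv` keeps both exports consistent with each other.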



## FAQ

**Is it legal to scrape Leboncoin.fr?**

Scraping publicly visible Leboncoin listings is generally legal, but personal data like seller names and phone numbers falls under GDPR, so downstream storage and processing must have a lawful basis. Keep the request rate reasonable, honor Leboncoin's [robots.txt](https://www.leboncoin.fr/robots.txt), and review Leboncoin's terms of service before running a production scraper. For a broader overview, see our [web scraping legality guide](https://scrapfly.io/is-web-scraping-legal).







**Does Leboncoin have a public API?**

No. Leboncoin does not offer a public API, and the closest alternative is the unofficial [lbc](https://github.com/etienne-hd/lbc) Python library that wraps Leboncoin's internal endpoints. The `lbc` library works for quick experiments but is undocumented, breaks with site changes, and typically requires French residential proxies to avoid 403 responses, so the `__NEXT_DATA__` scraping approach in this guide is the more stable path for production use.







**What anti-bot protection does Leboncoin use?**

Leboncoin uses DataDome for bot detection, which evaluates TLS fingerprints, browser telemetry, and IP reputation rather than only user agent strings. Standard proxy rotation is not enough on its own, because DataDome will still block a rotating residential pool if the TLS handshake looks like Python's default client.







**Can I scrape Leboncoin without Python using a no-code tool?**

Yes. Several no-code and low-code options exist, including [Apify](https://scrapfly.io/compare/apify-alternative) actors for Leboncoin and browser based tools like Piloterr and Axiom.ai. These options are fine for small, ad hoc extractions, but the Python approach in this guide gives full control over pagination, category coverage, data shape, and retry logic, which is what most production pipelines need.







**Can I scrape other European marketplace sites with this approach?**

Yes. Many European classifieds and real estate sites run on Next.js or a similar framework that embeds full JSON state in the HTML, so the `__NEXT_DATA__` pattern transfers directly. For concrete examples, see our guides to [How to Scrape Idealista.com](https://scrapfly.io/blog/posts/how-to-scrape-idealista) and [How to Scrape Immobilienscout24.de Real Estate Data](https://scrapfly.io/blog/posts/how-to-scrape-immobillienscout24-real-estate-property-data). Each site has its own anti-bot stack, but the JSON extraction pattern is the same.










## Summary

This guide built a complete Leboncoin.fr scraper in Python, covering search pagination, ad detail pages, a non real estate category example, JSON and CSV export, and retry logic with exponential backoff. The core idea throughout is that Leboncoin's Next.js frontend serves the full listing data as JSON inside `__NEXT_DATA__`, so a single parsing pattern replaces dozens of brittle CSS selectors.

From here, the scraper is ready to grow in a few obvious directions. Customizing the category filter set turns the same code into a targeted vehicle or furniture tracker. The full source lives on the [Scrapfly scrapers GitHub repo](https://github.com/scrapfly/scrapfly-scrapers/tree/main/leboncoin-scraper) and is a good starting point for customization.

You can of course build the bypass layer yourself with fingerprint aware HTTP clients and a managed proxy pool, but outsourcing that layer lets the team stay focused on the data rather than the arms race.



**Legal Disclaimer and Precautions**

This tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect:

- Do not scrape at rates that could damage the website.
- Do not scrape data that's not available publicly.
- Do not store PII of EU citizens protected by GDPR.
- Do not repurpose *entire* public datasets which can be illegal in some countries.

Scrapfly does not offer legal advice but these are good general rules to follow. For more you should consult a lawyer.

 
