# How to Scrape Imovelweb Without Getting Blocked in 2026

by [Ziad Shamndy](https://scrapfly.io/blog/author/ziad), May 29, 2026


Imovelweb is one of Brazil's biggest real estate marketplaces. If you're comparing prices, tracking supply, or building a lead pipeline, scraping it can save days of manual work. The catch is that Imovelweb sits behind DataDome and favors Brazilian IPs. Plain HTTP requests from outside Brazil typically return a 403 or a DataDome challenge before any listing data is reachable.

In this guide, we use [Scrapfly's Web Scraping API](https://scrapfly.io/web-scraping-api) to clear DataDome and route through Brazilian IPs. The `data-qa` hooks give us stable selectors for listing cards, and the JSON-LD `House` blocks give us property details. Let's get started.



[**Latest imovelweb scraper code**: github.com/scrapfly/scrapfly-scrapers/tree/main/imovelweb-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/imovelweb-scraper)

## Key Takeaways

- Imovelweb's listing cards stay stable across deploys through `data-qa` analytics hooks (`POSTING_CARD_PRICE`, `POSTING_CARD_FEATURES`, `POSTING_CARD_DESCRIPTION`) plus the `data-to-posting` URL attribute. Anchor your selectors on those, not the obfuscated CSS class names that change between releases.
- Property detail pages ship a JSON-LD `House` block with the canonical title, BRL price from `offers.price`, the structured Brazilian address, and a gallery image list. Parse JSON-LD first and fall back to the `h1` only for the cases where it's missing.
- DataDome blocks the first request from non-Brazilian IPs with a 403 or a `geo.captcha-delivery.com` challenge page. Localized `Accept-Language: pt-BR` headers help against soft checks, but TLS fingerprint parity and a Brazilian IP are what carry a session across multiple listing pages.
- Scrapfly's `asp=True` solves DataDome's TLS, behavior, and cookie checks in one call. `country="br"` routes through a Brazilian residential IP, and `render_js=True` waits for the listing grid to hydrate before returning the HTML.
- For production runs, [Scrapfly's DataDome bypass](https://scrapfly.io/bypass/datadome) returns clean HTML so the `parse_listings` and `parse_property` functions in this guide drop straight into a concurrent crawl loop. No proxy pool to maintain, no fingerprint logic, no CSS selectors to repair after every Imovelweb deploy.

## Why Is Imovelweb Hard to Scrape?

Imovelweb runs DataDome on top of regional access controls. A plain `requests.get()` from outside Brazil usually returns a 403 with a `geo.captcha-delivery.com` challenge page instead of property HTML.

There are three reasons scraping Imovelweb is harder than most real estate sites:

- DataDome inspects your TLS fingerprint, HTTP/2 settings, and behavior signals. A plain Python request looks nothing like a real browser, and DataDome flags it on the first hop.
- Imovelweb is a Brazilian marketplace, so non-Brazilian IPs raise the suspicion score. Datacenter ASNs from cloud providers get blocked right away.
- The listing grid hydrates client-side. Even when you clear DataDome, you need JavaScript to render before the cards appear in the DOM.
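To make the failure mode above concrete, here is a minimal sketch of a check you could run on any raw response before handing it to a parser. The function name `looks_blocked` and the marker constant are illustrative assumptions, not part of any library. The 403/429 status codes and the `geo.captcha-delivery.com` host come from this guide.

```python
# Hypothetical helper: decide whether a response is a DataDome block page.
DATADOME_MARKER = "geo.captcha-delivery.com"  # challenge host named in this guide


def looks_blocked(status_code: int, body: str) -> bool:
    """Return True when the response looks like a block instead of listing HTML."""
    # 403 is the typical hard block; 429 signals rate limiting.
    if status_code in (403, 429):
        return True
    # A challenge page can also arrive with a 200 status, so inspect the body.
    return DATADOME_MARKER in body.lower()
```

A check like this is useful for logging and retries: it keeps block pages out of the parsing stage, where they would otherwise show up as silently empty result lists.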

## How to Set Up Scrapfly for Imovelweb

The setup runs through the Scrapfly Python SDK. Install it first, then build a base request profile that handles DataDome and Brazilian geolocation in one place.

```bash
pip install scrapfly-sdk
```



The SDK wraps the Scrapfly Web Scraping API and exposes the `ScrapflyClient` and `ScrapeConfig` you need. Get an API key from the [Scrapfly dashboard](https://scrapfly.io/dashboard) and export it as `SCRAPFLY_KEY`.

```python
import os
import json
import re
from scrapfly import ScrapeConfig, ScrapflyClient

SCRAPFLY = ScrapflyClient(key=os.environ["SCRAPFLY_KEY"])

BASE_CONFIG = {
    "asp": True,                              # bypass DataDome
    "render_js": True,                        # wait for the listing grid to hydrate
    "country": "br",                          # Brazilian residential IP
    "proxy_pool": "public_residential_pool",  # avoid datacenter ASNs
}
```



`asp=True` turns on Anti-Scraping Protection, which solves DataDome's TLS, behavior, and cookie checks. `country="br"` routes through a Brazilian residential IP that Imovelweb treats as a real visitor. `render_js=True` waits for the Vue.js listing grid to populate before Scrapfly returns the HTML. Every snippet below reuses this `BASE_CONFIG`.
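If one call needs different settings, it is safer to derive a new dict than to mutate the shared profile. The sketch below assumes a hypothetical helper named `with_overrides`. The `render_js=False` override is only an experiment you might try on detail pages, since their JSON-LD ships in the initial HTML. Measure the results before relying on it.

```python
# Same profile as above, repeated so this snippet runs on its own.
BASE_CONFIG = {
    "asp": True,
    "render_js": True,
    "country": "br",
    "proxy_pool": "public_residential_pool",
}


def with_overrides(base: dict, **overrides) -> dict:
    """Return a copy of `base` with `overrides` applied; `base` is left untouched."""
    merged = dict(base)
    merged.update(overrides)
    return merged


# Hypothetical detail-page profile: skip browser rendering, keep everything else.
DETAIL_CONFIG = with_overrides(BASE_CONFIG, render_js=False)
```

You would then pass it the same way as the base profile, for example `ScrapeConfig(url, **DETAIL_CONFIG)`.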

## How Do You Scrape Imovelweb Listings?



A search URL like `https://www.imovelweb.com.br/casas-venda-sao-paulo-sp.html` shows a grid of property cards. Each card carries the price in BRL, the area in m², room counts, a thumbnail, and the link to the detail page. Imovelweb hashes its CSS classes between releases, but the analytics layer keeps a stable set of `data-qa` attributes on every card. Anchor your selectors there.

The fields worth pulling from each card:

- **price**: `[data-qa="POSTING_CARD_PRICE"]` text content (e.g., `R$ 660.000`).
- **features**: `[data-qa="POSTING_CARD_FEATURES"] span` (e.g., `100 m² tot.`, `2 quartos`, `1 banheiro`).
- **description link**: `[data-qa="POSTING_CARD_DESCRIPTION"] a` for the title text and URL.
- **canonical URL**: the `data-to-posting` attribute on the card itself.
- **thumbnail**: the first `img` inside `.postingGallery-module__gallery-container`.

```python
def parse_listings(result):
    """Parse an Imovelweb search page into a list of property summaries."""
    sel = result.selector
    listings = []
    for card in sel.css(".postingCardLayout-module__posting-card-layout"):
        title = card.css('[data-qa="POSTING_CARD_DESCRIPTION"] a::text').get()
        price = " ".join(
            t.strip()
            for t in card.css('[data-qa="POSTING_CARD_PRICE"] ::text').getall()
            if t.strip()
        )

        area = beds = baths = None
        for txt in card.css('[data-qa="POSTING_CARD_FEATURES"] span::text').getall():
            t = txt.strip()
            if not area and "m²" in t:
                area = t
            elif not beds and re.search(r"\bquartos?\b", t, re.I):
                beds = t
            elif not baths and re.search(r"\bban(\.|heiros?)\b", t, re.I):
                baths = t

        link = card.css("::attr(data-to-posting)").get()
        if link and link.startswith("/"):
            link = f"https://www.imovelweb.com.br{link}"

        thumb = card.css(".postingGallery-module__gallery-container img::attr(src)").get()

        if link:
            listings.append({
                "title": title.strip() if title else None,
                "price": price or None,
                "area": area,
                "bedrooms": beds,
                "bathrooms": baths,
                "url": link,
                "thumbnail": thumb,
            })
    return listings


async def scrape_listings(url: str):
    """Fetch one Imovelweb search page through Scrapfly and parse it."""
    result = await SCRAPFLY.async_scrape(ScrapeConfig(url, **BASE_CONFIG))
    return parse_listings(result)
```



The feature loop scans each span once and matches on the Portuguese keywords (`m²`, `quartos`, `banheiro(s)`) so the same parser keeps working when Imovelweb rearranges the chip order. The `data-to-posting` attribute carries the canonical URL even when the visible anchor's `href` is partial, which keeps the URL list clean.

Example output (`COUNT=30`, first two records shown):

```json
{
  "title": "Excelente sobrado localizado em Rua Tranquila, com 2 dormitórios...",
  "price": "R$ 660.000",
  "area": "100 m² tot.",
  "bedrooms": "2 quartos",
  "bathrooms": null,
  "url": "https://www.imovelweb.com.br/propriedades/sobrado-2-quartos-a-venda-em-santo-amaro-2958319180.html",
  "thumbnail": "https://imgbr.imovelwebcdn.com/avisos/2/29/58/31/91/80/360x266/4674175236.jpg"
}
{
  "title": "Sobre o imóvel: Valores: - Valor de venda: R$ Ver dados...",
  "price": "R$ 800.000",
  "area": "69 m² tot.",
  "bedrooms": "3 quartos",
  "bathrooms": "1 banheiro",
  "url": "https://www.imovelweb.com.br/propriedades/casa-a-venda-santo-amaro-3-quartos-69-m-sao-2973088102.html",
  "thumbnail": "https://imgbr.imovelwebcdn.com/avisos/2/29/73/08/81/02/360x266/3597187955.jpg"
}
  
```
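The parser keeps every field as the display string shown on the card. For price comparisons you will usually want numbers instead. The following is a rough sketch of that step. The helper names (`first_number`, `normalize_listing`) and the added keys are assumptions for illustration. It also assumes Brazilian number formatting, where `.` is the thousands separator.

```python
import re
from typing import Optional


def first_number(text: Optional[str]) -> Optional[int]:
    """Extract the first integer from a display string such as 'R$ 660.000' or '100 m² tot.'."""
    if not text:
        return None
    # Start on a digit, then allow dots used as thousands separators.
    match = re.search(r"\d[\d.]*", text)
    if not match:
        return None
    return int(match.group().replace(".", ""))


def normalize_listing(listing: dict) -> dict:
    """Return a copy of a listing summary with numeric companions for the text fields."""
    return {
        **listing,
        "price_brl": first_number(listing.get("price")),
        "area_m2": first_number(listing.get("area")),
        "bedrooms_count": first_number(listing.get("bedrooms")),
        "bathrooms_count": first_number(listing.get("bathrooms")),
    }
```

Fields that are missing on the card stay `None`, and a price label with no digits also yields `None` rather than raising an error.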



### How Does Imovelweb Handle Pagination?

Imovelweb's search results paginate with a `-pagina-N` suffix inserted before the `.html` extension. The first page has no suffix (`casas-venda-sao-paulo-sp.html`), and subsequent pages add the page number (`casas-venda-sao-paulo-sp-pagina-2.html`, `-pagina-3.html`, and so on). You can detect the next page from the `rel="next"` link in the rendered HTML, or build the URLs upfront and fan them out with `concurrent_scrape`:

```python
async def scrape_all_pages(base_url: str, max_pages: int = 3):
    """Scrape multiple Imovelweb result pages concurrently."""
    first = await SCRAPFLY.async_scrape(ScrapeConfig(base_url, **BASE_CONFIG))
    listings = parse_listings(first)

    urls = [base_url.replace(".html", f"-pagina-{p}.html") for p in range(2, max_pages + 1)]
    others = [ScrapeConfig(u, **BASE_CONFIG) for u in urls]
    async for response in SCRAPFLY.concurrent_scrape(others):
        listings.extend(parse_listings(response))
    return listings
```



`concurrent_scrape` runs the remaining pages in parallel, so pages 2 onward finish in roughly one page's wall time instead of scaling linearly. Counting the initial request, a three-page crawl takes about two page loads rather than three. Each page returns around 30 cards, so three pages give you about 90 listings.
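The URL pattern described in this section can also be wrapped in a small builder. This is a sketch under the assumption that search URLs always end in `.html` with no query string. The name `page_url` is hypothetical.

```python
import re


def page_url(search_url: str, page: int) -> str:
    """Build the URL of result page `page` using the `-pagina-N.html` convention."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not search_url.endswith(".html"):
        raise ValueError("expected a search URL ending in .html")
    # Strip any existing -pagina-N suffix so already-paginated URLs work as input.
    stem = re.sub(r"(-pagina-\d+)?\.html$", "", search_url)
    # Page 1 is the plain URL; later pages carry the suffix.
    return f"{stem}.html" if page == 1 else f"{stem}-pagina-{page}.html"
```

Compared with a bare string replace, this accepts a URL that already points at a later page and rejects inputs that don't match the expected shape.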

## How Do You Scrape Imovelweb Property Detail Pages?

Property detail pages are where the real depth is. A single URL like `https://www.imovelweb.com.br/propriedades/sobrado-2-quartos-a-venda-em-santo-amaro-2958319180.html` ships a JSON-LD `House` block with the canonical title, structured address, BRL price, and image gallery. The visible HTML uses the same hashed CSS classes as the listing grid, so the JSON-LD payload is the cleanest extraction target.

```python
def parse_property(result):
    """Parse an Imovelweb property page from the JSON-LD House block."""
    sel = result.selector
    out = {}

    for script in sel.css('script[type="application/ld+json"]::text').getall():
        try:
            data = json.loads(script)
        except json.JSONDecodeError:
            continue
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not (isinstance(item, dict) and (item.get("@type") or item.get("offers"))):
                continue
            out["type"] = item.get("@type")
            out["title"] = item.get("name") or item.get("headline")
            if isinstance(item.get("offers"), dict):
                out["price"] = item["offers"].get("price")
                out["currency"] = item["offers"].get("priceCurrency", "BRL")
            addr = item.get("address")
            if isinstance(addr, dict):
                out["address"] = ", ".join(filter(None, [
                    addr.get("streetAddress"),
                    addr.get("addressLocality"),
                    addr.get("addressRegion"),
                ]))
            images = item.get("image")
            if isinstance(images, list):
                out["images"] = images
            elif isinstance(images, str):
                out["images"] = [images]
            break

    if not out.get("title"):
        h1 = sel.css("h1::text").get()
        if h1:
            out["title"] = h1.strip()

    return out


async def scrape_properties(urls: list[str]):
    """Scrape multiple Imovelweb property detail pages concurrently."""
    configs = [ScrapeConfig(url, **BASE_CONFIG) for url in urls]
    properties = []
    async for response in SCRAPFLY.concurrent_scrape(configs):
        properties.append(parse_property(response))
    return properties
```



The `@type` field tells you whether the listing is a `House`, `Apartment`, or `SingleFamilyResidence`, so you can branch downstream logic if you're segmenting by property type. The address block carries `streetAddress`, `addressLocality`, and `addressRegion` as separate fields, which makes it easier to join against external datasets than parsing a single freeform string.

Example output:

```json

{
  "type": "House",
  "title": "Casa à venda com 2 Quartos, Santo Amaro, São Paulo - R$ 660.000, 100 m2 - ID: 2958319180 - Imovelweb",
  "address": "R SABARABUÇU, São Paulo, São Paulo, Brasil, , Santo Amaro",
  "images": [
    "https://imgbr.imovelwebcdn.com/avisos/2/29/58/31/91/80/720x532/4674175236.jpg"
  ]
}
  
```
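`parse_property` accepts any JSON-LD item that has an `@type` or `offers` key. If a page carries several JSON-LD blocks, you may prefer to pick only the property item. The sketch below is one way to do that with the standard library alone. The `PROPERTY_TYPES` set reuses the three types named in this section. The helper names are hypothetical, and the presence of other block types on a page is an assumption rather than something verified here.

```python
import json
from html.parser import HTMLParser

# Property types mentioned in this guide; extend if the site uses others.
PROPERTY_TYPES = {"House", "Apartment", "SingleFamilyResidence"}


class _JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_property_item(html: str):
    """Return the first JSON-LD item whose @type is a known property type, else None."""
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if PROPERTY_TYPES.intersection(t for t in types if isinstance(t, str)):
                return item
    return None
```

The returned dict has the same shape that `parse_property` reads from, so the `offers`, `address`, and `image` handling applies unchanged.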



Feed the `url` field from each listing into `scrape_properties` and you have a two-stage pipeline. Cheap listing pulls handle discovery, and full detail pulls handle the properties you want to track.
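Between the two stages you typically want to drop duplicates and skip properties you have already fetched. A minimal sketch of that hand-off follows. The function name `select_for_detail` and its `limit` parameter are assumptions, and `seen_urls` stands in for whatever store you use to remember past runs.

```python
from typing import Iterable, List, Set


def select_for_detail(listings: Iterable[dict], seen_urls: Set[str], limit: int = 20) -> List[str]:
    """Pick up to `limit` listing URLs that are new, unique, and in page order."""
    picked: List[str] = []
    picked_set: Set[str] = set()
    for listing in listings:
        url = listing.get("url")
        # Skip cards without a URL, URLs from earlier runs, and in-run duplicates.
        if not url or url in seen_urls or url in picked_set:
            continue
        picked.append(url)
        picked_set.add(url)
        if len(picked) >= limit:
            break
    return picked
```

The returned list can be passed straight to `scrape_properties`, which keeps the more expensive detail requests limited to properties you have not tracked yet.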

## Powering Imovelweb Scraping with Scrapfly



Scrapfly provides web scraping, screenshot, and extraction APIs at scale. For Imovelweb, the [Web Scraping API](https://scrapfly.io/web-scraping-api) handles the DataDome, Brazilian geolocation, and JavaScript rendering hurdles that stop most DIY scrapers.

- [Anti-Scraping Protection bypass](https://scrapfly.io/docs/scrape-api/anti-scraping-protection) solves DataDome's TLS, behavior, and cookie checks with `asp=True`.
- [Smart proxy rotation](https://scrapfly.io/docs/scrape-api/proxy) routes Imovelweb traffic through Brazilian residential IPs with `proxy_pool="public_residential_pool"`.
- [JavaScript rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering) waits for the Vue.js listing grid to hydrate before returning HTML.
- [Smart caching](https://scrapfly.io/docs/scrape-api/getting-started#api_param_cache) keeps repeat scrapes cheap during selector development.
- [Python SDK](https://scrapfly.io/docs/sdk/python) with `concurrent_scrape` for parallel pagination and detail crawls.

For more on anti-bot strategies in general, see our guide on bypassing anti-bot protection.






## FAQ

**Why do I see a DataDome page or get 403/429?**

Your requests likely lack a real browser fingerprint or the IP isn't in Brazil. Scrapfly's `asp=True` plus `country="br"` clears both layers in one call.







**Do I need JavaScript rendering for Imovelweb?**

Yes for listing pages, because the card grid hydrates client-side. Property detail pages ship the JSON-LD payload in the initial HTML, but `render_js=True` still helps with consistency.







**Is HTML parsing enough if JSON-LD is missing?**

Yes. Target the same `data-qa` analytics hooks for the listing fields, and fall back to the `h1` plus a BRL price regex on detail pages. Review the selectors quarterly as Imovelweb evolves.
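As a rough illustration of the BRL price regex mentioned in the last answer, the sketch below pulls the first `R$` amount out of free text such as an `h1` or page title. The pattern and the name `find_brl_price` are assumptions. The pattern expects Brazilian formatting, with `.` for thousands and an optional `,` followed by two digits for centavos.

```python
import re
from decimal import Decimal
from typing import Optional

# "R$", optional spaces, dot-grouped thousands, optional ",NN" centavos.
BRL_PRICE = re.compile(r"R\$\s*(\d{1,3}(?:\.\d{3})*)(?:,(\d{2}))?")


def find_brl_price(text: str) -> Optional[Decimal]:
    """Return the first BRL amount found in `text`, or None when there is none."""
    match = BRL_PRICE.search(text)
    if not match:
        return None
    whole = match.group(1).replace(".", "")
    cents = match.group(2) or "00"
    return Decimal(f"{whole}.{cents}")
```

Run against the detail-page title shown earlier, which contains `R$ 660.000, 100 m2`, the optional centavos group does not match because the comma is followed by a space. The result is 660000 reais.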









## Summary

Scraping Imovelweb comes down to two things. First, clear DataDome and route through a Brazilian residential IP to receive real HTML. Then parse the `data-qa` analytics hooks on listing cards and the JSON-LD `House` block on detail pages. Both layers stay stable across deploys because they exist for SEO and analytics, not styling.

[Scrapfly's DataDome bypass](https://scrapfly.io/bypass/datadome) handles the anti-bot layer and Brazilian residential proxies handle the geolocation. The `parse_listings` and `parse_property` functions in this guide drop straight into a concurrent crawl loop. No proxy pool to maintain, no TLS tuning, and no CSS selectors to repair after every Imovelweb deploy.



**Legal Disclaimer and Precautions**

This tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect:

- Do not scrape at rates that could damage the website.
- Do not scrape data that's not available publicly.
- Do not store PII of EU citizens protected by GDPR.
- Do not repurpose *entire* public datasets which can be illegal in some countries.

Scrapfly does not offer legal advice but these are good general rules to follow. For more you should consult a lawyer.

 
