When it comes to real estate websites in Australia, there are a few options, and Realestate.com.au is the biggest one. It's a popular website for real estate ads, featuring thousands of property listings across the country. However, it's a highly protected website, making it challenging to scrape.
In this article, we'll explain how to scrape realestate.com.au for real estate data from property and search pages. We'll also explain how to avoid realestate.com.au web scraping blocking. Let's dive in!
This tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect and here's a good summary of what not to do:
Do not scrape at rates that could damage the website.
Do not scrape data that's not available publicly.
Do not store PII of EU citizens who are protected by GDPR.
Do not repurpose entire public datasets, which can be illegal in some countries.
Scrapfly does not offer legal advice, but these are good general rules to follow in web scraping. For more, you should consult a lawyer.
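The first rule above, keeping the request rate low, can be enforced in code. Here is a minimal sketch that caps concurrency with asyncio.Semaphore. The names polite_fetch, MAX_CONCURRENCY and DELAY_SECONDS are hypothetical, the values are assumptions to tune per site, and the network call is replaced by a sleep so the example runs offline:

```python
import asyncio

MAX_CONCURRENCY = 2   # assumed limit, tune for the target website
DELAY_SECONDS = 0.05  # assumed stand-in for request time; not a recommended value


async def polite_fetch(url: str, semaphore: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical fetch helper: the sleep is a placeholder for a real request."""
    async with semaphore:  # at most MAX_CONCURRENCY coroutines pass this point at once
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(DELAY_SECONDS)
        stats["active"] -= 1
    return url


async def main(urls: list) -> tuple:
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    stats = {"active": 0, "peak": 0}
    fetched = await asyncio.gather(*(polite_fetch(u, semaphore, stats) for u in urls))
    return fetched, stats["peak"]


urls = [f"https://example.com/page-{i}" for i in range(6)]
fetched, peak = asyncio.run(main(urls))
print(f"fetched {len(fetched)} pages, peak concurrency {peak}")
```

The same semaphore could wrap a real HTTP call in place of the sleep, so that only a fixed number of requests are in flight at any time.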
Why Scrape Realestate.com.au?
Realestate.com.au includes thousands of property listing pages and manually navigating these pages can be a tedious and time-consuming task. Realestate.com.au scraping makes it easy to search and navigate through a significant amount of property listings in no time.
Web scraping realestate.com.au lets businesses, traders and buyers analyze market trends, giving them a better understanding of the market and a competitive edge when making investment decisions.
scrapfly-sdk: Python SDK for ScrapFly, a web scraping API that allows for scraping at scale without getting blocked.
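Since asyncio comes pre-installed in Python, you only need to install the other libraries using the following pip command. The http2 extra is needed because the httpx client in this guide is created with http2=True, and the brotli extra lets httpx decode the br encoding it advertises:
pip install "httpx[http2,brotli]" parsel jmespath scrapfly-sdk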
How to Scrape Realestate.com.au Property Pages
Let's begin by scraping property pages on realestate.com.au. Go to any property listing on the website like this property listing and you will get a page similar to this:
Instead of parsing this page's data using selectors, we'll use the hidden web data method.
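To make the hidden web data idea concrete, here is a minimal offline sketch. It builds a tiny synthetic page in which a JSON object is assigned to window.ArgonautExchange inside a script tag, with inner values stored as JSON strings that need decoding again. The key names "demo-app", "cache" and "query-1" are made up for illustration and are not the real site's keys:

```python
import json
import re

# Build a synthetic page: JSON nested inside JSON strings, three layers deep.
inner = json.dumps({"listing": {"id": "123", "suburb": "Tarneit"}})
cache = json.dumps({"query-1": {"data": inner}})
state = json.dumps({"demo-app": {"cache": cache}})
html = f"<html><body><script>window.ArgonautExchange={state};</script></body></html>"


def extract_hidden_data(page: str) -> dict:
    """Pull the JSON object out of the script text and decode each layer."""
    match = re.search(r"window\.ArgonautExchange=(\{.+\});", page)
    if match is None:
        raise ValueError("hidden data script not found")
    outer = json.loads(match.group(1))            # layer 1: the assigned object
    cached = json.loads(outer["demo-app"]["cache"])  # layer 2: a JSON string value
    first_entry = next(iter(cached.values()))
    return json.loads(first_entry["data"])        # layer 3: another JSON string


data = extract_hidden_data(html)
print(data["listing"]["suburb"])  # Tarneit
```

The real page follows the same shape but is far larger, which is why the scraper later in this section decodes the data several times before it reaches the listing.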
To view this data, open the browser developer tools by pressing the F12 key to view the page HTML and scroll down to the script tag that starts with the window.ArgonautExchange text. You will see messy JSON data that looks like this after parsing:
To scrape realestate.com.au property pages, we'll select this script and parse the JSON data inside it:
Python
ScrapFly
import re
import json
import asyncio
import jmespath
from httpx import AsyncClient, Response
from parsel import Selector
from typing import List, Dict

client = AsyncClient(
    # enable http2
    http2=True,
    # add basic browser headers to minimize blocking chances
    headers={
        "accept-language": "en-US,en;q=0.9",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "accept-encoding": "gzip, deflate, br",
    },
)

def parse_property_data(data: Dict) -> Dict:
    """refine property data from JSON"""
    if not data:
        return
    result = jmespath.search(
        """{
        id: id,
        propertyType: propertyType.display,
        description: description,
        propertyLink: _links.canonical.href,
        address: address,
        propertySizes: propertySizes,
        generalFeatures: generalFeatures,
        propertyFeatures: propertyFeatures[].{featureName: displayLabel, value: value},
        images: media.images[].templatedUrl,
        videos: videos,
        floorplans: floorplans,
        listingCompany: listingCompany.{name: name, id: id, companyLink: _links.canonical.href, phoneNumber: businessPhone, address: address.display.fullAddress, ratingsReviews: ratingsReviews, description: description},
        listers: listers,
        auction: auction
        }
        """,
        data,
    )
    return result


def parse_hidden_data(response: Response) -> Dict:
    """parse JSON data from script tag"""
    selector = Selector(response.text)
    script = selector.xpath(
        "//script[contains(text(),'window.ArgonautExchange')]/text()"
    ).get()
    # data needs to be parsed multiple times
    data = json.loads(re.findall(r"window.ArgonautExchange=(\{.+\});", script)[0])
    data = json.loads(data["resi-property_listing-experience-web"]["urqlClientCache"])
    data = json.loads(list(data.values())[0]["data"])
    return data


async def scrape_properties(urls: List[str]) -> List[Dict]:
    """scrape listing data from property pages"""
    # add the property pages URLs to a scraping list
    to_scrape = [client.get(url) for url in urls]
    properties = []
    # scrape all the property pages concurrently
    for response in asyncio.as_completed(to_scrape):
        response = await response
        assert response.status_code == 200, "request has been blocked"
        data = parse_hidden_data(response)["details"]["listing"]
        data = parse_property_data(data)
        properties.append(data)
    print(f"scraped {len(properties)} property listings")
    return properties
import re
import json
import asyncio
import jmespath
from typing import Dict, List
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

def parse_property_data(data: Dict) -> Dict:
    """refine property data from JSON"""
    if not data:
        return
    result = jmespath.search(
        """{
        id: id,
        propertyType: propertyType.display,
        description: description,
        propertyLink: _links.canonical.href,
        address: address,
        propertySizes: propertySizes,
        generalFeatures: generalFeatures,
        propertyFeatures: propertyFeatures[].{featureName: displayLabel, value: value},
        images: media.images[].templatedUrl,
        videos: videos,
        floorplans: floorplans,
        listingCompany: listingCompany.{name: name, id: id, companyLink: _links.canonical.href, phoneNumber: businessPhone, address: address.display.fullAddress, ratingsReviews: ratingsReviews, description: description},
        listers: listers,
        auction: auction
        }
        """,
        data,
    )
    return result


def parse_hidden_data(response: ScrapeApiResponse) -> Dict:
    """parse JSON data from script tag"""
    selector = response.selector
    script = selector.xpath(
        "//script[contains(text(),'window.ArgonautExchange')]/text()"
    ).get()
    # data needs to be parsed multiple times
    data = json.loads(re.findall(r"window.ArgonautExchange=(\{.+\});", script)[0])
    data = json.loads(data["resi-property_listing-experience-web"]["urqlClientCache"])
    data = json.loads(list(data.values())[0]["data"])
    return data


async def scrape_properties(urls: List[str]) -> List[Dict]:
    """scrape listing data from property pages"""
    # add the property pages URLs to a scraping list
    to_scrape = [ScrapeConfig(url, country="AU", asp=True) for url in urls]
    properties = []
    # scrape all the property pages concurrently
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        data = parse_hidden_data(response)["details"]["listing"]
        data = parse_property_data(data)
        properties.append(data)
    print(f"scraped {len(properties)} property listings")
    return properties
Run the code
async def run():
    data = await scrape_properties(
        urls=[
            "https://www.realestate.com.au/property-house-vic-tarneit-143160680",
            "https://www.realestate.com.au/property-house-vic-bundoora-141557712",
            "https://www.realestate.com.au/property-townhouse-vic-glenroy-143556608",
        ]
    )
    # print the data in JSON format
    print(json.dumps(data, indent=2))


if __name__ == "__main__":
    asyncio.run(run())
🙋 If you are getting errors while running the Python code tabs, this is likely due to getting blocked. Run the ScrapFly code tabs to avoid getting blocked.
The above code uses three functions. Let's break them down:
parse_hidden_data() for extracting the JSON data from the script tag and parsing it as a valid JSON object.
parse_property_data() for refining the JSON data we got and excluding the unnecessary details.
scrape_properties() for scraping the property pages by adding the page URLs into a scraping list and scraping them concurrently.
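The concurrency pattern behind scrape_properties() can be tried in isolation. The following is a small sketch, not the scraper itself: fake_fetch is a hypothetical stand-in for the HTTP call that sleeps for a random time, and the example.com URLs are placeholders. It shows that asyncio.as_completed hands back responses in the order they finish rather than the order they were submitted:

```python
import asyncio
import random


async def fake_fetch(url: str) -> dict:
    """Hypothetical stand-in for an HTTP GET: sleeps instead of using the network."""
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return {"url": url, "status_code": 200}


async def scrape_all(urls: list) -> list:
    # create all requests up front so they run concurrently
    pending = [fake_fetch(url) for url in urls]
    results = []
    # as_completed yields awaitables in completion order, not submission order
    for next_done in asyncio.as_completed(pending):
        response = await next_done
        results.append(response)
    return results


urls = [f"https://example.com/property-{i}" for i in range(5)]
results = asyncio.run(scrape_all(urls))
print(sorted(r["url"] for r in results))
```

Because results arrive in completion order, any code that needs the original order should carry an identifier such as the URL with each result, as done here.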
Here is a sample output of the result we got:
Sample output
[
{
"id": "143160680",
"propertyType": "House",
"description": "Renowned Real Estate proudly presents this sensational opportunity with a luxury house in Tarneit.<br/><br/>This beautiful low maintenance home is situated in the well-established suburb of Tarneit.<br/>Suitable for Various Buyers: This property is ideal for young families, downsizers, and investors.<br/>Convenience: It's conveniently located a short distance from the Tarneit west shopping center, local parks, public transport, and well-known primary and secondary schools, including the Islamic College of Melbourne.<br/><br/>Spacious Layout: The house features four generous-sized bedrooms, 1 lounge, and a dining area.<br/>Master Bedroom: The master bedroom includes an ensuite and a walk-in robe for added convenience.<br/>Contemporary Kitchen: The kitchen is modern and overlooks the low maintenance backyard and formal lounge.<br/>Stainless Steel Appliances: It is equipped with stainless steel appliances and ample storage space.<br/><br/>Additional Features:<br/>➡️Open plan living area.<br/>➡️Designated meals area connected with the kitchen and formal lounge.<br/>➡️Ducted heating .<br/>➡️Split system air conditioning in the formal lounge.<br/>➡️Low maintenance front and backyard.<br/><br/>Contact Information: For more information and to schedule an inspection, please contact Himraj at 0452060566",
"propertyLink": "https://www.realestate.com.au/property-house-vic-tarneit-143160680",
"address": {
"suburb": "Tarneit",
"state": "Vic",
"postcode": "3029",
"display": {
"shortAddress": "28 Chantelle Parade",
"__typename": "AddressDisplay",
"fullAddress": "28 Chantelle Parade, Tarneit, Vic 3029",
"geocode": {
"latitude": -37.85273078,
"longitude": 144.66332821,
"__typename": "GeocodeDisplay"
}
},
"__typename": "Address"
},
"propertySizes": {
"building": null,
"land": {
"displayValue": "336",
"sizeUnit": {
"displayValue": "m²",
"__typename": "PropertySizeUnit"
},
"__typename": "PropertySize"
},
"preferred": {
"sizeType": "LAND",
"size": {
"displayValue": "336",
"sizeUnit": {
"displayValue": "m²",
"__typename": "PropertySizeUnit"
},
"__typename": "PropertySize"
},
"__typename": "PreferredPropertySize"
},
"__typename": "PropertySizes"
},
"generalFeatures": {
"bedrooms": {
"value": 4,
"__typename": "IntValue"
},
"bathrooms": {
"value": 2,
"__typename": "IntValue"
},
"parkingSpaces": {
"value": 2,
"__typename": "IntValue"
},
"studies": {
"value": 0,
"__typename": "IntValue"
},
"__typename": "GeneralFeatures"
},
"propertyFeatures": [
{
"featureName": "Built-in wardrobes",
"value": null
},
{
"featureName": "Dishwasher",
"value": null
},
{
"featureName": "Ducted heating",
"value": null
},
{
"featureName": "Ensuites",
"value": {
"__typename": "NumericFeatureValue",
"displayValue": "1"
}
},
{
"featureName": "Evaporative cooling",
"value": null
},
{
"featureName": "Floorboards",
"value": null
},
{
"featureName": "Fully fenced",
"value": null
},
{
"featureName": "Garage spaces",
"value": {
"__typename": "NumericFeatureValue",
"displayValue": "2"
}
},
{
"featureName": "Land size",
"value": {
"__typename": "MeasurementFeatureValue",
"displayValue": "336",
"sizeUnit": {
"id": "SQUARE_METRES",
"displayValue": "m²",
"__typename": "PropertySizeUnit"
}
}
},
{
"featureName": "Living areas",
"value": {
"__typename": "NumericFeatureValue",
"displayValue": "1"
}
},
{
"featureName": "Remote garage",
"value": null
},
{
"featureName": "Secure parking",
"value": null
},
{
"featureName": "Solar panels",
"value": null
},
{
"featureName": "Toilets",
"value": {
"__typename": "NumericFeatureValue",
"displayValue": "2"
}
}
],
"images": [
"https://i2.au.reastatic.net/{size}/d8d3607342301e4e1b5b4cb84e3fc3d8cf48849a6311dd38e44bf3977fc593d8/image.jpg",
"https://i2.au.reastatic.net/{size}/7d26afd862a3d1d58501a724c3532493c4fa7cd2bd297b2ab334039fd40e6c9c/image.jpg",
"https://i2.au.reastatic.net/{size}/cbd580874f3f6aedbf263d77b6de3d0e5e2504925f72502b12838b8228cfdd45/image.jpg",
"https://i2.au.reastatic.net/{size}/12d8b6d3bb5eb40170647f1b81839156eb8526b4c05392158bdbcc6e362a60af/image.jpg",
"https://i2.au.reastatic.net/{size}/c4658347028f409f3e694de3c11d8c84644d5ee4229187cc418bccc26c93dfb7/image.jpg",
"https://i2.au.reastatic.net/{size}/303f8e158603d35ea3c945c5839b437a1548cebec2b7a81eb9bad67593dcc603/image.jpg",
"https://i2.au.reastatic.net/{size}/520ad964d73b7e386c607fc052741ab5fc3b01a2b7b72dc326e614d09bc2d3a5/image.jpg",
"https://i2.au.reastatic.net/{size}/2ac18df655fa961410a2e80d239006ba3860732f1a26d0df4b1f5e51486662f2/image.jpg",
"https://i2.au.reastatic.net/{size}/f53337ce77b54ab95b1a5ea4f679550224defcacdf2344ae8652680382c424cb/image.jpg",
"https://i2.au.reastatic.net/{size}/5249ce376abccad84d0b4f3ce3254579761b4aaffc0ef09c587cf884e6008efc/image.jpg",
"https://i2.au.reastatic.net/{size}/a740d6d1e484c3ae3c51b3670f02a967929ad61771383332998271f69050460c/image.jpg",
"https://i2.au.reastatic.net/{size}/cc1255b415aaee3c4ea82a12aaf653141614dc0297ffe434726a82aeed4b6f75/image.jpg"
],
"videos": null,
"floorplans": null,
"listingCompany": {
"name": "Renowned Real Estate - CRAIGIEBURN",
"id": "PGCQAA",
"companyLink": "https://www.realestate.com.au/agency/renowned-real-estate-craigieburn-PGCQAA?cid={cid}",
"phoneNumber": "0452060566",
"address": "9 Gauja Street, CRAIGIEBURN, VIC 3064",
"ratingsReviews": {
"avgRating": null,
"totalReviews": 0,
"__typename": "AgencyRatingsReviews"
},
"description": null
},
"listers": [
{
"id": "3307736",
"name": "Him Raj Parajuli",
"photo": {
"templatedUrl": "https://i2.au.reastatic.net/{size}/03527ad948f2ec46b10b220c44fa1007b0dc0eded8119733c9135b0be21547f8/main.jpg",
"__typename": "Image"
},
"phoneNumber": {
"display": "0452060566",
"showDisclaimer": false,
"__typename": "PhoneNumber"
},
"_links": {
"canonical": {
"href": "https://www.realestate.com.au/agent/him-raj-parajuli-3307736?cid={cid}",
"__typename": "AbsoluteLinks"
},
"__typename": "ListerLinks"
},
"__typename": "Lister",
"agentId": null,
"jobTitle": "OIEC/Director",
"showInMediaViewer": false,
"listerRatingsReviews": {
"avgRating": null,
"totalReviews": 0,
"__typename": "ListerRatingsReviews"
}
},
{
"id": "3307760",
"name": "Aman Pakhrin",
"photo": {
"templatedUrl": "https://i2.au.reastatic.net/{size}/6b365a8a0ffa9ec976671759a15d136b796ba44f8b973a105b8aabac7ca857e9/main.jpg",
"__typename": "Image"
},
"phoneNumber": {
"display": "0450939749",
"showDisclaimer": false,
"__typename": "PhoneNumber"
},
"_links": {
"canonical": {
"href": "https://www.realestate.com.au/agent/aman-pakhrin-3307760?cid={cid}",
"__typename": "AbsoluteLinks"
},
"__typename": "ListerLinks"
},
"__typename": "Lister",
"agentId": null,
"jobTitle": "Sales Director",
"showInMediaViewer": false,
"listerRatingsReviews": {
"avgRating": null,
"totalReviews": 0,
"__typename": "ListerRatingsReviews"
}
}
],
"auction": null
}
]
Our realestate.com.au scraper can successfully scrape property pages. Let's scrape search pages so we can discover properties according to our preferences next!
How to Scrape Realestate.com.au Search Pages
Just like property pages, we can find the search page data as JSON under script tags. To see this data, let's take the same approach we did earlier. Search for any properties on the website, inspect the page HTML using developer tools and scroll down to the script tag with the text window.ArgonautExchange.
After parsing the data inside the script tag, the data should look like this:
The URL used for the above search page is the following:
The /list-1 path segment at the end of the URL represents the search page number. We'll use it within our scraper to scrape multiple search pages:
Python
ScrapFly
import re
import json
import asyncio
import jmespath
from httpx import AsyncClient, Response
from parsel import Selector
from typing import List, Dict

client = AsyncClient(
    # the remaining client config
)


def parse_property_data(data: Dict) -> Dict:
    """refine property data from JSON"""
    # the rest of the function


def parse_hidden_data(response: Response) -> Dict:
    """parse JSON data from script tag"""
    # the rest of the function


def parse_search_data(data: Dict) -> Dict:
    """refine search data"""
    search_data = []
    data = list(data.values())[0]
    for listing in data["results"]["exact"]["items"]:
        # refine each property listing in the search results
        search_data.append(parse_property_data(listing["listing"]))
    max_search_pages = data["results"]["pagination"]["maxPageNumberAvailable"]
    return {"search_data": search_data, "max_search_pages": max_search_pages}


async def scrape_search(url: str, max_scrape_pages: int = None):
    """scrape property listings from search pages"""
    first_page = await client.get(url)
    assert first_page.status_code == 200, "request has been blocked"
    print(f"scraping search page {url}")
    data = parse_hidden_data(first_page)
    data = parse_search_data(data)
    search_data = data["search_data"]
    # get the number of maximum search pages
    max_search_pages = data["max_search_pages"]
    # scrape all available pages if not max_scrape_pages or max_scrape_pages > max_search_pages
    if not max_scrape_pages or max_scrape_pages > max_search_pages:
        max_scrape_pages = max_search_pages
    print(f"scraping search pagination, remaining ({max_scrape_pages - 1} more pages)")
    # add the remaining search pages in a scraping list, starting from page 2
    base_url = str(first_page.url).split("/list")[0]
    other_pages = [
        client.get(base_url + f"/list-{page}")
        for page in range(2, max_scrape_pages + 1)
    ]
    # scrape the remaining search pages concurrently
    for response in asyncio.as_completed(other_pages):
        response = await response
        assert response.status_code == 200, "request has been blocked"
        data = parse_hidden_data(response)
        search_data.extend(parse_search_data(data)["search_data"])
    print(f"scraped ({len(search_data)}) from {url}")
    return search_data
import re
import json
import asyncio
import jmespath
from typing import Dict, List
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse
SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

def parse_property_data(data: Dict) -> Dict:
    """refine property data from JSON"""
    # the rest of the function


def parse_hidden_data(response: ScrapeApiResponse) -> Dict:
    """parse JSON data from script tag"""
    # the rest of the function


def parse_search_data(data: Dict) -> Dict:
    """refine search data"""
    search_data = []
    data = list(data.values())[0]
    for listing in data["results"]["exact"]["items"]:
        # refine each property listing in the search results
        search_data.append(parse_property_data(listing["listing"]))
    max_search_pages = data["results"]["pagination"]["maxPageNumberAvailable"]
    return {"search_data": search_data, "max_search_pages": max_search_pages}


async def scrape_search(url: str, max_scrape_pages: int = None):
    """scrape property listings from search pages"""
    first_page = await SCRAPFLY.async_scrape(ScrapeConfig(url, country="AU", asp=True))
    print(f"scraping search page {url}")
    data = parse_hidden_data(first_page)
    data = parse_search_data(data)
    search_data = data["search_data"]
    # get the number of maximum search pages
    max_search_pages = data["max_search_pages"]
    # scrape all available pages if not max_scrape_pages or max_scrape_pages > max_search_pages
    if not max_scrape_pages or max_scrape_pages > max_search_pages:
        max_scrape_pages = max_search_pages
    print(f"scraping search pagination, remaining ({max_scrape_pages - 1} more pages)")
    # add the remaining search pages in a scraping list
    other_pages = [
        ScrapeConfig(
            str(first_page.context["url"]).split("/list")[0] + f"/list-{page}",
            country="AU", asp=True
        )
        for page in range(2, max_scrape_pages + 1)
    ]
    # scrape the remaining search pages concurrently
    async for response in SCRAPFLY.concurrent_scrape(other_pages):
        data = parse_hidden_data(response)
        search_data.extend(parse_search_data(data)["search_data"])
    print(f"scraped ({len(search_data)}) from {url}")
    return search_data
Run the code
async def run():
    data = await scrape_search(
        url="https://www.realestate.com.au/buy/in-melbourne+-+northern+region,+vic/list-1",
        max_scrape_pages=3
    )
    # print the data in JSON format
    print(json.dumps(data, indent=2))


if __name__ == "__main__":
    asyncio.run(run())
This code is almost the same as the previous one, but we added two new functions:
parse_search_data() to refine the search data we got using the JMESPath expression we created earlier.
scrape_search() to crawl over search pages by scraping the first search page first, then scraping the remaining search pages concurrently.
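The pagination step inside scrape_search() can be sketched on its own, without any network requests. The helper name build_page_urls is hypothetical and the max_search_pages value of 80 is made up for illustration. It caps the requested page count at what the site reports and builds the /list-N URLs from page 2 onward, since page 1 is fetched first:

```python
def build_page_urls(first_page_url: str, max_search_pages: int, max_scrape_pages=None) -> list:
    """Build URLs for the remaining search pages (page 2 onward)."""
    # scrape every available page unless a smaller limit was requested
    if not max_scrape_pages or max_scrape_pages > max_search_pages:
        total_pages = max_search_pages
    else:
        total_pages = max_scrape_pages
    # drop the trailing /list-N segment to get the base search URL
    base_url = first_page_url.split("/list")[0]
    return [f"{base_url}/list-{page}" for page in range(2, total_pages + 1)]


page_urls = build_page_urls(
    "https://www.realestate.com.au/buy/in-melbourne+-+northern+region,+vic/list-1",
    max_search_pages=80,  # made-up value; the scraper reads it from the page data
    max_scrape_pages=3,
)
for page_url in page_urls:
    print(page_url)
```

With a limit of 3 pages this yields the /list-2 and /list-3 URLs, which matches the "remaining pages" count the scraper prints.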
The result is a list containing property listings found on three search pages, similar to this:
Sample output
[
{
"id": "143029712",
"propertyType": "House",
"description": "Set in the sought-after Aurora Estate and in a prime location close to all amenities including the newly opened Aurora Village and Edgars Creek Secondary School, Epping plaza, Northern Hospital and easy freeway access, everything you need is just a stone’s throw away!<br/><br/>This spacious home comprises of four generous sized bedrooms all with built in robes (master with walk-in robe and full en-suite), light filled kitchen with 900mm stainless steel appliances, stone benchtops, open plan generous sized meals/living area, multiple living zones, central bathroom with separate shower/bath and stone benchtop, ample storage space, ducted heating, alarm system, double garage with internal access and low maintenance front and rear yards.<br/><br/>This home is sure to impress, inspections will not disappoint!<br/><br/>What's more to love?<br/>- Low maintenance<br/>- 900mm stainless steel appliances<br/>- Evaporative cooling<br/>- Central heating<br/>- Multiple living zones<br/><br/>POTENTIAL RENTAL INCOME: $550 A WEEK",
"propertyLink": "https://www.realestate.com.au/property-house-vic-wollert-143029712",
"address": {
"display": {
"shortAddress": "12 Geary Avenue",
"fullAddress": "12 Geary Avenue, Wollert, Vic 3750",
"__typename": "AddressDisplay"
},
"suburb": "Wollert",
"state": "Vic",
"postcode": "3750",
"__typename": "Address"
},
"propertySizes": {
"building": {
"displayValue": "195.1",
"sizeUnit": {
"displayValue": "m²",
"__typename": "PropertySizeUnit"
},
"__typename": "PropertySize"
},
"land": {
"displayValue": "331",
"sizeUnit": {
"displayValue": "m²",
"__typename": "PropertySizeUnit"
},
"__typename": "PropertySize"
},
"preferred": {
"sizeType": "LAND",
"size": {
"displayValue": "331",
"sizeUnit": {
"displayValue": "m²",
"__typename": "PropertySizeUnit"
},
"__typename": "PropertySize"
},
"__typename": "PreferredPropertySize"
},
"__typename": "PropertySizes"
},
"generalFeatures": {
"bedrooms": {
"value": 4,
"__typename": "IntValue"
},
"bathrooms": {
"value": 2,
"__typename": "IntValue"
},
"parkingSpaces": {
"value": 2,
"__typename": "IntValue"
},
"studies": {
"value": 0,
"__typename": "IntValue"
},
"__typename": "GeneralFeatures"
},
"propertyFeatures": null,
"images": [
"https://i2.au.reastatic.net/{size}/a69720736c21a81214fb1ae5f2469bf22cd3cd90967f650013536bcb5cc00094/image.jpg",
"https://i2.au.reastatic.net/{size}/ffa1c7249947822b15a3c59a7b939792310922152aeebed7b8166fc6e1dca217/image.jpg",
"https://i2.au.reastatic.net/{size}/9f4256aecccc71331d7b8aab9a2bca15760c4e054e76290e2cf26850f260a2d3/image.jpg",
"https://i2.au.reastatic.net/{size}/fa5c52de77979f2d972b4382f45d21b89231b50c7687820941452ce8928bb69b/image.jpg",
"https://i2.au.reastatic.net/{size}/cebccbfd72ca5cb0c24161540b298cf6985532b87cc89210e41b6301eb008b77/image.jpg",
"https://i2.au.reastatic.net/{size}/0bbc9779f0ce181bf8138cddeec69e9e25639ac45eabfdd4c60a99f795c07065/image.jpg",
"https://i2.au.reastatic.net/{size}/4e18b9cd82baf5b68855edd9a247d6ba032f0099a79ed17924ae3fe11ab0db32/image.jpg",
"https://i2.au.reastatic.net/{size}/862f6671e3fb644655f0385b0b8b55bd8fd17458def73afbbda4648e1cd89072/image.jpg",
"https://i2.au.reastatic.net/{size}/af79d30f3a6a4387c71be878db32a7383b62d8bf0ab8da92a4567658756352cd/image.jpg",
"https://i2.au.reastatic.net/{size}/e7da34de1128125377c71883fedd6288ef1c65543723e16049b6c327a5e2a324/image.jpg",
"https://i2.au.reastatic.net/{size}/269a976c3c0a2a0273e1b47139c3861a3653e957b83aea3262bcd5f2a7541313/image.jpg",
"https://i2.au.reastatic.net/{size}/4f9f311dc06ffee98c7c5da82e07d26e1f80e5a17819e5bd44e53c888c01224e/image.jpg"
],
"videos": null,
"floorplans": null,
"listingCompany": {
"name": "Carvera Property",
"id": "ORNIKX",
"companyLink": "https://www.realestate.com.au/agency/carvera-property-ORNIKX?cid={cid}",
"phoneNumber": "0466229631",
"address": "G01/6-8 Montrose St, HAWTHORN EAST, VIC 3123",
"ratingsReviews": {
"avgRating": 5,
"totalReviews": 23,
"__typename": "AgencyRatingsReviews"
},
"description": null
},
"listers": [
{
"id": "3084543",
"name": "Chad Gamage",
"photo": {
"templatedUrl": "https://i2.au.reastatic.net/{size}/f8a10fa6c4ce2df0d8901c087ece63b07a32fc21362d73c6702e9fc65090d780/main.jpg",
"__typename": "Image"
},
"phoneNumber": {
"display": "0424876263",
"showDisclaimer": false,
"__typename": "PhoneNumber"
},
"_links": {
"canonical": {
"href": "https://www.realestate.com.au/agent/chad-gamage-3084543?cid={cid}",
"__typename": "AbsoluteLinks"
},
"__typename": "ListerLinks"
},
"__typename": "Lister",
"agentId": null,
"jobTitle": "Sales Manager",
"showInMediaViewer": false
},
{
"id": "3243944",
"name": "Stalon Ablahad",
"photo": {
"templatedUrl": "https://i2.au.reastatic.net/{size}/e8b77f7268a0aa114c0f3d0caed4392e4b06d13978a11644527bcf4a2cf39da5/main.jpg",
"__typename": "Image"
},
"phoneNumber": {
"display": "0466659650",
"showDisclaimer": false,
"__typename": "PhoneNumber"
},
"_links": {
"canonical": {
"href": "https://www.realestate.com.au/agent/stalon-ablahad-3243944?cid={cid}",
"__typename": "AbsoluteLinks"
},
"__typename": "ListerLinks"
},
"__typename": "Lister",
"agentId": null,
"jobTitle": "Sales Executive",
"showInMediaViewer": false
}
],
"auction": null
}
]
We can successfully scrape real estate listing data from realestate.com.au search and property pages. However, our scraper will likely get blocked after sending a few additional requests. Let's take a look at a solution!
How to Bypass Realestate.com.au Scraping Blocking
To bypass web scraping blocking, we need to pay attention to several details, including IP address, TLS handshakes, headers and cookies. This is where Scrapfly can lend you a hand!
To wrap up this guide, let's take a look at some frequently asked questions.
Is it legal to scrape realestate.com.au?
Scraping publicly available real estate data is generally legal. However, you should check the website's Terms of Service to confirm whether they apply to you and your use case. For more, see our web scraping legality page.
Is there a public API for realestate.com.au?
At the time of writing, there is no public API available for realestate.com.au. However, scraping realestate.com.au is straightforward and you can use it to create your own web scraping API.
Are there alternatives for realestate.com.au?
Yes, there are alternative websites for real estate ads in Australia. Check out our tag #realestate for more options.
Realestate.com.au is a popular website for real estate ads in Australia, which can detect and block web scrapers.
In this article, we explained how to avoid realestate.com.au web scraping blocking. We also went through a step-by-step guide on creating a realestate.com.au scraper for property and search pages using Python, which works by extracting the property listing data directly as JSON from the HTML.