How to Scrape AutoScout24
AutoScout24 is Europe's biggest car marketplace. You'll find millions of listings with prices, specs, features, and seller details. If you need data for pricing research, inventory tracking, or quick market checks, it's a solid source to scrape.

This guide shows how to scrape AutoScout24 with Python. We'll keep it practical: what works, what can break, and code you can run right away.

Why Scrape AutoScout24?

Car dealers watch prices and inventory across markets. Manufacturers and resellers track competitors. Researchers follow availability by body type and trim. AutoScout24 exposes most of the data you need: title, price, mileage, year, specs, and seller details.

Understanding AutoScout24's Structure

AutoScout24 is a modern JS site, so a lot of content loads after the first HTML. It also has strong bot protection. Expect some 403s and changing selectors, and plan for that.
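
One practical consequence: before writing CSS selectors, check whether the page embeds its data as JSON. Many React/Next.js builds serialize page state into a __NEXT_DATA__ script tag; whether a given AutoScout24 page does is something to verify against a live response. A minimal sketch under that assumption:

import json
from bs4 import BeautifulSoup

def extract_embedded_state(html):
    """Return the page's embedded JSON state, or None if there isn't one."""
    soup = BeautifulSoup(html, 'html.parser')
    # Next.js-style sites often serialize page data into this script tag
    script = soup.find('script', id='__NEXT_DATA__')
    if script and script.string:
        return json.loads(script.string)
    return None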

Project Setup

We'll use a few Python libraries:

  • requests - HTTP library for making web requests
  • BeautifulSoup - HTML parsing library
  • json - For parsing JSON data embedded in pages

Install the required dependencies:

$ pip install requests beautifulsoup4

Example 1: Scraping Car Listings by Body Type

First, we'll scrape compact car listings from a category page and pull the basics (title, price, mileage, year, etc.).

Setting Up the Listings Scraper

Set up a simple listings scraper.

1. Prerequisites

First, install the required dependencies:

$ pip install requests beautifulsoup4

2. Basic Setup and User Agent Rotation

Create a file called scrape_autoscout24_listings.py and start with the basic setup:

import requests
from bs4 import BeautifulSoup
import json
import re
import random
import time

# Simple list of user agents to rotate
user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.2227.0 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.3497.92 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
]

# Target URL for compact cars
url = "https://www.autoscout24.com/lst/c/compact"

# Create session with random user agent
session = requests.Session()
session.headers.update({
    "User-Agent": random.choice(user_agents),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1"
})

3. Request Handling Function

This function makes the request and checks if the page is reachable.

def make_request(url):
    """Make a request to the AutoScout24 listings page"""
    try:
        # Add random delay to avoid detection
        time.sleep(random.uniform(1, 3))
        
        response = session.get(url, timeout=15)
        
        # Check if blocked
        if response.status_code == 403:
            print("  ❌ Blocked (403 Forbidden)")
            return None
        
        # Check if successful
        if response.status_code == 200:
            print("  βœ… Successfully accessed page")
            return response
        else:
            print(f"  ❌ Error: Status code {response.status_code}")
            return None
            
    except Exception as e:
        print(f"  ❌ Error: {e}")
        return None

4. Extracting Car Listings

This pulls individual car listings from the page: title, price, link, and key details.

def extract_car_listings(soup):
    """Extract car listings from the search results page"""
    listings = []
    
    # Find all car listing containers
    # AutoScout24 uses article tags with specific classes for car listings
    car_articles = soup.find_all('article', class_='cldt-summary-full-item')
    
    print(f"  Found {len(car_articles)} car listings")
    
    for article in car_articles:
        try:
            # Extract car title from the title link
            title_link = article.find('a', class_='ListItem_title__ndA4s')
            if title_link:
                title_elem = title_link.find('h2')
                if title_elem:
                    # Combine all span elements to get full title
                    title_spans = title_elem.find_all('span')
                    title = ' '.join([span.get_text().strip() for span in title_spans if span.get_text().strip()])
                else:
                    title = title_link.get_text().strip()
            else:
                title = "N/A"
            
            # Extract price from the price element
            price_elem = article.find('p', class_='Price_price__APlgs')
            price = price_elem.get_text().strip() if price_elem else "N/A"
            
            # Extract mileage from the vehicle details table
            mileage_elem = article.find('span', attrs={'data-testid': 'VehicleDetails-mileage_road'})
            mileage = mileage_elem.get_text().strip() if mileage_elem else "N/A"
            
            # Extract registration year from the vehicle details table
            year_elem = article.find('span', attrs={'data-testid': 'VehicleDetails-calendar'})
            year = year_elem.get_text().strip() if year_elem else "N/A"
            
            # Extract fuel type from the vehicle details table
            fuel_elem = article.find('span', attrs={'data-testid': 'VehicleDetails-gas_pump'})
            fuel_type = fuel_elem.get_text().strip() if fuel_elem else "N/A"
            
            # Extract transmission from the vehicle details table
            transmission_elem = article.find('span', attrs={'data-testid': 'VehicleDetails-transmission'})
            transmission = transmission_elem.get_text().strip() if transmission_elem else "N/A"
            
            # Extract power from the vehicle details table
            power_elem = article.find('span', attrs={'data-testid': 'VehicleDetails-speedometer'})
            power = power_elem.get_text().strip() if power_elem else "N/A"
            
            # Extract link to detail page (reuse the title link found above)
            link = "https://www.autoscout24.com" + title_link['href'] if title_link else None
            
            # Extract seller information
            seller_name_elem = article.find('span', class_='SellerInfo_name__nR9JH')
            seller_name = seller_name_elem.get_text().strip() if seller_name_elem else "N/A"
            
            seller_address_elem = article.find('span', class_='SellerInfo_address__leRMu')
            seller_address = seller_address_elem.get_text().strip() if seller_address_elem else "N/A"
            
            listing_data = {
                'title': title,
                'price': price,
                'mileage': mileage,
                'year': year,
                'fuel_type': fuel_type,
                'transmission': transmission,
                'power': power,
                'seller_name': seller_name,
                'seller_address': seller_address,
                'link': link
            }
            
            listings.append(listing_data)
            
            print(f"    β€’ {title} - {price} - {mileage} - {year} - {fuel_type}")
            
        except Exception as e:
            print(f"    ❌ Error extracting listing: {e}")
            continue
    
    return listings

5. Main Scraping Function

This ties the request, parsing, and extraction together for the listings page.

def scrape_listings(url):
    """Main function to scrape car listings from AutoScout24"""
    print(f"\nScraping listings from: {url}")
    
    # Make request
    response = make_request(url)
    if not response:
        return None
    
    # Parse HTML
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Extract listings
    listings = extract_car_listings(soup)
    
    return listings

6. Main Execution

The main execution function manages the overall scraping workflow and handles the results.

def main():
    """Main execution function"""
    print("πŸš— Starting AutoScout24 Compact Cars Scraper")
    
    # Scrape listings
    listings = scrape_listings(url)
    
    if listings:
        print(f"\nβœ… Successfully scraped {len(listings)} car listings!")
        
        return listings
    else:
        print("❌ Failed to scrape listings")
        return None

# Run the scraper
if __name__ == "__main__":
    main()

Example Output

πŸš— Starting AutoScout24 Compact Cars Scraper

Scraping listings from: https://www.autoscout24.com/lst/c/compact
βœ… Successfully accessed https://www.autoscout24.com/lst/c/compact
Found 19 car listings
β€’ Peugeot 207 Filou MOTORSCHADEN!!!!! - € 499 - 174,000 km - 01/2008 - Gasoline
β€’ Renault Clio 1.2 RN - € 999 - 142,875 km - 07/2000 - Gasoline
β€’ Volkswagen Polo 1.4-16V Highline - € 6,350 - 116,950 km - 10/2009 - Gasoline
β€’ Peugeot 208 GTi - € 4,990 - 111,846 km - 11/2013 - Gasoline
β€’ Peugeot 208 1.2 VTi Active 1e Eigenaar,Airco,Cruise,PDC,Trekha - € 4,449 - 124,752 km - 03/2014 - Gasoline
β€’ Peugeot 207 1.4-16V Color-line - € 1,249 - 228,423 km - 02/2008 - Gasoline
β€’ Volkswagen Polo 1.0 Comfortline - € 6,450 - 182,454 km - 10/2016 - Gasoline
β€’ Fiat 500 1.2 Naked Panodak Clima Lmv Koopje! - € 1,995 - 207,112 km - 01/2008 - Gasoline
β€’ Renault Twingo 1.2 PrivilΓ¨ge | Handelsauto | Recent nieuwe distri - € 1,250 - 75,562 km - 06/2005 - Gasoline
β€’ Nissan Micra 1.2 - € 980 - 190,582 km - 03/2004 - Gasoline
β€’ Kia Picanto 1.0 CVVT ISG Comfort Pack 2e Eigenaar,Airco,Elektr - € 4,749 - 93,864 km - 08/2013 - Gasoline
β€’ Kia Picanto 1.0 CVVT Design Edition Airco 5-Deurs Origineel NL - € 4,900 - 121,292 km - 01/2013 - Gasoline
β€’ Ford Fiesta 1.6 Ghia 120PK,Stoelverwarming,Airco,ElektrischeRa - € 4,749 - 153,420 km - 03/2009 - Gasoline
β€’ Peugeot 107 1.0-12V XR | Airco | Toerenteller | 5drs | - € 2,450 - 170,306 km - 10/2009 - Gasoline
β€’ Volkswagen Golf R-line|Clima|Stoelverwarming|PDC - € 6,950 - 154,629 km - 09/2012 - Gasoline
β€’ Volkswagen Golf 1.2 TSI BlueMotion, airco, navi, bleutooth, APK 07 - € 3,995 - 240,200 km - 01/2012 - Gasoline
β€’ Volkswagen Polo 1.2 TSI BlueMotion Highline - € 4,949 - 220,565 km - 01/2014 - Gasoline
β€’ Ford Fiesta 1.25 Trend Trekhaak,Airco,Stoelverwarming,Elektris - € 3,499 - 150,133 km - 09/2010 - Gasoline
β€’ Fiat 500 0.9 TwinAir Lounge | Wit Parelmoer | PanoDak | Air - € 4,950 - 96,486 km - 11/2011 - Gasoline

βœ… Successfully scraped 19 car listings!
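
The example above scrapes a single results page. To cover more inventory, iterate over paginated result URLs. A hedged sketch, assuming the listing endpoint accepts a page query parameter (confirm against the site's own pagination links before relying on it):

def scrape_all_pages(base_url, max_pages=5):
    """Scrape several result pages, stopping when one yields nothing."""
    all_listings = []
    for page in range(1, max_pages + 1):
        # Hypothetical pagination parameter - verify on the live site
        page_url = f"{base_url}?page={page}"
        listings = scrape_listings(page_url)
        if not listings:
            break  # blocked, or no more results
        all_listings.extend(listings)
    return all_listings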

Example 2: Scraping Individual Car Details

Next, we'll scrape a single car page to get detailed specs, features, and seller info.

Setting Up the Individual Car Scraper

We'll create a small scraper for individual car pages to extract detailed vehicle information.

1. Prerequisites

The same dependencies as before:

$ pip install requests beautifulsoup4

2. Basic Setup for Individual Car Scraping

Create a file called scrape_autoscout24_car.py and start with the basic setup:

import requests
from bs4 import BeautifulSoup
import json
import re
import random
import time

# Simple list of user agents to rotate
user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.2227.0 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.3497.92 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
]

# Target URL for individual car
url = "https://www.autoscout24.com/offers/peugeot-207-filou-motorschaden-gasoline-0b93e496-1f1b-475d-a972-fa4bd490031d"

# Create session with random user agent
session = requests.Session()
session.headers.update({
    "User-Agent": random.choice(user_agents),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1"
})

3. Request Handling Function

This function requests an individual car page and checks for blocks.

def make_request(url):
    """Make a request to the AutoScout24 car detail page"""
    try:
        # Add random delay to avoid detection
        time.sleep(random.uniform(2, 4))
        
        response = session.get(url, timeout=15)
        
        # Check if blocked
        if response.status_code == 403:
            print("  ❌ Blocked (403 Forbidden)")
            return None
        
        # Check if successful
        if response.status_code == 200:
            print("  βœ… Successfully accessed page")
            return response
        else:
            print(f"  ❌ Error: Status code {response.status_code}")
            return None
            
    except Exception as e:
        print(f"  ❌ Error: {e}")
        return None

4. Extracting Basic Car Information

This reads the title, price, and a few basics.

def extract_basic_info(soup):
    """Extract basic car information from the detail page"""
    car_data = {}
    
    # Extract car title from the stage title
    title_elem = soup.find('h1', class_='StageTitle_title__ROiR4')
    if title_elem:
        # Get the make and model from the bold classified info
        make_model_elem = title_elem.find('span', class_='StageTitle_boldClassifiedInfo__sQb0l')
        model_version_elem = title_elem.find('div', class_='StageTitle_modelVersion__Yof2Z')
        
        if make_model_elem and model_version_elem:
            car_data['title'] = f"{make_model_elem.get_text().strip()} {model_version_elem.get_text().strip()}"
        elif make_model_elem:
            car_data['title'] = make_model_elem.get_text().strip()
        else:
            car_data['title'] = title_elem.get_text().strip()
        
        print(f"  Car: {car_data['title']}")
    else:
        car_data['title'] = "Not found"
        print("  Car: Not found")
    
    # Extract price from the price section
    price_elem = soup.find('span', class_='PriceInfo_price__XU0aF')
    if price_elem:
        car_data['price'] = price_elem.get_text().strip()
        print(f"  Price: {car_data['price']}")
    else:
        car_data['price'] = "Not found"
        print("  Price: Not found")
    
    # Extract mileage from the vehicle overview
    overview_items = soup.find_all('div', class_='VehicleOverview_itemContainer__XSLWi')
    for item in overview_items:
        title_elem = item.find('div', class_='VehicleOverview_itemTitle__S2_lb')
        if title_elem and 'Mileage' in title_elem.get_text():
            text_elem = item.find('div', class_='VehicleOverview_itemText__AI4dA')
            if text_elem:
                car_data['mileage'] = text_elem.get_text().strip()
                print(f"  Mileage: {car_data['mileage']}")
                break
    else:
        car_data['mileage'] = "Not found"
        print("  Mileage: Not found")
    
    # Extract registration year from the vehicle overview
    registration_items = soup.find_all('div', class_='VehicleOverview_itemContainer__XSLWi')
    for item in registration_items:
        title_elem = item.find('div', class_='VehicleOverview_itemTitle__S2_lb')
        if title_elem and 'First registration' in title_elem.get_text():
            text_elem = item.find('div', class_='VehicleOverview_itemText__AI4dA')
            if text_elem:
                car_data['year'] = text_elem.get_text().strip()
                print(f"  Year: {car_data['year']}")
                break
    else:
        car_data['year'] = "Not found"
        print("  Year: Not found")
    
    return car_data

5. Extracting Technical Specifications

This collects the technical specs from the overview and technical sections.

def extract_specifications(soup):
    """Extract technical specifications from the car detail page"""
    specifications = {}
    
    # Extract specifications from the vehicle overview section
    overview_items = soup.find_all('div', class_='VehicleOverview_itemContainer__XSLWi')
    
    for item in overview_items:
        title_elem = item.find('div', class_='VehicleOverview_itemTitle__S2_lb')
        text_elem = item.find('div', class_='VehicleOverview_itemText__AI4dA')
        
        if title_elem and text_elem:
            title = title_elem.get_text().strip()
            value = text_elem.get_text().strip()
            
            if 'Fuel type' in title:
                specifications['fuel_type'] = value
                print(f"  Fuel Type: {value}")
            elif 'Gearbox' in title:
                specifications['transmission'] = value
                print(f"  Transmission: {value}")
            elif 'Power' in title:
                specifications['power'] = value
                print(f"  Power: {value}")
    
    # Extract additional specifications from the technical data section
    tech_section = soup.find('section', attrs={'data-cy': 'technical-details-section'})
    if tech_section:
        # Find all dt/dd pairs in the technical data
        dt_elements = tech_section.find_all('dt', class_='DataGrid_defaultDtStyle__soJ6R')
        dd_elements = tech_section.find_all('dd', class_='DataGrid_defaultDdStyle__3IYpG')
        
        for dt, dd in zip(dt_elements, dd_elements):
            title = dt.get_text().strip()
            value = dd.get_text().strip()
            
            if 'Engine size' in title:
                specifications['engine_size'] = value
                print(f"  Engine Size: {value}")
            elif 'Cylinders' in title:
                specifications['cylinders'] = value
                print(f"  Cylinders: {value}")
            elif 'Power' in title and 'power' not in specifications:
                specifications['power'] = value
                print(f"  Power: {value}")
            elif 'Gearbox' in title and 'transmission' not in specifications:
                specifications['transmission'] = value
                print(f"  Transmission: {value}")
    
    # Extract color information from the color section
    color_section = soup.find('section', attrs={'data-cy': 'color-section'})
    if color_section:
        dt_elements = color_section.find_all('dt', class_='DataGrid_defaultDtStyle__soJ6R')
        dd_elements = color_section.find_all('dd', class_='DataGrid_defaultDdStyle__3IYpG')
        
        for dt, dd in zip(dt_elements, dd_elements):
            title = dt.get_text().strip()
            value = dd.get_text().strip()
            
            if 'Manufacturer colour' in title:
                specifications['color'] = value
                print(f"  Color: {value}")
            elif 'Paint' in title:
                specifications['paint_type'] = value
                print(f"  Paint Type: {value}")
    
    return specifications

6. Extracting Features and Equipment

This gathers the features and equipment list.

def extract_features(soup):
    """Extract car features and equipment from the detail page"""
    features = []
    
    # Find equipment section
    equipment_section = soup.find('section', attrs={'data-cy': 'equipment-section'})
    if equipment_section:
        # Find all dt/dd pairs in the equipment section
        dt_elements = equipment_section.find_all('dt', class_='DataGrid_defaultDtStyle__soJ6R')
        dd_elements = equipment_section.find_all('dd', class_='DataGrid_defaultDdStyle__3IYpG')
        
        for dt, dd in zip(dt_elements, dd_elements):
            category = dt.get_text().strip()
            # Find all li elements in the dd
            feature_items = dd.find_all('li')
            
            if feature_items:
                print(f"    {category}:")
                for item in feature_items:
                    feature_text = item.get_text().strip()
                    if feature_text:
                        features.append(f"{category}: {feature_text}")
                        print(f"      β€’ {feature_text}")
    
    if not features:
        print("  Features: Not found")
    
    return features

7. Extracting Seller Information

This pulls seller details and location.

def extract_seller_info(soup):
    """Extract seller information from the car detail page"""
    seller_data = {}
    
    # Extract seller type from the vehicle overview
    overview_items = soup.find_all('div', class_='VehicleOverview_itemContainer__XSLWi')
    for item in overview_items:
        title_elem = item.find('div', class_='VehicleOverview_itemTitle__S2_lb')
        text_elem = item.find('div', class_='VehicleOverview_itemText__AI4dA')
        
        if title_elem and text_elem and 'Seller' in title_elem.get_text():
            seller_data['type'] = text_elem.get_text().strip()
            print(f"  Seller Type: {seller_data['type']}")
            break
    else:
        seller_data['type'] = "Not found"
        print("  Seller Type: Not found")
    
    # Extract location from the location link
    location_link = soup.find('a', class_='LocationWithPin_locationItem__tK1m5')
    if location_link:
        seller_data['location'] = location_link.get_text().strip()
        print(f"  Location: {seller_data['location']}")
    else:
        seller_data['location'] = "Not found"
        print("  Location: Not found")
    
    # Extract seller description from the seller notes section
    seller_notes_section = soup.find('section', attrs={'data-cy': 'seller-notes-section'})
    if seller_notes_section:
        content_div = seller_notes_section.find('div', class_='SellerNotesSection_content__te2EB')
        if content_div:
            seller_data['description'] = content_div.get_text().strip()
            print(f"  Description: {seller_data['description'][:100]}...")
        else:
            seller_data['description'] = "Not found"
            print("  Description: Not found")
    else:
        seller_data['description'] = "Not found"
        print("  Description: Not found")
    
    return seller_data

8. Main Scraping Function

This combines all the extraction steps for a single car page.

def scrape_car_details(url):
    """Main function to scrape detailed information from a single car page"""
    print(f"\nScraping car details from: {url}")
    
    # Make request
    response = make_request(url)
    if not response:
        return None
    
    # Parse HTML
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Extract all data
    basic_info = extract_basic_info(soup)
    specifications = extract_specifications(soup)
    features = extract_features(soup)
    seller_info = extract_seller_info(soup)
    
    # Combine all data
    result = {
        'url': url,
        **basic_info,
        'specifications': specifications,
        'features': features,
        'seller': seller_info
    }
    
    return result

9. Main Execution

The main execution function manages the overall scraping workflow for individual car pages.

def main():
    """Main execution function"""
    print("πŸš— Starting AutoScout24 Individual Car Scraper")
    
    # Scrape car details
    car_data = scrape_car_details(url)
    
    if car_data:
        print(f"\nβœ… Successfully scraped car details!")
        
        # Save results to file
        with open('autoscout24_car_details.json', 'w') as f:
            json.dump(car_data, f, indent=2)
        print("πŸ’Ύ Results saved to autoscout24_car_details.json")
        
        return car_data
    else:
        print("❌ Failed to scrape car details")
        return None

# Run the scraper
if __name__ == "__main__":
    main()

Example Output

πŸš— Starting AutoScout24 Individual Car Scraper

Scraping car details from: https://www.autoscout24.com/offers/peugeot-207-filou-motorschaden-gasoline-0b93e496-1f1b-475d-a972-fa4bd490031d
βœ… Successfully accessed https://www.autoscout24.com/offers/peugeot-207-filou-motorschaden-gasoline-0b93e496-1f1b-475d-a972-fa4bd490031d
Car: Peugeot 207 Filou MOTORSCHADEN!!!!!
Price: € 499
Mileage: 174,000 km
Year: 01/2008
Transmission: Manual
Fuel Type: Gasoline
Power: 70 kW (95 hp)
Engine Size: 1,397 cc
Cylinders: 4
Color: BLEU NEYSHA
Paint Type: Metallic
Comfort & Convenience:
β€’ Power windows
Safety & Security:
β€’ ABS
β€’ Central door lock
β€’ Driver-side airbag
β€’ Passenger-side airbag
β€’ Power steering
β€’ Side airbag
Extras:
β€’ Alloy wheels
Seller Type: Dealer
Location: Berlin
Description: Sonderausstattung:MOTOR DREHT NICHT!!!!!!Metallic-Lackierung, ALUFELGEN, u.s.w.Weitere Ausstattung:A...

βœ… Successfully scraped car details!
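
The two examples also combine naturally: feed the link field from each Example 1 listing into scrape_car_details. A sketch, assuming the functions from both scripts are merged into one file or imported:

# Collect listing links first, then visit each detail page
listings = scrape_listings("https://www.autoscout24.com/lst/c/compact")
detailed = []
for listing in (listings or [])[:5]:  # cap the crawl to stay polite
    if listing['link']:
        details = scrape_car_details(listing['link'])
        if details:
            detailed.append(details)

print(f"Collected full details for {len(detailed)} cars")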

Handling Anti-Bot Protection

AutoScout24 has strong anti-bot checks (IP-based rules and JS content). Here are a few simple ways to reduce blocks.

1. User Agent Rotation

Rotate a few realistic user agents to avoid sending every request with the exact same fingerprint.

user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.2227.0 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.3497.92 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36',
]

session.headers.update({
    "User-Agent": random.choice(user_agents),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5"
})
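
Note that this picks one user agent for the whole session. To vary it per request instead, refresh the header before each call (urls here stands in for whatever list of pages you're iterating):

for url in urls:
    # Swap in a fresh user agent on every request
    session.headers['User-Agent'] = random.choice(user_agents)
    response = session.get(url, timeout=15)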

2. Session Management

Use a session to keep cookies and reuse connections so your traffic looks more like a real browser session.

session = requests.Session()
session.headers.update({
    "User-Agent": random.choice(user_agents),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1"
})

3. Rate Limiting

Add small random delays between requests to avoid hammering the server.

import time

for url in urls:
    # Add random delay between requests
    time.sleep(random.uniform(1, 3))
    
    # ... scraping code ...
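
If you still see occasional 429 or 403 responses, exponential backoff between attempts is a common refinement. A small sketch (the wait times are arbitrary starting points, not tuned values):

import random
import time

def polite_get(session, url, max_attempts=4):
    """GET with exponential backoff when rate-limited or blocked."""
    response = None
    for attempt in range(max_attempts):
        response = session.get(url, timeout=15)
        if response.status_code not in (429, 403):
            return response
        # Wait 2, 4, 8... seconds plus jitter before the next attempt
        time.sleep(2 ** (attempt + 1) + random.uniform(0, 1))
    return response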

For more advanced anti-blocking techniques, see our guide 5 Tools to Scrape Without Blocking and How it All Works, which covers JavaScript and TLS (JA3) fingerprinting, the role request headers play in blocking, IP rotation, and other detection methods.

Advanced Scraping Techniques

For bigger jobs, consider these additions.

1. Proxy Rotation

For large-scale scraping, rotate requests across a pool of proxies so traffic is distributed over multiple IP addresses:

# A small pool of placeholder proxies - rotate per request
proxy_pool = [
    'http://proxy1:port',
    'http://proxy2:port',
    'http://proxy3:port',
]

proxy = random.choice(proxy_pool)
proxies = {'http': proxy, 'https': proxy}

response = session.get(url, proxies=proxies, timeout=15)

2. Data Storage and Analysis

Save scraped data to files so you can process and analyze it later.

import json
import csv

def save_data_json(data, filename):
    """Save data to JSON file"""
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

def save_data_csv(data, filename):
    """Save data to CSV file"""
    if data and len(data) > 0:
        with open(filename, 'w', newline='', encoding='utf-8') as f:
            writer = csv.DictWriter(f, fieldnames=data[0].keys())
            writer.writeheader()
            writer.writerows(data)

# Collect data
scraped_data = []
for url in urls:
    # ... scraping code ...
    car_data = {
        'title': title,
        'price': price,
        'mileage': mileage,
        'year': year,
        'location': location
    }
    scraped_data.append(car_data)

# Write the collected data in both formats
save_data_json(scraped_data, 'autoscout24_listings.json')
save_data_csv(scraped_data, 'autoscout24_listings.csv')

3. Error Handling and Retry Logic

Add simple retries with backoff to handle temporary errors.

from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retries():
    """Create a session with retry logic"""
    session = requests.Session()
    
    # Configure retry strategy
    retry_strategy = Retry(
        total=3,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    
    return session
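
Swap this in for the plain session created earlier (reusing the user_agents list from the setup code) and requests will retry those status codes transparently:

session = create_session_with_retries()
session.headers.update({"User-Agent": random.choice(user_agents)})
response = session.get("https://www.autoscout24.com/lst/c/compact", timeout=15)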

For more advanced data processing and analysis, see our guide How to Observe E-Commerce Trends using Web Scraping, an example project that monitors e-commerce trends using Python, web scraping, and data visualization tools.

Scraping with Scrapfly

Scrapfly provides web scraping, screenshot, and extraction APIs for data collection at scale.

If you don't want to manage proxies and blocks yourself, Scrapfly's API can handle the heavy lifting.

Here's how to use Scrapfly for AutoScout24:

from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

scrapfly = ScrapflyClient(key="YOUR-SCRAPFLY-KEY")

# Scrape car listings
result: ScrapeApiResponse = scrapfly.scrape(ScrapeConfig(
    tags=["autoscout24", "car-listings"],
    format="json",
    asp=True,
    render_js=True,
    url="https://www.autoscout24.com/lst/c/compact"
))

print(result)

# Scrape individual car details
car_result: ScrapeApiResponse = scrapfly.scrape(ScrapeConfig(
    tags=["autoscout24", "car-details"],
    format="json",
    asp=True,
    render_js=True,
    url="https://www.autoscout24.com/offers/peugeot-207-filou-motorschaden-gasoline-0b93e496-1f1b-475d-a972-fa4bd490031d"
))

print(car_result)

Best Practices and Tips

A few practical tips:

  1. Respect robots.txt: Always check and follow the website's robots.txt file (see the sketch after this list)
  2. Implement delays: Use random delays between requests to avoid detection
  3. Handle errors gracefully: Implement proper error handling for network issues
  4. Monitor success rates: Track scraping success rates and adjust strategies accordingly
  5. Use proxies: Consider using rotating proxies for large-scale scraping
  6. Validate data: Always validate extracted data for completeness and accuracy
  7. Respect rate limits: Don't overwhelm the server with too many requests
  8. Update selectors: Regularly check and update CSS selectors as the site evolves
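
For the first point, Python's standard library can check a robots.txt policy before you fetch a page:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.autoscout24.com/robots.txt")
rp.read()

url = "https://www.autoscout24.com/lst/c/compact"
if rp.can_fetch("*", url):
    print("Allowed to fetch", url)
else:
    print("Disallowed by robots.txt:", url)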

For more comprehensive web scraping best practices, see our guide Everything to Know to Start Web Scraping in Python Today, a complete introduction to web scraping using Python: HTTP, parsing, AI, scaling, and deployment.

If you're interested in scraping other automotive or e-commerce platforms, these related guides cover additional techniques and approaches for different types of websites:

  β€’ How to Scrape Amazon.com Product Data and Reviews - scraping product data and reviews from the biggest e-commerce platform in the US with Python, plus common challenges, tips, and tricks.
  β€’ How to Scrape Ebay Using Python (2025 Update) - scraping product details and product search on the world's biggest peer-to-peer e-commerce portal.
  β€’ How to Scrape Walmart.com Product Data (2025 Update) - scraping walmart.com product and review data with Python, and how to avoid blocking when scraping at scale.
  β€’ How to Scrape Etsy.com Product, Shop and Search Data - scraping search and product data from Etsy, a popular marketplace for handcrafted and vintage items, using Python and HTML parsing.

FAQ

A few common questions:

What are the main challenges when scraping AutoScout24?

AutoScout24 has strong bot protection and lots of JS-rendered content. Common issues are 403 errors, IP-based blocks, and changing selectors.

What data can I extract from individual AutoScout24 car pages?

Title, price, mileage, year, specs (fuel, transmission, power, color), features, and seller info (name, location, notes).

What can I do to avoid getting blocked?

Rotate user agents, use a session, add delays, and consider proxies. If you want an easier path, use Scrapfly's API.

Summary

We covered how the site is built, two working examples (listings and a single car), and simple anti-blocking steps. Start with requests + BeautifulSoup, add small delays and selector checks, and use proxies if needed. For hands-off scaling, try Scrapfly.
