How to scrape images from a website?

To scrape images from a website, we can use Python with an HTML parsing tool like BeautifulSoup to select all <img> elements, then download and save each image.

Here's an example using httpx and beautifulsoup (install using pip install httpx beautifulsoup4):

import asyncio
import httpx
from bs4 import BeautifulSoup
from pathlib import Path

async def download_image(url, filepath, client):
    response = await client.get(url)
    filepath.write_bytes(response.content)  # save the image bytes to disk
    print(f"Downloaded {url} to {filepath}")

async def scrape_images(url):
    download_dir = Path('images')
    download_dir.mkdir(parents=True, exist_ok=True)

    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        soup = BeautifulSoup(response.text, "html.parser")
        download_tasks = []
        for img_tag in soup.find_all("img"):
            img_url = img_tag.get("src")  # get image url
            if img_url:
                img_url = response.url.join(img_url)  # turn url absolute
                img_filename = download_dir / Path(str(img_url)).name
                # queue the download coroutine so asyncio.gather can run it
                download_tasks.append(download_image(img_url, img_filename, client))
        await asyncio.gather(*download_tasks)

# example - scrape all Scrapfly blog images:
url = ""  # set to the target page URL
if url:
    asyncio.run(scrape_images(url))

Above, we use httpx.AsyncClient to first retrieve the target page's HTML. Then we extract the src attribute of every <img> element and resolve it to an absolute URL. Finally, we download all images concurrently and save them to the ./images directory.
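
The URL resolution and filename steps can be tried offline. Below is a minimal sketch using only the standard library (urllib.parse instead of httpx's URL object). The helper name resolve_image_target and the example.com URLs are hypothetical and used only for illustration. It assumes the filename should come from the URL path alone, so that query strings such as ?w=300 do not end up in the saved filename.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urljoin, urlsplit


def resolve_image_target(page_url, src, download_dir="images"):
    """Return (absolute_url, local_path) for one <img> src value.

    Hypothetical helper for illustration; not part of httpx or BeautifulSoup.
    """
    # urljoin handles relative, root-relative and protocol-relative src values
    absolute_url = urljoin(page_url, src)
    # take the name from the URL path only, ignoring any query string
    filename = PurePosixPath(urlsplit(absolute_url).path).name
    return absolute_url, Path(download_dir) / filename


# example.com is a placeholder page URL
page = "https://example.com/blog/post/"
for src in ["cover.png", "/static/logo.svg", "//cdn.example.com/a/b.jpg?w=300"]:
    print(resolve_image_target(page, src))
```

Two images on different paths can still share a filename (for example, two files both named logo.png), so a real scraper may need to de-duplicate names before saving.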
