Scraping Tools Knowledgebase

Web scraping tools are essential for building web scraping projects. They provide the necessary functionality to extract data from websites, handle requests, parse HTML, and manage data storage.

Because scraping is a complex and varied task, there are many tools available, each with its own strengths and weaknesses. Some tools are designed for specific tasks, while others are more general-purpose. The right choice usually depends on the specific requirements of the scraping project.

See our web scraping tool highlights below 👇

Articles Related to Scraping Tools

Comprehensive Guide to OkHttp for Java and Kotlin

Learn how to simplify network communication in Java and Android applications using OkHttp.

HTTP
TOOLS

Instant Data Scraper Guide - Web Scraping with No Code

Learn how to use tools like Google Sheets, Make.com, and Scrapfly to automate your data collection.

TOOLS

How to Use cURL to Download Files

Master file downloads with curl and discover advanced use cases.

CURL
TOOLS

What is Charles Proxy and How to Use it?

Learn about one of the most popular web debugging proxies, Charles Proxy, and what it's capable of.

TOOLS
PROXIES

cURL vs Wget: Key Differences Explained

curl and wget are both popular terminal tools, but they are often used for different tasks. Let's take a look at the differences.

CURL
HTTP
TOOLS

How to Use Tor For Web Scraping

In this article, we'll explain web scraping using Tor. We'll use Tor as a proxy server, over either HTTP or SOCKS, to randomly change the IP address, and we'll also use it as a rotating proxy server.

TOOLS
PROXIES

How to Know What Anti-Bot Service a Website is Using?

In this article, we'll take a look at two popular tools, WhatWaf and Wafw00f, which can identify which WAF service a website uses.

BLOCKING
TOOLS

Selenium Wire Tutorial: Intercept Background Requests

In this guide, we'll explore web scraping with Selenium Wire. We'll define what it is, how to install it, and how to use it to inspect and manipulate background requests.

PYTHON
HEADLESS-BROWSER
SELENIUM
TOOLS

Sending HTTP Requests With Curlie: A better cURL

In this guide, we'll explore Curlie, a better cURL version. We'll start by defining what Curlie is and how it compares to cURL. We'll also go over a step-by-step guide on using and configuring Curlie to send HTTP requests.

CURL
HTTP
TOOLS

How to Use cURL For Web Scraping

In this article, we'll go over a step-by-step guide on sending and configuring HTTP requests with cURL. We'll also explore advanced uses of cURL for web scraping, such as scraping dynamic pages and avoiding blocks.

HTTP
TOOLS
CURL

Using API Clients For Web Scraping: Postman

In this article, we'll explore the use of API clients for web scraping. We'll start by explaining how to locate hidden API requests on websites. Then, we'll explore importing, manipulating, and exporting them using Postman to develop efficient API-based web scrapers.

HIDDEN-API
TOOLS

Intro to Parsing HTML and XML with Python and lxml

In this tutorial, we'll take a deep dive into lxml, a powerful Python library for parsing HTML and XML effectively. We'll start by explaining what lxml is, how to install it, and how to use it to parse HTML and XML files. Finally, we'll go over a practical web scraping example with lxml.

PYTHON
TOOLS
DATA-PARSING

Use Curl Impersonate to scrape as Chrome or Firefox

Learn how to avoid detection by TLS fingerprinting by impersonating normal web browser configurations. We'll start by explaining what Curl Impersonate is, how it works, and how to install and use it. Finally, we'll explore using it with Python to avoid web scraping blocking.

TOOLS
BLOCKING
CURL
HTTP

FlareSolverr Guide: Bypass Cloudflare While Scraping

In this article, we'll explore the FlareSolverr tool and how to use it to get around Cloudflare while scraping. We'll start by explaining what FlareSolverr is, how it works, and how to install and use it. Let's get started!

PYTHON
TOOLS
BLOCKING
HTTP

Web Scraping with CloudProxy

One of the most common challenges encountered while web scraping is IP throttling and blocking. Learn about the CloudProxy tool, how to install it and how to use it for cloud-based web scraping.

TOOLS
PROXIES

How to use Headless Chrome Extensions for Web Scraping

In this article, we'll explore different useful Chrome extensions for web scraping. We'll also explain how to install Chrome extensions with various headless browser libraries, such as Selenium, Playwright and Puppeteer.

PYTHON
NODEJS
TOOLS
PLAYWRIGHT
PUPPETEER
SELENIUM

How to Use Cache In Web Scraping for Major Performance Boost

An introduction to web scraping caches and how caching can significantly reduce scraping costs and drastically improve performance.

PYTHON
TOOLS

How to Hide Your IP Address

In this article, we'll take a look at several ways to hide your IP address: proxies, the Tor network, VPNs, and other techniques.

BLOCKING
TOOLS
PROXIES

Web Scraping Without Blocking With Undetected ChromeDriver

In this tutorial, we'll take a look at Undetected ChromeDriver, a popular new web scraping tool. It's a Selenium extension that can bypass many scraper blocking techniques.

BLOCKING
PYTHON
TOOLS