Articles

What is Error 1015 (Cloudflare) and How to Fix it?

Discover why you're seeing Cloudflare Error 1015 and learn effective ways to resolve and prevent it.

What is HTTP Error 412 Precondition Failed and How to Fix it?

A quick look at HTTP status code 412: what it means, its common causes, and how it can be prevented.

What is HTTP Error 503 Service Unavailable and How to Fix it?

Understand what causes HTTP 503 errors, when they might indicate blocking, and how to effectively mitigate them.

Guide to Python requests POST method

Discover how to use Python's requests library for POST requests, including JSON, form data, and file uploads, along with response handling tips.

What is HTTP Error 429 Too Many Requests and How to Fix it

HTTP 429 is an infamous response code indicating that requests need to be throttled or distributed. Let's take a look at how to handle it.

Guide to Python Requests Headers

Our guide to request headers in the Python requests library: how to configure them and what they mean.

Axios vs Fetch: Which HTTP Client to Choose in JS?

Explore the differences between Fetch and Axios - two essential HTTP clients in JavaScript - and discover which is best suited for your project.

What is Status Code 403 Forbidden and How to Fix it

The 403 Forbidden HTTP status code means the client is not allowed to view the resource, but why? Let's take a look at the reasons and how to bypass it.

cURL vs Wget: Key Differences Explained

curl and wget are both popular terminal tools, but they are often used for different tasks. Let's take a look at the differences.

What is HTTP 415 Error? (Unsupported Media Type)

A quick look at HTTP status code 415: what does it mean, and how can it be prevented and bypassed in scraping?

What is HTTP 422 Error? (Unprocessable Entity)

The 422 Unprocessable Entity error is usually caused by a semantically invalid request. Learn the causes of HTTP error 422 and how to fix your requests.

What is HTTP 409 Error? (Conflict)

HTTP status code 409 generally means a conflict or mismatch with the server state. Learn why it happens and how to avoid it.

What is HTTP 413 Error? (Payload Too Large)

HTTP status code 413 generally means that POST or PUT data is too large. Let's take a look at how to handle this.

What is HTTP 406 Error? (Not Acceptable)

HTTP status code 406 generally means the Accept-* family of headers is misconfigured. Here's how to prevent it.

What is HTTP 405 Error? (Method Not Allowed)

A quick look at HTTP status code 405: what does it mean, and how can it be prevented and bypassed in scraping?

Web Scraping With Go

Learn web scraping with Golang, from native HTTP requests and HTML parsing to a step-by-step guide to using Colly, the Go web crawling package.

Sending HTTP Requests With Curlie: A better cURL

In this guide, we'll explore Curlie, a better version of cURL. We'll start by defining what Curlie is and how it compares to cURL. We'll also go over a step-by-step guide on using and configuring Curlie to send HTTP requests.

How to Use cURL For Web Scraping

In this article, we'll go over a step-by-step guide on sending and configuring HTTP requests with cURL. We'll also explore advanced usages of cURL for web scraping, such as scraping dynamic pages and avoiding getting blocked.

Use Curl Impersonate to scrape as Chrome or Firefox

Learn how to prevent TLS fingerprinting by impersonating normal web browser configurations. We'll start by explaining what Curl Impersonate is, how it works, and how to install and use it. Finally, we'll explore using it with Python to avoid web scraping blocking.

FlareSolverr Guide: Bypass Cloudflare While Scraping

In this article, we'll explore the FlareSolverr tool and how to use it to get around Cloudflare while scraping. We'll start by explaining what FlareSolverr is, how it works, and how to install and use it. Let's get started!

How to Handle Cookies in Web Scraping

An introduction to cookies in web scraping: what they are and how to take advantage of the cookie process to authenticate or set website preferences.

How to Effectively Use User Agents for Web Scraping

In this article, we’ll take a look at the User-Agent header, what it is and how to use it in web scraping. We'll also generate and rotate user agents to avoid web scraping blocking.

How to Scrape in Another Language, Currency or Location

Localization adapts a website's content by changing its language and currency. So, how do we scrape it? We'll take a look at the most common methods for changing language, currency and other locality details in web scraping.

How to Avoid Web Scraper IP Blocking?

How IP addresses are used to block web scrapers. Understand IP metadata and fingerprinting techniques to avoid web scraper blocks.

How Headers Are Used to Block Web Scrapers and How to Fix It

An introduction to web scraping headers: what they mean, how to configure them in web scrapers, and how to avoid being blocked.

Web Scraping GraphQL with Python

An introduction to web scraping GraphQL-powered websites: how to create GraphQL queries in Python and what some common challenges are.

Web Scraping with Python

An introductory tutorial to web scraping with Python: how to collect and parse public data, plus challenges, best practices and an example project.

Web Scraping With R Tutorial and Example Project

An introduction to web scraping with the R language: how to handle HTTP connections and parse HTML files, plus best practices, tips and an example project.

Web Scraping With Ruby

An introduction to web scraping with Ruby: how to handle HTTP connections and parse HTML files for data, plus best practices, tips and an example project.

Web Scraping With NodeJS and JavaScript

In this article, we'll take a look at scraping using JavaScript through NodeJS. We'll cover common web scraping libraries and frequently encountered challenges, and wrap everything up by scraping etsy.com.

Web Scraping With PHP 101

An introduction to web scraping with PHP: how to handle HTTP connections and parse HTML files for data, plus best practices, tips and an example project.