cURL is a great command-line tool for sending HTTP requests and a valuable asset in the web scraping toolbox. However, its syntax and output can be confusing. Is there a better alternative?
In this guide, we'll explore Curlie, a friendlier way to use cURL. We'll start by defining what Curlie is and how it compares to cURL. Then, we'll go over a step-by-step guide on using and configuring Curlie to send HTTP requests. Let's get started!
What is Curlie?
Curlie is a frontend for the regular cURL. Its interface is modeled after HTTPie, a CLI HTTP client known for its simple syntax and colorful, formatted output.
Curlie combines cURL's feature set with HTTPie's easy syntax and output formatting, and it accepts commands written in either the cURL or the HTTPie style.
How To Install Curlie?
Curlie can be installed on all major operating systems from the command line through different package managers.
Mac
brew install curlie
# or
curl -sS https://webinstall.dev/curlie | bash
Linux
curl -sS https://webinstall.dev/curlie | bash
# or
eget rs/curlie -a deb --to=curlie.deb
sudo dpkg -i curlie.deb
Windows
curl.exe -A "MS" https://webinstall.dev/curlie | powershell
How To Use Curlie?
In the following sections, we'll explain how to use Curlie to send and configure HTTP requests. Curlie accepts both cURL and HTTPie syntax, and since we covered cURL in a previous guide, we'll stick to the HTTPie syntax in this one.
That being said, all the cURL options used by Curlie under the hood can be viewed by adding the --curl option.
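For example, here's how we could inspect the underlying cURL command for a basic GET request (the printed flags depend on your Curlie and cURL versions):
curlie --curl https://httpbin.dev/get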
Configuring HTTP Method
All Curlie requests start with the curlie command. By default, requests are sent using the GET HTTP method:
curlie https://httpbin.dev/get
Running the above command sends a GET request and returns a neatly formatted response. It also prints the response headers, which cURL doesn't display by default:
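The formatted output looks roughly like the following (headers and body heavily trimmed here; the exact values will differ):
HTTP/2 200
content-type: application/json

{
    "args": {},
    "url": "https://httpbin.dev/get"
}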
To change the HTTP method, we simply specify the method name before the URL, while the -v option enables verbose output so the request details are printed as well. For example, here is how to send a POST request with Curlie; the same approach works for other HTTP methods (HEAD, PUT, DELETE, etc.):
curlie -v POST https://httpbin.dev/anything
Adding Headers, Cookies and Body
Headers
Adding headers with Curlie is pretty straightforward. All we have to do is specify the header name and its value separated by a colon:
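Here's an illustrative command that sends custom User-Agent and Accept headers to httpbin.dev's /headers endpoint, which echoes them back. Cookies can be handled the same way; one simple approach is to reuse cURL's -b option, which Curlie passes through:
curlie https://httpbin.dev/headers User-Agent:my-scraper Accept:application/json
# cookies can be passed through cURL's -b option
curlie -b "currency=USD" https://httpbin.dev/cookies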
Lastly, let's explore adding a request body to POST requests. For this, we simply pass the data as key=value pairs, which Curlie converts to JSON:
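For instance, here's an illustrative POST request with two body fields (the field names are arbitrary); string values are passed as key=value and end up in the JSON body:
curlie -v POST https://httpbin.dev/anything title="curlie guide" category=http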
Downloading Files
Sending HTTP requests to download binary data is another common use case. Just like the regular cURL, Curlie supports file downloads through the -O option:
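For example, assuming a PDF exists at the illustrative path below (swap in a real file URL):
# the file path below is a placeholder for an actual PDF URL
curlie -O https://web-scraping.dev/assets/some-document.pdf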
The above Curlie command downloads a PDF file from web-scraping.dev to the current working directory. To change the download directory, we can use the --output-dir option:
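Using the same placeholder URL, the following saves the file into a local ./downloads folder instead of the current directory:
curlie -O --output-dir ./downloads https://web-scraping.dev/assets/some-document.pdf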
Basic Authentication
Basic authentication requires simple credentials: a username and a password. For example, requesting https://httpbin.dev/basic-auth/user/passwd from the browser will prompt for these credentials before proceeding with the request:
To set basic authentication with Curlie, we can use the -u or --user option:
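The credentials for the httpbin.dev endpoint above are user and passwd, so the request looks like this:
curlie -u user:passwd https://httpbin.dev/basic-auth/user/passwd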
From the response, we can see that the request was authenticated:
{
    "authorized": true,
    "user": "user"
}
Curlie can also handle different types of authentication, such as cookie and bearer token authentication. For the detailed instructions, refer to our guide on managing authentication with cURL.
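As a quick illustration, a bearer token is just an Authorization header, so it can be attached the same way as any other header (the token value below is a placeholder):
curlie https://httpbin.dev/headers 'Authorization:Bearer my-secret-token'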
Adding Proxies
Websites track IP addresses to identify potential traffic abuse and block the offending addresses.
Hence, using proxies, especially for web scraping, distributes the traffic load across multiple IP addresses, making it much harder for websites to detect and block the scraper.
To use proxies with Curlie, we can use the -x or --proxy option, followed by the proxy type, domain, and port:
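For example, here's an illustrative request routed through a hypothetical HTTP proxy (replace the address and port with your own proxy's details):
curlie -x http://proxy.example.com:8080 https://httpbin.dev/ip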
For further details on proxies, including their types, differences, and how they compare, refer to our dedicated guide.
FAQ
To wrap up this guide on Curlie, let's have a look at some frequently asked questions.
Can I web scrape with Curlie?
Yes, but using Curlie for web scraping is best suited to extracting small amounts of data or to development and debugging. In a previous guide, we covered using cURL for web scraping, and the same techniques apply to Curlie.
Are there alternatives for Curlie?
Yes. Curl Impersonate is a modified cURL build that avoids blocking by mimicking Chrome and Firefox browser configurations. Another alternative HTTP client to cURL is Postman. We have covered both Curl Impersonate and Postman in previous guides.
Summary
In this article, we explained what Curlie is and how it differs from the regular cURL. We went through a step-by-step guide on using it to configure and send HTTP requests, covering:
Sending HTTP requests with different HTTP methods.
Adding headers, cookies, and request bodies.
Downloading files and using basic authentication.
Routing requests through proxies.