The User-Agent header is one of the essential HTTP headers: it identifies the request sender's device with details such as the device type, operating system, browser name, and version. A missing or misconfigured User-Agent can lead to request blocking.
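When no User-Agent is supplied, stock cURL builds identify themselves as curl/&lt;version&gt;, a value anti-bot systems recognize instantly. As a quick sketch (assuming an unpatched cURL build, where the default follows this pattern), the default value can be derived from the installed version:

```shell
#!/usr/bin/env bash
# Assumption: stock cURL sends "curl/<version>" as its default User-Agent
# when the -A option is not used. Derive it from the installed binary:
version=$(curl --version | head -n1 | awk '{print $2}')
echo "Default User-Agent: curl/${version}"
```

This is exactly the fingerprint the -A option is meant to replace with a realistic browser string.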
To change the cURL User-Agent, we can use cURL's -A option:
curl -A "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0" https://httpbin.dev/headers

The response will include the modified cURL User-Agent:
{
  "headers": {
    "Accept": [
      "*/*"
    ],
    "Accept-Encoding": [
      "gzip"
    ],
    "Host": [
      "httpbin.dev"
    ],
    "User-Agent": [
      "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0"
    ]
  }
}

Another alternative to setting the User-Agent with cURL is passing it as a regular HTTP header using the -H option:
curl -H "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0" https://httpbin.dev/headers

Finally, we can rotate the cURL User-Agent using bash:
user_agents=(
  "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0"
  "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0"
  "Mozilla/5.0 (Windows NT 6.1; rv:109.0) Gecko/20100101 Firefox/113.0"
)

# get a random user agent from the pool
get_random_user_agent() {
  local random_index=$((RANDOM % ${#user_agents[@]}))
  echo "${user_agents[random_index]}"
}
user_agent=$(get_random_user_agent)
curl -A "$user_agent" https://httpbin.dev/headers

For more details on cURL, refer to our dedicated guide.
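The rotation logic can be sanity-checked locally without sending any requests. This sketch re-declares the same array and function, draws several samples, and confirms each one belongs to the pool:

```shell
#!/usr/bin/env bash
user_agents=(
  "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0"
  "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0"
  "Mozilla/5.0 (Windows NT 6.1; rv:109.0) Gecko/20100101 Firefox/113.0"
)

get_random_user_agent() {
  local random_index=$((RANDOM % ${#user_agents[@]}))
  echo "${user_agents[random_index]}"
}

# Draw 10 samples and verify each is a member of the pool.
for _ in {1..10}; do
  ua=$(get_random_user_agent)
  found=0
  for candidate in "${user_agents[@]}"; do
    [[ "$ua" == "$candidate" ]] && found=1
  done
  if (( ! found )); then
    echo "unexpected value: $ua"
    exit 1
  fi
done
echo "rotation OK"
```

Over many requests the picks spread evenly across the pool, since $RANDOM modulo the array length is close to uniform for a small list like this one.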