Customize Request

You can customize every aspect of a scrape request: method, headers (including cookies), and location.

Method

Simply call the API with the desired method and provide a body if the method requires one; both will be forwarded to the upstream website.
Available methods are GET, PUT, POST, PATCH, and HEAD.

A GET request is the most common way to scrape:

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])

POST is commonly used to submit forms. When using this method, if you do not set a content-type header, application/x-www-form-urlencoded is set for you and we assume you are sending urlencoded data. If you want to submit JSON data, for example, you need to specify content-type: application/json.

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&headers[content-type]=application%2Fx-www-form-urlencoded&correlation_id=fa2b4f1c-e293-4770-a03d-4a8d67760d0b"
payload = "test=1"
response = requests.request("POST", url, data=payload)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key                    = ""
url                    = "https://httpbin.dev/anything"
headers[content-type]  = "application/x-www-form-urlencoded"
correlation_id         = "fa2b4f1c-e293-4770-a03d-4a8d67760d0b"

Full example with a JSON POST and configured headers:

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&headers[content-type]=application%2Fjson&headers[accept]=application%2Fjson&correlation_id=fa2b4f1c-e293-4770-a03d-4a8d67760d0b"
payload = "{\"test\": \"example\"}"
response = requests.request("POST", url, data=payload)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key                    = ""
url                    = "https://httpbin.dev/anything"
headers[content-type]  = "application/json"
headers[accept]        = "application/json"
correlation_id         = "fa2b4f1c-e293-4770-a03d-4a8d67760d0b"

PUT is also commonly used to submit forms. As with POST, if you do not set a content-type header, application/x-www-form-urlencoded is set for you and we assume you are sending urlencoded data. To submit JSON data, specify content-type: application/json.

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything"
payload = "test=1"
response = requests.request("PUT", url, data=payload)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])

PATCH follows the same rules: if you do not set a content-type header, application/x-www-form-urlencoded is set for you and we assume you are sending urlencoded data. To submit JSON data, specify content-type: application/json.

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything"
payload = "test=1"
response = requests.request("PATCH", url, data=payload)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])

Headers

The Scrapfly API allows you to customize the headers sent to the upstream website in a very simple way. Header values must be urlencoded to prevent side effects.

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&headers[content-type]=application%2Fjson&headers[x-requested-with]=XMLHttpRequest"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key                        = ""
url                        = "https://httpbin.dev/anything"
headers[content-type]      = "application/json"
headers[x-requested-with]  = "XMLHttpRequest"
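Rather than percent-encoding header values by hand, you can build the query string with urllib.parse.urlencode, which handles the encoding for you. A minimal sketch (build_scrape_url is a hypothetical helper for illustration, not part of the API):

```python
from urllib.parse import urlencode

def build_scrape_url(api_key, target_url, headers=None):
    """Build a Scrapfly scrape URL, percent-encoding the target URL
    and every header value (hypothetical helper for illustration)."""
    params = {"key": api_key, "url": target_url}
    for name, value in (headers or {}).items():
        # Headers are passed as headers[<name>]=<urlencoded value>
        params[f"headers[{name}]"] = value
    return "https://api.scrapfly.io/scrape?" + urlencode(params)

url = build_scrape_url(
    "__API_KEY__",
    "https://httpbin.dev/anything",
    {"content-type": "application/json", "x-requested-with": "XMLHttpRequest"},
)
```

Note that urlencode also percent-encodes the brackets in headers[...] (as %5B/%5D), which is standard query-string encoding and equivalent to the literal form shown above.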

Cookies

We created a dedicated section about cookies because most similar scrape APIs expose cookie customization as special parameters. Cookies are headers and should not be treated as "special."

Set-Cookie

This header should never be sent from the client's side. It is a response header sent when the upstream website wants to register a cookie with some parameters (domain scope, expiration, security flags, and the like).

Cookie

When calling an upstream website as a client, you set a Cookie header. There are different variations of the Cookie header:

  • Single cookie: Cookie: test=1
  • Multiple cookies: Cookie: test=1;lang=fr;currency=USD
You can set multiple cookies through a single headers[cookie] parameter using this notation.
import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&headers[cookie]=lang%3Dfr%3Bcurrency%3DUSD%3Btest%3D1"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key              = ""
url              = "https://httpbin.dev/anything"
headers[cookie]  = "lang=fr;currency=USD;test=1"
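The cookie value above can be assembled from a plain dict and percent-encoded with urllib.parse.quote before being placed in the query string. A sketch:

```python
from urllib.parse import quote

# Join individual cookies into a single Cookie header value.
cookies = {"lang": "fr", "currency": "USD", "test": "1"}
cookie_header = ";".join(f"{name}={value}" for name, value in cookies.items())

# Percent-encode it for use as headers[cookie]=... in the query string
# (safe="" so that '=' and ';' are encoded too).
encoded = quote(cookie_header, safe="")
```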

Geo Targeting

Geo-targeting is available. You can set the desired country via its ISO 3166-1 alpha-2 code. Available countries are defined by your proxy pool. If a country is not available in the Public Pool, you can create your own private pool with the desired countries. The more you restrict countries, the fewer IPs you will have.

The API supports several query expressions to indicate the target country:

  • Simple country selection: country=us
  • Multi-country selection with random selection: country=us,ca,mx
  • Multi-country selection with weighted random selection (higher weights are more probable): country=us:1,ca:5,mx:3
  • Country exclusion, every country except: country=-gb
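These expressions are passed as the country query parameter. A sketch using requests' params, prepared but not sent so you can inspect the encoded URL (replace __API_KEY__ with your key):

```python
import requests

# Weighted selection: ca is five times more likely to be picked than us.
req = requests.Request(
    "GET",
    "https://api.scrapfly.io/scrape",
    params={
        "key": "__API_KEY__",
        "url": "https://httpbin.dev/anything",
        "country": "us:1,ca:5,mx:3",
    },
).prepare()

# req.url now holds the fully encoded URL; send it later with
# requests.Session().send(req).
```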

If you want to know more about proxies, you can check out our dedicated section.

You can also spoof the browser's geolocation by using geolocation=48.856614,2.3522219 (latitude,longitude).
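The geolocation parameter is passed the same way as any other query parameter; a sketch, prepared without sending:

```python
import requests

# Spoof the browser geolocation to Paris (latitude,longitude).
req = requests.Request(
    "GET",
    "https://api.scrapfly.io/scrape",
    params={
        "key": "__API_KEY__",
        "url": "https://httpbin.dev/anything",
        "geolocation": "48.856614,2.3522219",
    },
).prepare()
```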

Available countries per proxy pool type

  • AL - Albania
  • AM - Armenia
  • AR - Argentina
  • AT - Austria
  • AU - Australia
  • BE - Belgium
  • BG - Bulgaria
  • BO - Bolivia
  • BR - Brazil
  • BY - Belarus
  • CA - Canada
  • CH - Switzerland
  • CL - Chile
  • CN - China
  • CO - Colombia
  • CZ - Czechia
  • DE - Germany
  • DK - Denmark
  • EC - Ecuador
  • EE - Estonia
  • ES - Spain
  • FI - Finland
  • FR - France
  • GB - United Kingdom
  • GE - Georgia
  • GR - Greece
  • HR - Croatia
  • HU - Hungary
  • IE - Ireland
  • IL - Israel
  • IN - India
  • IS - Iceland
  • IT - Italy
  • JP - Japan
  • KR - South Korea
  • LT - Lithuania
  • LV - Latvia
  • MX - Mexico
  • NL - Netherlands
  • NO - Norway
  • NZ - New Zealand
  • PE - Peru
  • PK - Pakistan
  • PL - Poland
  • PT - Portugal
  • RO - Romania
  • RU - Russia
  • SA - Saudi Arabia
  • SE - Sweden
  • SK - Slovakia
  • TR - Türkiye
  • UA - Ukraine
  • US - United States
import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Ftools.scrapfly.io%2Fapi%2Finfo%2Fip&country=nl"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key      = ""
url      = "https://tools.scrapfly.io/api/info/ip"
country  = "nl"

Language

Select the desired language. By default, the language of the proxy location is used. Behind the scenes, this configures the Accept-Language HTTP header. If the website supports the language, the content will be returned in that language. You cannot set both the lang parameter and an Accept-Language header.

You can pass multiple languages at once, separated by a comma; lang also supports the locale format {lang iso2}-{country iso2}. The order matters: the website negotiates the content language based on it. For example, lang=fr,en-US,en generates Accept-Language: fr-{proxy country iso2},fr;q=0.9,en-US;q=0.8,en;q=0.7.
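The actual negotiation happens on Scrapfly's side; the sketch below only mirrors the documented pattern to make the mapping concrete (accept_language is a hypothetical helper, not part of the API):

```python
def accept_language(langs, proxy_country):
    """Mirror the documented lang -> Accept-Language mapping
    (illustrative only; the real logic runs server-side)."""
    first = langs[0]
    # A bare language in first position is expanded with the proxy country.
    header = [f"{first}-{proxy_country.upper()}" if "-" not in first else first]
    for i, lang in enumerate(langs):
        header.append(f"{lang};q=0.{9 - i}")  # descending quality values
    return ",".join(header)

print(accept_language(["fr", "en-US", "en"], "us"))
# fr-US,fr;q=0.9,en-US;q=0.8,en;q=0.7
```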

Most of the time, customers want to stick to English regardless of the proxy location: lang=en-US,en
import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&tags=project%3Adefault&lang=fr"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key   = ""
url   = "https://httpbin.dev/anything"
tags  = "project:default"
lang  = "fr"

Operating System

We do not recommend using this feature unless you have a specific constraint regarding the operating system and the target's behavior.

By default, the operating system is selected automatically; if not specified, it is chosen at random. You cannot set both the os parameter and a User-Agent header. Possible values are win, win10, win11, mac, linux, chromeos.

import requests

url = "https://api.scrapfly.io/scrape?key=__API_KEY__&url=https%3A%2F%2Fhttpbin.dev%2Fanything&tags=project%3Adefault&os=win"

response = requests.request("GET", url)

print(response.text)

# import json
# print(json.loads(response.text)['result']['content'])
# print(json.loads(response.text)['result']['status_code'])
Decoded query parameters (api.scrapfly.io/scrape):

key   = ""
url   = "https://httpbin.dev/anything"
tags  = "project:default"
os    = "win"

Integration