Monitoring

Scrapfly offers a detailed real-time monitoring dashboard that logs every scrape request and its result for the selected Scrapfly project and environment. You can filter and inspect these logs to assess overall scraping performance.

See Your Monitoring Dashboard

Logs Retention
Plan          Log retention
FREE          1 week
DISCOVERY     1 week
PRO           2 weeks
STARTUP       3 weeks
ENTERPRISE    4 weeks
Screenshots, debug data, and cache entries belong to a log and inherit its retention: as soon as the log is deleted, they are deleted too.
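As a rough illustration of the retention periods listed above, the sketch below computes when a log becomes eligible for deletion. It assumes retention is counted from the moment the log is created, which this page does not state; the function name log_expiry is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention per plan, in weeks, copied from the retention table.
RETENTION_WEEKS = {
    "FREE": 1,
    "DISCOVERY": 1,
    "PRO": 2,
    "STARTUP": 3,
    "ENTERPRISE": 4,
}


def log_expiry(created_at: datetime, plan: str) -> datetime:
    """Return when a log created at `created_at` reaches the end of its retention.

    Assumption: retention starts at log creation time.
    """
    return created_at + timedelta(weeks=RETENTION_WEEKS[plan.upper()])


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(log_expiry(created, "PRO"))  # 2024-01-15 00:00:00+00:00
```

Screenshots, debug data, and cache entries attached to the log would expire at the same moment.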

Filters

Filters allow you to sample or search logs from the monitoring section to investigate or check that everything works as expected.

Time Frame

Eight pre-configured time frames are available: past month, past week, past day, past three hours, past hour, past 30 minutes, past 15 minutes, and past 5 minutes. You can also define an arbitrary time frame.

Dimensions

You can filter on the following dimensions:

Name          Value                  Supports multiple   Description
url           string                 Yes                 Filter by URL. Supports the glob operator *, e.g. https://*.wikipedia.org/wiki/*
success       bool (true or false)   No                  Filter successful or failed requests. Failed requests include status codes >= 500 and network errors.
domain        string                 Yes                 Filter by domain (TLD+n), subdomain included. Supports the glob operator *, e.g. *.google.com
root_domain   string                 Yes                 Filter by root domain (TLD+1), subdomain excluded. Supports the glob operator *, e.g. goo*.com
method        string                 Yes                 Filter by HTTP method. Supported values: GET, PUT, POST, PATCH
status_code   int                    Yes                 Filter by status code, e.g. 401 or 502,504
cost          int                    1                   Filter by the cost spent in API credits
origin        string                 Yes                 Filter by origin. Supported values: API, SCHEDULER
retries       int                    No                  Filter by number of retries
duration      int                    No                  Filter by duration
error_code    string                 Yes                 Filter by error code, e.g. ERR::SCRAPE::SCENARIO_EXECUTION
errored       bool (true or false)   No                  Filter scrapes that have an error

For dimensions that support multiple values, separate the values with a comma, for example status_code=200,204

Chaining Filters & Multi Values

You can chain multiple filters by separating them with a space.

You can also filter on multiple values of one dimension; in this case, the OR operator is applied between the values.

Operators

The following operators are supported:

  • = Equal
  • ! Not equal
  • > Greater than
  • >= Greater than or equal to
  • < Less than
  • <= Less than or equal to
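The dimensions, chaining rule, and operators described above can be combined into a single filter expression. The sketch below is a minimal string builder under one assumption: each filter is written as name, operator, and value with no spaces in between, following the comma-separated example given earlier. The helper names build_filter and build_query are hypothetical and not part of any Scrapfly SDK.

```python
SUPPORTED_OPERATORS = {"=", "!", ">", ">=", "<", "<="}


def build_filter(name: str, operator: str, *values) -> str:
    """Render one filter, e.g. status_code=502,504."""
    if operator not in SUPPORTED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if not values:
        raise ValueError("at least one value is required")
    # Booleans are rendered as lowercase true/false, other values as-is.
    rendered = ",".join(
        str(v).lower() if isinstance(v, bool) else str(v) for v in values
    )
    return f"{name}{operator}{rendered}"


def build_query(*filters: str) -> str:
    """Chain several filters with a space."""
    return " ".join(filters)


query = build_query(
    build_filter("domain", "=", "*.google.com"),
    build_filter("status_code", "=", 502, 504),  # multiple values are OR-ed
    build_filter("retries", ">=", 1),
)
print(query)  # domain=*.google.com status_code=502,504 retries>=1
```

Whether every comparison operator applies to every dimension is not specified here, so treat the retries example as illustrative only.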

API

This feature is only available on the ENTERPRISE plan.

With the monitoring API, you can query aggregated or domain-specific metrics.

  • All dates are in UTC
  • JSON and MSGPACK formats are available. Use the Accept header to choose between them: application/json or application/msgpack
  • Pass your API key via the key=xxx query parameter or the header Authorization: Bearer xxxx
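The points above can be sketched as a small request builder using only the Python standard library. The base URL https://api.scrapfly.io is an assumption (it does not appear on this page), and monitoring_request is a hypothetical helper name. The request object is only constructed here, not sent.

```python
import urllib.request

API_BASE = "https://api.scrapfly.io"  # assumed base URL, not stated on this page


def monitoring_request(path: str, api_key: str, use_msgpack: bool = False):
    """Build a GET request with the Accept and Authorization headers set."""
    accept = "application/msgpack" if use_msgpack else "application/json"
    return urllib.request.Request(
        API_BASE + path,
        headers={
            "Accept": accept,
            "Authorization": f"Bearer {api_key}",
        },
        method="GET",
    )


req = monitoring_request("/scrape/monitoring/metrics", "YOUR_API_KEY")
print(req.full_url)
print(req.get_header("Accept"))
```

To actually call the API, pass the request to urllib.request.urlopen and decode the body according to the format you asked for.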

GET /scrape/monitoring/metrics

Retrieve metrics for the current subscription period at three aggregation levels: account, project, and targets (top 100).

Parameters
Name Description
format
  • structured default
  • prometheus
aggregation

Aggregations to enable. Possible values are:

  • account default
  • project
  • target

You can combine several aggregations in one request, for example account,project,target

period
  • last5m
  • last1h
  • last7d
  • last24h default
  • subscription (use current subscription period)
group_subdomain

Enable or disable subdomain grouping when the target aggregation is requested. When using subdomain grouping, provide the root domain as input, e.g. web-scraping.dev and not www.web-scraping.dev

  • true default
  • false
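A query for this endpoint could be assembled as in the sketch below, which validates the documented values before building the URL. The base URL and the function name metrics_url are assumptions, and the API key is passed via the key query parameter for brevity.

```python
from urllib.parse import urlencode

API_BASE = "https://api.scrapfly.io"  # assumed base URL, not stated on this page

FORMATS = {"structured", "prometheus"}
AGGREGATIONS = {"account", "project", "target"}
PERIODS = {"last5m", "last1h", "last24h", "last7d", "subscription"}


def metrics_url(api_key, aggregation=("account",), period="last24h",
                fmt="structured", group_subdomain=True):
    """Build the URL for GET /scrape/monitoring/metrics."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if period not in PERIODS:
        raise ValueError(f"unknown period: {period}")
    unknown = set(aggregation) - AGGREGATIONS
    if unknown:
        raise ValueError(f"unknown aggregation(s): {sorted(unknown)}")
    params = {
        "key": api_key,
        "format": fmt,
        "aggregation": ",".join(aggregation),  # several values, comma-separated
        "period": period,
        "group_subdomain": "true" if group_subdomain else "false",
    }
    # safe="," keeps the comma between aggregations unescaped.
    return f"{API_BASE}/scrape/monitoring/metrics?{urlencode(params, safe=',')}"


print(metrics_url("YOUR_API_KEY", aggregation=("account", "project"), period="last7d"))
```

The defaults mirror the ones listed above (structured, account, last24h, true), so calling metrics_url with only an API key reproduces the default request.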

GET /scrape/monitoring/metrics/target

Retrieve metrics and timeseries for a given target

Parameters
Name Description
domain
required
  • httpbin.dev
  • web-scraping.dev
  • ...
period

This parameter is mutually exclusive with start and end

  • last5m default
  • last1h
  • last7d
  • last24h
  • subscription (use current subscription period)
group_subdomain

Enable or disable subdomain grouping. When using subdomain grouping, provide the root domain as input, e.g. web-scraping.dev and not www.web-scraping.dev

  • true default
  • false
start Start of the time range, in Y-m-d H:i:s format (see the example below). Mutually exclusive with the period parameter; when start is set, the end parameter must also be set.

Examples:
  • 2024-01-01 00:00:00
end End of the time range, in Y-m-d H:i:s format (see the example below). Mutually exclusive with the period parameter; when end is set, the start parameter must also be set.

Examples:
  • 2024-01-01 00:00:00
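The rules for period, start, and end can be checked on the client before sending a request. The sketch below is one way to do that; target_params is a hypothetical helper, and it assumes the datetimes passed in are already expressed in UTC.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of Y-m-d H:i:s


def target_params(domain, period=None, start=None, end=None, group_subdomain=True):
    """Build query parameters for GET /scrape/monitoring/metrics/target."""
    if period is not None and (start is not None or end is not None):
        raise ValueError("period is mutually exclusive with start and end")
    if (start is None) != (end is None):
        raise ValueError("start and end must be set together")
    params = {
        "domain": domain,
        "group_subdomain": "true" if group_subdomain else "false",
    }
    if start is not None:
        params["start"] = start.strftime(DATE_FORMAT)
        params["end"] = end.strftime(DATE_FORMAT)
    elif period is not None:
        params["period"] = period
    # With neither period nor start/end, the API default period applies.
    return params


print(target_params(
    "web-scraping.dev",
    start=datetime(2024, 1, 1),
    end=datetime(2024, 1, 2),
))
```

The returned dictionary can be URL-encoded and appended to the endpoint path together with the API key.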

Summary