How to pass custom parameters to Scrapy spiders?

To configure Scrapy spiders with custom execution parameters, use the -a option of Scrapy's CLI.

When the crawl command runs, Scrapy sets each -a parameter as an attribute on the spider instance (e.g. -a country=US -> self.country). Values passed this way always arrive as strings.

For example, here we are passing country and proxy parameters to our scraper:

scrapy crawl myspider -a country=US -a "proxy=http://222.22.33.44:9000"

import scrapy

class MySpider(scrapy.Spider):
    name = "myspider"

    def parse(self, response):
        # Values passed with -a are available as instance attributes
        print(self.country)  # "US"
        print(self.proxy)  # "http://222.22.33.44:9000"

This is a simple way to customize individual runs of the scrapy crawl command without changing the spider code.

Additionally, the -s CLI option can be used to set or override any Scrapy setting for a single run (e.g. -s DOWNLOAD_DELAY=2).

Provided by Scrapfly

This knowledgebase is provided by Scrapfly, a web scraping API that lets you scrape any website without getting blocked and implements dozens of other web scraping conveniences.