Web Scraping With Scrapy: Intro Through Examples
Tutorial on web scraping with Scrapy and Python through a real-world example project. Covers best practices, extension highlights and common challenges.
To pass data from the initial start_requests() method to the scrape callback, the Request.meta attribute can be used:
import scrapy


class MySpider(scrapy.Spider):
    name = 'myspider'

    def start_requests(self):
        urls = [...]
        for index, url in enumerate(urls):
            yield scrapy.Request(url, meta={'index': index})

    def parse(self, response):
        print(response.url)
        print(response.meta['index'])
In the example above, we use the Request.meta parameter to pass the index of each scheduled URL along with its request. The callback then reads that value back through response.meta.