How to pass data between callbacks in Scrapy?

Since Scrapy uses callbacks for scraping, transferring data between request steps can appear complicated. So, how do we fill a single item using multiple Scrapy requests?

For example, if we need to scrape 3 pages (product data, reviews and shipping options), we need 3 callbacks and have to pass the item from each one to the next:

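import scrapy

class MySpider(scrapy.Spider):
    name = 'myspider'

    def parse(self, response):
        # Step 1: product page, start the item with product data
        item = {"price": "123"}
        # carry the item to the next callback through Request.meta
        yield scrapy.Request(".../reviews", callback=self.parse_reviews, meta={"item": item})

    def parse_reviews(self, response):
        # Step 2: reviews page, pick the item up and add review data
        item = response.meta['item']
        item['reviews'] = ['awesome']
        yield scrapy.Request(".../shipping", callback=self.parse_shipping, meta={"item": item})

    def parse_shipping(self, response):
        # Step 3: shipping page, add the shipping price and return the finished item
        item = response.meta['item']
        item['shipping'] = "14.22 USD"
        yield item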

In this example, we're using Request.meta to carry our scraped item through all 3 requests. Each later callback reads the item back from response.meta. The first callback extracts product details, the second adds review data, and the last adds the shipping price and yields the completed item.

Provided by Scrapfly