Scraper doesn't see the data I see in the browser - why?

When scraping, we might notice that some page elements are visible in the web browser but missing from what our scraper receives. This is called dynamic data: it is created by JavaScript when the page loads. If our scraper does not run a full browser to execute JavaScript, it will never see these dynamically rendered elements.
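To illustrate the problem, here is a minimal sketch using only Python's standard library. The sample HTML, the `price` element id and the `static_text` helper are all made up for this example. The markup mimics what a plain HTTP client would receive: the element is empty, and only a browser executing the script would fill it in.

```python
from html.parser import HTMLParser

# Hypothetical page source as a non-browser scraper would receive it.
# The <div> is empty; a browser would fill it by running the script.
RAW_HTML = """
<html><body>
<div id="price"></div>
<script>document.getElementById("price").textContent = "19.99";</script>
</body></html>
"""


class ElementText(HTMLParser):
    """Collects the text found inside the element with a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0  # greater than 0 while inside the target element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def static_text(html, element_id):
    """Return the text of an element as seen WITHOUT running JavaScript."""
    parser = ElementText(element_id)
    parser.feed(html)
    return "".join(parser.text).strip()


# The scraper sees an empty element, even though a browser would show a price.
print(repr(static_text(RAW_HTML, "price")))
```

The parser never executes the `<script>` contents, so the value only exists as text inside the script, not inside the element we are looking for.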

There are several ways to scrape dynamic data. The most direct one is to run a real web browser:

Scraping Dynamic Websites Using Web Browsers

See our introductory tutorial on scraping with web browsers and automation toolkits like Puppeteer, Selenium and Playwright.
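As a rough sketch of the browser approach, the function below uses Playwright's synchronous Python API to load a page, wait for a dynamically rendered element and return the final HTML. The function name `render_page` and its parameters are illustrative, and the URL and selector you pass in are your own. It assumes Playwright is installed (`pip install playwright`, then `playwright install chromium`).

```python
def render_page(url, wait_selector, timeout_ms=10_000):
    """Load `url` in headless Chromium and return the HTML after
    JavaScript has rendered the element matching `wait_selector`."""
    # Imported here so this module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            # Block until the dynamic element exists (or the timeout expires).
            page.wait_for_selector(wait_selector, timeout=timeout_ms)
            # content() returns the current DOM serialized as HTML,
            # including elements created by JavaScript.
            return page.content()
        finally:
            browser.close()
```

Calling `render_page(...)` with a page URL and a CSS selector for the missing element should return HTML that can then be parsed like any static document. Waiting for a specific selector is usually more reliable than a fixed sleep, since it returns as soon as the element appears.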

Alternatively, the dynamic data is sometimes already present in the HTML document, just in a different location than where we see it in the browser. Most commonly, the data is stored in <script> elements as JavaScript variables and then unpacked into the HTML on page load.
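A minimal sketch of this idea, using only the standard library: the sample HTML, the `productData` variable name and the `extract_js_variable` helper are invented for illustration. It assumes the variable is assigned an object literal that happens to be valid JSON, which is common but not guaranteed on real pages.

```python
import json
import re

# Hypothetical page source: the data lives in a script variable and is
# copied into the empty <div> only when a browser runs the script.
HTML = """
<html><body>
<div id="product"></div>
<script>
  var productData = {"name": "Example Product", "price": 19.99};
  document.getElementById("product").textContent = productData.name;
</script>
</body></html>
"""


def extract_js_variable(html, variable_name):
    """Return the JSON object assigned to `var <variable_name> = {...};`
    or None if no such assignment is found."""
    # Non-greedy match up to the first "};". This is a simplification
    # that can cut the match short if "};" occurs inside the data itself.
    pattern = r"var\s+" + re.escape(variable_name) + r"\s*=\s*(\{.*?\});"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


data = extract_js_variable(HTML, "productData")
print(data["name"], data["price"])
```

Because no browser is involved, this approach is much cheaper than rendering the page, but it depends on the exact way a given site embeds its data.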

How to Scrape Hidden Web Data

For more, see this introductory article, which covers how to find hidden web data and common hidden web data scenarios.

Related Posts

Quick Intro to Parsing JSON with JMESPath in Python

An introduction to JMESPath, a JSON query language used in web scraping to parse scraped JSON datasets.

How to Scrape Hidden Web Data

The visible HTML doesn't always represent the whole dataset available on the page. In this article, we'll take a look at scraping hidden web data. What is it, and how can we scrape it using Python?

How to Ensure Web Scraped Data Quality

Ensuring consistent quality of web scraped data can be a difficult and exhausting task. In this article, we'll take a look at two popular Python tools, Cerberus and Pydantic, and how we can use them to validate data.