Scraper doesn't see the data I see in the browser - why?

When scraping, we might notice that some page elements are visible in the web browser but missing from what our scraper receives. This is usually dynamic data: content that JavaScript creates when the page loads. If our scraper is not running a full browser to execute JavaScript, it will never see these dynamically rendered elements.

There are several ways to scrape dynamic data. One is to use a real web browser:

How to Scrape Dynamic Websites Using Headless Web Browsers

See our introductory tutorial on scraping with web browsers and automation toolkits like Puppeteer, Selenium and Playwright.
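
As a rough illustration of the browser approach, here is a minimal sketch using Playwright's synchronous Python API. It assumes Playwright and a Chromium build are installed (`pip install playwright`, then `playwright install chromium`). The function name `render_html` and its parameters are hypothetical, chosen only for this example.

```python
def render_html(url, wait_for_selector=None, timeout_ms=15000):
    """Load a page in headless Chromium and return the HTML after JavaScript has run.

    Hypothetical helper for illustration; the name and parameters are assumptions.
    """
    # Imported lazily so this module can be loaded even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, timeout=timeout_ms)
            if wait_for_selector:
                # Wait until the dynamically created element is attached and visible.
                page.wait_for_selector(wait_for_selector, timeout=timeout_ms)
            # page.content() returns the serialized DOM, including rendered elements.
            return page.content()
        finally:
            browser.close()
```

Calling something like `render_html("https://example.com", wait_for_selector="#product")` (the URL and selector are placeholders) would return HTML that can then be parsed like any static page. Running a browser is slower and heavier than a plain HTTP request, so it is usually worth checking first whether the data is already in the page source.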

Alternatively, dynamic data is sometimes already present in the HTML document, just in a different location than where we see it in the browser. Most commonly the data is hidden in <script> elements as JavaScript variables and then unpacked into the HTML on page load.
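
To make the hidden data case concrete, here is a minimal sketch using only the Python standard library. The HTML, the variable name `window.__DATA__` and the helper `extract_hidden_data` are all made up for illustration. Real pages use different variable names, and this only works when the assigned value is valid JSON rather than an arbitrary JavaScript object literal.

```python
import json
import re

# Hypothetical page source: the visible <div> is empty until JavaScript runs,
# but the data is already sitting in a <script> element.
HTML = """
<html>
  <body>
    <div id="product"></div>
    <script>
      window.__DATA__ = {"product": {"name": "Example Widget", "price": 19.99}};
      document.getElementById("product").innerText = window.__DATA__.product.name;
    </script>
  </body>
</html>
"""


def extract_hidden_data(html, variable="window.__DATA__"):
    """Return the JSON value assigned to `variable` in the page source, or None."""
    # Find "<variable> = " and remember where the assigned value starts.
    match = re.search(re.escape(variable) + r"\s*=\s*", html)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the trailing ";" and the rest of the script).
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


data = extract_hidden_data(HTML)
print(data["product"]["name"], data["product"]["price"])
```

Using `raw_decode` instead of a regular expression for the value itself avoids guessing where nested braces end, which is a common source of bugs in this kind of extraction.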

How to Scrape Hidden Web Data

For more, see this introductory article, which covers how to find hidden web data and common hidden web data scenarios.

Question tagged: Data Parsing, Headless Browsers

Related Posts

Intro to Parsing HTML and XML with Python and lxml

In this tutorial, we'll take a deep dive into lxml, a powerful Python library for parsing HTML and XML effectively. We'll start by explaining what lxml is, how to install it and how to use it for parsing HTML and XML files. Finally, we'll go over a practical web scraping example with lxml.

How to Parse XML

In this article, we'll explain XML parsing. We'll start by defining XML files and their format, then show how to navigate them for data extraction.

Web Scraping to Google Sheets

Google Sheets is an easy way to store scraped data. In this tutorial we'll take a look at how to use this free online database for storing scraped data!