How to scrape HTML table to Excel Spreadsheet (.xlsx)?

by scrapecrow Aug 03, 2023

To save an HTML table to an Excel spreadsheet we can use Python with BeautifulSoup4 and xlsxwriter + HTTP client like requests.

$ pip install bs4 xlsxwriter requests

Then, we can scrape the web page, find table data using bs4 and write it to .xlsx file using `xlsxwriter``:

from bs4 import BeautifulSoup
import requests 
import xlsxwriter

# 1. Retrieve HTML and create BeautifulSoup object
response = requests.get("https://www.w3schools.com/html/html_tables.asp")
soup = BeautifulSoup(response.text)
# 2. Find the table and extract headers and rows:
table = soup.find('table', {"id": "customers"})
header = []
rows = []
for i, row in enumerate(table.find_all('tr')):
    if i == 0:
        header = [el.text.strip() for el in row.find_all('th')]
    else:
        rows.append([el.text.strip() for el in row.find_all('td')])
# 3. save to it a XLSX file:
workbook = xlsxwriter.Workbook('output.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row(0, 0, header)
for i, row in enumerate(rows):
    worksheet.write_row(i+1, 0, row)
workbook.close()

BeautifulSoup is a very powerful HTML parser giving us full control when it comes to parsing HTML tables. Unlike many automated scripts we can direct it to extract HTML table values from any table structure.

Related Articles

Ultimate Guide to JSON Parsing in Python

Learn JSON parsing in Python with this ultimate guide. Explore basic and advanced techniques using json, and tools like ijson and nested-lookup

DATA-PARSING
PYTHON
Ultimate Guide to JSON Parsing in Python

What is Parsing? From Raw Data to Insights

Learn about the fundamentals of parsing data, across formats like JSON, XML, HTML, and PDFs. Learn how to use Python parsers and AI models for efficient data extraction.

DATA-PARSING
PYTHON
AI
What is Parsing? From Raw Data to Insights

Intro to Parsing HTML and XML with Python and lxml

In this tutorial, we'll take a deep dive into lxml, a powerful Python library that allows for parsing HTML and XML effectively. We'll start by explaining what lxml is, how to install it and using lxml for parsing HTML and XML files. Finally, we'll go over a practical web scraping with lxml.

PYTHON
TOOLS
DATA-PARSING
Intro to Parsing HTML and XML with Python and lxml

How to Parse XML

In this article, we'll explain about XML parsing. We'll start by defining XML files, their format and how to navigate them for data extraction.

PYTHON
CSS-SELECTORS
XPATH
DATA-PARSING
How to Parse XML

Web Scraping to Google Sheets

Google sheets is an easy to store scraped data. In this tutorial we'll take a look at how to use this free online database for storing scraped data!

PYTHON
PROJECT
DATA-PARSING
Web Scraping to Google Sheets

Web Scraping Emails using Python

In this tutorial we'll take a look at email scraping. How to crawl pages and extract email addresses using Python and what are some popular challenges.

PYTHON
DATA-PARSING
PROJECT
Web Scraping Emails using Python