How to scrape HTML table to Excel Spreadsheet (.xlsx)?

by scrapecrow Aug 03, 2023

To save an HTML table to an Excel spreadsheet, we can use Python with BeautifulSoup4 for parsing the HTML, xlsxwriter for writing the .xlsx file, and an HTTP client such as requests for retrieving the page.

$ pip install beautifulsoup4 xlsxwriter requests

Then we can scrape the web page, find the table data using bs4, and write it to an .xlsx file using `xlsxwriter`:

from bs4 import BeautifulSoup
import requests
import xlsxwriter

# 1. Retrieve the HTML and create a BeautifulSoup object
response = requests.get("https://www.w3schools.com/html/html_tables.asp")
response.raise_for_status()  # stop early on HTTP errors (4xx/5xx)
# name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(response.text, "html.parser")

# 2. Find the table and extract the header and rows
table = soup.find('table', {"id": "customers"})
header = []
rows = []
for i, row in enumerate(table.find_all('tr')):
    if i == 0:
        # the first row holds the column names in <th> cells
        header = [el.text.strip() for el in row.find_all('th')]
    else:
        rows.append([el.text.strip() for el in row.find_all('td')])

# 3. Save it to an XLSX file
workbook = xlsxwriter.Workbook('output.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row(0, 0, header)
for i, row in enumerate(rows):
    # offset by one row so the data lands below the header
    worksheet.write_row(i + 1, 0, row)
workbook.close()

BeautifulSoup is a powerful HTML parser that gives us full control over how HTML tables are parsed. Unlike many automated table-extraction tools, it lets us direct the extraction ourselves, so we can pull values from any table structure.
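As a rough illustration of that flexibility, here is a minimal sketch for a table whose layout differs from the example above: the header sits in a `<thead>` and each body row starts with a `<th>` label, so reading only `<td>` cells would drop the first column. The sample markup, the `prices` id and the `extract_table` helper are all hypothetical, made up for this example rather than taken from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption for illustration): the header
# lives in <thead> and every body row begins with a <th> row label.
SAMPLE_HTML = """
<table id="prices">
  <thead>
    <tr><th>Product</th><th>Price</th><th>Stock</th></tr>
  </thead>
  <tbody>
    <tr><th>Apple</th><td> 1.20 </td><td>15</td></tr>
    <tr><th>Pear</th><td>0.95</td><td></td></tr>
  </tbody>
</table>
"""


def extract_table(html, table_id):
    """Return (header, rows) for the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"id": table_id})
    if table is None:
        raise ValueError(f"no table with id={table_id!r}")
    header, rows = [], []
    for tr in table.find_all("tr"):
        # Collect <th> and <td> cells together, in document order
        cells = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
        if not cells:
            continue
        # The first row that contains no <td> cells is treated as the header
        if not header and tr.find("td") is None:
            header = cells
        else:
            rows.append(cells)
    return header, rows


header, rows = extract_table(SAMPLE_HTML, "prices")
print(header)
print(rows)
```

The resulting `header` and `rows` lists have the same shape as in the main example, so they can be passed to the same `worksheet.write_row` calls. Empty cells come through as empty strings, which keeps the columns aligned.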
