
How to scrape HTML table to Excel Spreadsheet (.xlsx)?

by Bernardas Alisauskas Aug 03, 2023 1 min read

To save an HTML table to an Excel spreadsheet, we can use Python with BeautifulSoup (bs4) for HTML parsing and xlsxwriter for writing the file, plus an HTTP client like requests.

shell
$ pip install bs4 xlsxwriter requests

Then, we can scrape the web page, find the table data using bs4, and write it to an .xlsx file using `xlsxwriter`:

python
from bs4 import BeautifulSoup
import requests 
import xlsxwriter

# 1. Retrieve HTML and create BeautifulSoup object
response = requests.get("https://www.w3schools.com/html/html_tables.asp")
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(response.text, "html.parser")
# 2. Find the table and extract headers and rows:
table = soup.find('table', {"id": "customers"})
header = []
rows = []
for i, row in enumerate(table.find_all('tr')):
    if i == 0:
        header = [el.text.strip() for el in row.find_all('th')]
    else:
        rows.append([el.text.strip() for el in row.find_all('td')])
# 3. Save it to an XLSX file:
workbook = xlsxwriter.Workbook('output.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row(0, 0, header)
for i, row in enumerate(rows):
    worksheet.write_row(i+1, 0, row)
workbook.close()

BeautifulSoup is a powerful HTML parser that gives us full control over how HTML tables are parsed. Unlike automated table-extraction scripts, we can direct it to extract values from any table structure.
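
As a rough illustration of that flexibility, here is a minimal sketch that handles a table shaped differently from the one in the script above: it has no `id`, and each data row starts with a `<th>` label cell. The `SAMPLE_HTML` markup and the `extract_table` helper are hypothetical names made up for this example, not part of bs4.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: no id on the table, and row labels are <th> cells.
SAMPLE_HTML = """
<table class="prices">
  <thead>
    <tr><th>Product</th><th>Price</th><th>Stock</th></tr>
  </thead>
  <tbody>
    <tr><th>Apple</th><td> 1.20 </td><td>yes</td></tr>
    <tr><th>Pear</th><td>0.95</td><td>no</td></tr>
  </tbody>
</table>
"""


def extract_table(html, selector="table"):
    """Return (header, rows) for the first table matching a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one(selector)
    if table is None:
        raise ValueError(f"no table matches {selector!r}")
    header, rows = [], []
    for tr in table.find_all("tr"):
        # Collect <th> and <td> together so row labels are not lost.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if not cells:
            continue
        # Treat the first row made up only of <th> cells as the header.
        if not header and not tr.find("td"):
            header = cells
        else:
            rows.append(cells)
    return header, rows


header, rows = extract_table(SAMPLE_HTML, "table.prices")
print(header)
print(rows)
```

The returned `header` and `rows` have the same shape as in the script above, so they can be passed to `worksheet.write_row` in the same way.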
