# How to Scrape Google Flights Data in 2026

 by [Ziad Shamndy](https://scrapfly.io/blog/author/ziad) Jun 30, 2026 22 min read [\#python](https://scrapfly.io/blog/tag/python) [\#scrapeguide](https://scrapfly.io/blog/tag/scrapeguide) 


Google killed its public Flights API in 2018, and fares now move minute to minute on dynamic-pricing buckets. Scraping the consumer site is the only programmatic route, and Google's anti-bot stack drops vanilla headless browsers on the first request.

This guide builds a working Google Flights scraper with the Scrapfly Python SDK, covering URL discovery through the `tfs` parameter, search extraction, and anti-bot handling.

## Key Takeaways

A short orientation before the code starts. Google Flights does not expose an API, the search URL hides every trip detail inside a Base64 Protobuf string, and the maintenance cost lives in selector churn rather than anti-bot work once the right SDK is in place.

- Google Flights has no public API. Scraping the consumer interface at `google.com/travel/flights` is the only programmatic route.
- Every search URL encodes trip type, origin, destination, dates, and passengers into a single `tfs` Base64 Protobuf parameter.
- A single result card surfaces 15 structured fields: airline, flight number, times, airports, stops, duration, price, currency, cabin class, plane model, layovers, legroom, and in-flight extensions.
- The Scrapfly SDK handles JavaScript rendering, residential proxies, fingerprint spoofing, and Google's anti-bot checks in one `async_scrape()` call.
- Selector churn is the real maintenance tax. Isolate selectors in a config layer so updates take seconds, not hours.


**Google Flights Scraper**: [github.com/scrapfly/scrapfly-scrapers/tree/main/google-flights-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/google-flights-scraper)

## Why Scrape Google Flights?

Fares change minute to minute on dynamic-pricing buckets, and no public API exposes those movements. Scraping Google Flights is what lets a price tracker, a market intelligence pipeline, or a travel app see the same fares a real user sees when they hit the search button.

A few concrete reasons teams build Google Flights scrapers in 2026:

- **Personal price tracking** that watches a specific route for a fare drop, then fires an alert through Discord or email when it crosses a threshold.
- **Travel agency competitive intelligence** that benchmarks the agency's negotiated fares against the public Google Flights price for the same route on the same day.
- **Corporate travel optimization** that pulls fares for a known set of routes on a daily cadence to enforce booking-window policies.
- **Fare comparison apps** that surface the cheapest fare across origins, dates, or cabin classes for a given destination.
- **Academic and market research** on airline pricing volatility, seat-load behavior, and CO2 emissions per route.

Manual checking falls apart almost immediately. A single round-trip search returns dozens of options, and any of them can move by 30 percent between two refreshes when an inventory bucket resets. A scraper keeps a stable record of that movement, which is what makes the downstream analysis possible.

Before any of this works, it helps to know what the page actually exposes. The next section covers the data fields that a Google Flights search returns for a typical query.



## What Data Can You Scrape from Google Flights?

A standard Google Flights search result exposes 12 to 15 structured fields per flight. Availability shifts a little by route, ticket type, and whether the card is expanded, but the core set below is consistent enough to model in a Pydantic schema or a single sqlite row.

| Field | Example | Notes |
|---|---|---|
| `airline` | `"Air France"` | Parsed from the card's `aria-label` |
| `flight_number` | `"AF 023"` | Extracted from the TravelImpact data attribute |
| `departure_airport` | `"JFK"` | IATA code from the expanded leg panel |
| `arrival_airport` | `"CDG"` | IATA code from the expanded leg panel |
| `departure_time` | `"19:30"` | Local time at origin |
| `arrival_time` | `"08:55+1"` | `+1` suffix marks next-day arrival |
| `duration` | `"Total duration 7 hr 25 min"` | Full label from the card's `aria-label` |
| `stops` | `0` | `0` means nonstop, parsed from `aria-label` |
| `layovers` | `[{airport, duration}]` | One entry per stopover, with IATA code |
| `price` | `"612"` | String, parsed from the card's `aria-label` |
| `currency` | `"USD"` | Set by the `currency` param in `build_search_url` |
| `cabin_class` | `"economy"` | From the expanded leg panel |
| `plane_model` | `"Boeing 777"` | From the expanded leg panel |
| `extensions` | `["wifi", "power outlets"]` | Amenity flags from the leg and card |
| `legroom` | `"31 in"` | Parsed from the leg's extensions list |

One-way searches leave `return_date` as `None` (`null` in the JSON output). Low-cost carriers often skip the plane model, and `flight_number` falls back to `None` on grouped multi-leg cards. All optional fields in `FlightResult` default to `None` when absent.

A normalized JSON record matching the `FlightSearch` shape the scraper returns:

```json
{
  "search_date": "2026-08-01",
  "route": "JFK-CDG",
  "departure_date": "2026-09-15",
  "return_date": "2026-09-22",
  "flights": [
    {
      "airline": "Air France",
      "flight_number": "AF 023",
      "departure_time": "19:30",
      "departure_airport": "JFK",
      "arrival_time": "08:55+1",
      "arrival_airport": "CDG",
      "duration": "Total duration 7 hr 25 min",
      "stops": 0,
      "layovers": [],
      "price": "612",
      "currency": "USD",
      "cabin_class": "economy",
      "plane_model": "Boeing 777",
      "extensions": ["wifi", "power outlets", "in-flight media"],
      "legroom": "31 in"
    }
  ]
}
```



That structure is enough to power fare tracking, emissions analysis, and competitive pricing without pulling any other data source. The legwork is in getting the page to render correctly and parsing the right cards out of it. The next section covers project setup before we start writing extraction code.



## Project Setup

The full stack is Python 3.10 or newer, the Scrapfly Python SDK, `parsel` for HTML parsing, and `loguru` for logging. No Playwright install, no proxy pool, no fingerprint patches.

Install the packages with `pip`:

```shell
$ pip install scrapfly-sdk parsel loguru
```



The Scrapfly SDK handles JavaScript rendering, residential proxies, and bot-detection bypass on Scrapfly's side. The `parsel` library gives CSS and XPath parsing for the rendered HTML response. `loguru` is a lightweight logger the scraper uses to report parsed results.

Sign up for a free Scrapfly account at [scrapfly.io/register](https://scrapfly.io/register) to get an API key, then export it once in the shell so the code below can read it from the environment:

```shell
$ export SCRAPFLY_KEY="scp-live-..."
```



The full import block and Scrapfly client setup for the scraper:

```python
import os
import re
from datetime import datetime
from pathlib import Path
from typing import List, Optional, TypedDict
from urllib.parse import quote_plus

from loguru import logger as log
from scrapfly import ScrapeApiResponse, ScrapeConfig, ScrapflyClient

SCRAPFLY = ScrapflyClient(key=os.environ["SCRAPFLY_KEY"])

BASE_CONFIG = {
    "asp": True,
    "country": "US",
    "render_js": True,
}

output = Path(__file__).parent / "results"
output.mkdir(exist_ok=True)
```



`BASE_CONFIG` holds the three flags every Google Flights request needs. `asp=True` activates Anti-Scraping Protection. `render_js=True` runs the page JavaScript on Scrapfly's side. `country="US"` anchors the residential proxy to a US exit so fare currency is consistent.

For a deeper look at how Python web scraping projects are normally laid out, the introduction guide below covers structure, sessions, and parsing fundamentals that this article does not repeat.

[Web Scraping with Python](https://scrapfly.io/blog/posts/web-scraping-with-python): Introduction tutorial to web scraping with Python. How to collect and parse public data. Challenges, best practices and an example project.

With the SDK installed and the API key in place, the next step is understanding how Google Flights actually encodes a search into a URL.



## How to Find Google Flights URLs

Google Flights packs every trip parameter into a single URL field called `tfs`, a Base64-encoded Protobuf message that holds the trip type, the origin and destination airports, the dates, the passenger counts, and the cabin class. Build the right `tfs` value and the search results page renders without ever touching the search form.

A real Google Flights search URL has this shape:

```text
https://www.google.com/travel/flights/search?tfs=<base64-protobuf>&hl=en&curr=USD
```



The three parameters that matter for a scraper:

- **`tfs`** is the encoded trip definition. Origin airport, destination airport, departure date, return date, trip type, passengers, and cabin class all live inside this single field.
- **`hl`** sets the interface language. `en` for English, `fr` for French, `de` for German, and so on.
- **`curr`** sets the displayed currency. `USD`, `EUR`, `GBP`, and any other ISO 4217 code Google supports.

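The Protobuf message inside `tfs` follows a schema that Google does not document and that other tutorials use without explaining. The field numbers below come from reverse engineering, so verify them against a decoded URL before relying on them. The key fields are: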

- Field 2 holds the flight legs. Each leg has its own origin airport code, destination airport code, and departure date.
- Field 9 holds the trip type. 1 is one-way, 2 is round-trip, 3 is multi-city.
- Field 8 holds the passenger count, with separate sub-fields for adults, children, and infants.
- Field 5 holds the cabin class. 1 is economy, 2 is premium economy, 3 is business, 4 is first.

Building the Protobuf message by hand is possible, but it is also avoidable in most cases. The pragmatic approach for most scrapers is to use the simpler query URL that Google Flights also accepts, which lets the search page parse the trip definition from a plain-text query string:

```python
def build_search_url(
    origin: str,
    destination: str,
    depart: str,
    ret: Optional[str] = None,
    currency: str = "USD",
) -> str:
    if ret:
        query = f"Flights from {origin} to {destination} on {depart} through {ret}"
    else:
        query = f"one way flights from {origin} to {destination} on {depart}"
    return f"https://www.google.com/travel/flights?q={quote_plus(query)}&hl=en&curr={currency}"

if __name__ == "__main__":
    url = build_search_url("JFK", "CDG", "2026-09-15", "2026-09-22", currency="USD")
    print(url)
```



Google parses the natural-language query, constructs the `tfs` value internally, and redirects to the canonical results page. For multi-city queries or URL-level filters, building the `tfs` Protobuf with a helper library is the fallback.

The `tfs` parameter is essentially Google Flights' hidden API. Once you can produce a valid URL, every search becomes a single HTTP request. The next section covers how to send that request through Scrapfly and parse the results.



## How to Scrape Google Flights Search

The full search pattern is three steps. Build the URL, send it through the Scrapfly SDK with JavaScript rendering and anti-scraping protection on, then parse the rendered HTML into structured records.

### Configuring the Scrapfly Request

Google Flights is a JavaScript-rendered SPA. Getting full data, including cabin class, legroom, and layovers, requires clicking "View more flights" and expanding each flight's detail panel before parsing.

The scraper uses a `js_scenario`, a sequence of browser actions Scrapfly executes before returning the HTML:

```python
_SHOW_MORE_SCENARIO = [
    {
        "click": {
            "selector": "button[jsname='b3VHJd']",
            "ignore_if_not_visible": True,
            "ignore": True,
        }
    },
    {"wait": 1000},
    {
        "wait_for_selector": {
            "selector": 'button[aria-label="View more flights"]',
            "timeout": 15000,
        }
    },
    {
        "click": {
            "selector": 'button[aria-label="View more flights"]',
            "ignore_if_not_visible": True,
            "ignore": True,
        }
    },
    {"wait": 3000},
    {
        "execute": {
            "script": "document.querySelectorAll('button[aria-label^=\"Flight details\"]').forEach(b => b.click())",
            "timeout": 15000,
        }
    },
    {"wait": 2000}
]
```



The scenario dismisses the consent banner if present, loads additional results, then bulk-clicks every "Flight details" button so the expanded leg panels are in the DOM when parsing begins.

The main scrape function passes `BASE_CONFIG` and the scenario together:

```python
async def scrape_flights(
    origin: str,
    destination: str,
    depart: str,
    ret: Optional[str] = None,
    currency: str = "USD",
) -> FlightSearch:
    """Scrape Google Flights results for a route and date(s)."""
    url = build_search_url(origin, destination, depart, ret, currency)
    response = await SCRAPFLY.async_scrape(
        ScrapeConfig(url, **BASE_CONFIG, js_scenario=_SHOW_MORE_SCENARIO)
    )
    year = int(depart.split("-")[0]) if depart else 0
    return FlightSearch(
        search_date=datetime.now().strftime("%Y-%m-%d"),
        route=f"{origin}-{destination}",
        departure_date=depart,
        return_date=ret,
        flights=parse_flights(response, year=year, currency=currency),
    )
```



`response.selector` is a live `parsel.Selector` on the rendered HTML. The year extracted from the departure date lets the leg parser compute overnight-arrival offsets, and the parsed flights are wrapped in a `FlightSearch` record alongside the search date, route, and dates. The parsing internals come next.

### Extracting Flight Data from Search Results

Each `li.pIav2d` card has a summary row with price, airline, and times. The expanded `div[jsname='XxAJue']` panel holds per-leg data for aircraft, cabin class, legroom, and layovers. Parsing is a per-leg pass followed by a per-card assembly pass.

The data shapes use `TypedDict` so the output is typed without adding a Pydantic dependency:

```python
class AirportInfo(TypedDict):
    name: Optional[str]
    id: Optional[str]
    time: Optional[str]


class FlightLeg(TypedDict):
    departure_airport: Optional[AirportInfo]
    arrival_airport: Optional[AirportInfo]
    duration: Optional[str]
    airplane: Optional[str]
    airline: Optional[str]
    airline_logo: Optional[str]
    travel_class: Optional[str]
    flight_number: Optional[str]
    legroom: Optional[str]
    extensions: List[str]


class Layover(TypedDict):
    airport: str
    duration: Optional[str]


class FlightResult(TypedDict):
    airline: Optional[str]
    flight_number: Optional[str]
    departure_time: Optional[str]
    departure_airport: Optional[str]
    arrival_time: Optional[str]
    arrival_airport: Optional[str]
    duration: Optional[str]
    stops: int
    layovers: List[Layover]
    price: Optional[str]  # digits as matched in the aria-label, e.g. "612"
    currency: str
    cabin_class: Optional[str]
    plane_model: Optional[str]
    extensions: List[str]
    legroom: Optional[str]


class FlightSearch(TypedDict):
    search_date: str
    route: str
    departure_date: str
    return_date: Optional[str]
    flights: List[FlightResult]
```



Helper functions for regex extraction, stop counting, overnight-arrival suffixes, layover parsing, and amenity lists:

```python
def _find(pattern: str, text: str) -> Optional[str]:
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(1).strip() if m else None


def parse_stops(text: Optional[str]) -> int:
    if not text or "nonstop" in text.lower():
        return 0
    match = re.search(r"(\d+)", text)
    return int(match.group(1)) if match else 0


def _arrival_with_suffix(dep_full: Optional[str], arr_full: Optional[str]) -> Optional[str]:
    if dep_full and arr_full:
        try:
            dep_dt = datetime.strptime(dep_full, "%Y-%m-%d %H:%M")
            arr_dt = datetime.strptime(arr_full, "%Y-%m-%d %H:%M")
            diff = (arr_dt.date() - dep_dt.date()).days
            return arr_dt.strftime("%H:%M") + (f"+{diff}" if diff > 0 else "")
        except ValueError:
            pass
    return arr_full


def _parse_layovers(panel) -> List[Layover]:
    layovers: List[Layover] = []
    for el in panel.css("div.tvtJdb"):
        text = " ".join(el.css("::text").getall())
        code = re.search(r"\(([A-Z]{3})\)", text)
        if code:
            layovers.append(Layover(airport=code.group(1), duration=text))
    return layovers


def _extract_extensions(container) -> List[str]:
    items: List[str] = []
    for li in container.css("li.WtSsrd"):
        text = (
            li.css("span.gI4d6d::text").get()
            or li.css("span.g6UICf::text").get()
            or li.xpath("text()[normalize-space()]").get("").strip()
        )
        if text:
            items.append(text)
    return items
```



The leg parser reads each `div[jsname='lVbzR']` segment from the expanded detail panel:

```python
def parse_flight_legs(panel, year: int) -> List[FlightLeg]:
    legs: List[FlightLeg] = []
    for leg in panel.css("div[jsname='lVbzR']"):
        logo_style = leg.css("[style*='airline_logos/70px/']::attr(style)").get() or ""
        logo = re.search(r"url\((https://[^)]+\.png)\)", logo_style)

        plain = [
            s.css("::text").get()
            for s in leg.css("div.MX5RWe span.Xsgmwe")
            if "sI2Nye" not in s.attrib.get("class", "")
            and s.attrib.get("jsname", "") != "Pvlywd"
            and "QS0io" not in s.attrib.get("class", "")
        ]
        airline = plain[0] if plain else None
        airplane = plain[-1] if len(plain) > 1 else (plain[0] if plain else None)

        fn_raw = leg.css("div.MX5RWe span.Xsgmwe.sI2Nye::text").get()
        flight_number = fn_raw.replace("\xa0", " ") if fn_raw else None

        dep_name = (leg.css("div.ZHa2lc::text").get() or "").strip()
        dep_code = (leg.css("div.ZHa2lc span[dir='ltr']::text").get() or "").strip("()")
        arr_name = (leg.css("div.FY5t7d::text").get() or "").strip()
        arr_code = (leg.css("div.FY5t7d span[dir='ltr']::text").get() or "").strip("()")

        extensions = _extract_extensions(leg)
        legroom = next(
            (
                m.group(1)
                for ext in extensions
                if (m := re.search(r"legroom\s*\((\d+ in)\)", ext, re.IGNORECASE))
            ),
            None,
        )

        legs.append(
            FlightLeg(
                departure_airport=AirportInfo(name=dep_name or None, id=dep_code or None, time=leg.css("div.ZHa2lc::text").get()),
                arrival_airport=AirportInfo(name=arr_name or None, id=arr_code or None, time=leg.css("div.FY5t7d::text").get()),
                duration=leg.css("div.P102Lb::text").get(),
                airplane=airplane,
                airline=airline,
                airline_logo=logo.group(1) if logo else None,
                travel_class=leg.css("span[jsname='Pvlywd']::text").get(),
                flight_number=flight_number,
                legroom=legroom,
                extensions=extensions,
            )
        )
    return legs
```



The top-level parser iterates the `li.pIav2d` cards, deduplicates by `aria-label`, and assembles each card's legs into a `FlightResult`:

```python
def _flight_number_from_card(card) -> Optional[str]:
    url = (
        card.css("[data-travelimpactmodelwebsiteurl]::attr(data-travelimpactmodelwebsiteurl)").get()
        or ""
    )
    m = re.search(r"[A-Z]+-[A-Z]+-([A-Z]\w+)-(\d+)-\d{8}", url)
    return f"{m.group(1)} {m.group(2)}" if m else None


def _card_extensions(card, panel) -> List[str]:
    seen, out = set(), []
    for text in _extract_extensions(panel) + [
        t.strip() for t in card.css("div.U0scI div::text").getall() if t.strip()
    ]:
        if text not in seen:
            seen.add(text)
            out.append(text)
    return out


def parse_flights(
    response: ScrapeApiResponse, year: int = 0, currency: str = "USD"
) -> List[FlightResult]:
    flights: List[FlightResult] = []
    seen_labels: set = set()

    for card in response.selector.css("li.pIav2d"):
        label = card.css("div[role='link']::attr(aria-label)").get() or ""
        if not label or label in seen_labels:
            continue
        seen_labels.add(label)

        panel = card.css("div[jsname='XxAJue']")
        legs = parse_flight_legs(panel, year)
        first_leg = legs[0] if legs else None
        last_leg = legs[-1] if legs else None

        dep_full = (first_leg["departure_airport"] or {}).get("time") if first_leg else None
        arr_full = (last_leg["arrival_airport"] or {}).get("time") if last_leg else None
        dep_time = dep_full or card.css("span[aria-label^='Departure time']::text").get()
        arr_time = _arrival_with_suffix(dep_full, arr_full) or card.css("span[aria-label^='Arrival time']::text").get()

        price = _find(r"From (\d[\d,]*) \w+ dollars", label)  # matched digits as a string, None if absent
        duration_label = card.css("div[aria-label^='Total duration']::attr(aria-label)").get() or ""

        flights.append(
            FlightResult(
                airline=_find(r"flight with (.+?)\.", label),
                flight_number=_flight_number_from_card(card),
                departure_time=dep_time,
                departure_airport=(first_leg["departure_airport"] or {}).get("id") if first_leg else None,
                arrival_time=arr_time,
                arrival_airport=(last_leg["arrival_airport"] or {}).get("id") if last_leg else None,
                duration=duration_label,
                stops=parse_stops(_find(r"(Nonstop|\d+ stops?)", label)),
                layovers=_parse_layovers(panel),
                price=price,
                currency=currency,
                cabin_class=((first_leg.get("travel_class") or "").lower() or None) if first_leg else None,
                plane_model=first_leg.get("airplane") if first_leg else None,
                extensions=_card_extensions(card, panel),
                legroom=first_leg.get("legroom") if first_leg else None,
            )
        )

    log.success(f"parsed {len(flights)} flights")
    return flights
```



`aria-label` deduplication prevents double-counting during SPA updates. Cabin class and plane model come from the expanded leg panel. `_arrival_with_suffix` converts raw datetime strings into `HH:MM` or `HH:MM+1`.
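
To see what the `aria-label` patterns in `parse_flights` pull out, it helps to run them on sample text. The sketch below reuses the same three regular expressions. The label strings are hypothetical, written to match those patterns, and real labels vary by locale and currency. The `summarize` helper and the `price_int` field are illustrative additions, not part of the scraper.

```python
import re
from typing import Optional

# Hypothetical aria-label text, written to match the patterns in parse_flights.
LABELS = [
    "From 612 US dollars round trip total. Nonstop flight with Air France. "
    "Leaves John F. Kennedy International Airport at 7:30 PM.",
    "From 1,045 US dollars round trip total. 1 stop flight with Delta and KLM. "
    "Leaves John F. Kennedy International Airport at 9:05 PM.",
]


def find(pattern: str, text: str) -> Optional[str]:
    """Return the first capture group, or None when the pattern does not match."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(1).strip() if m else None


def summarize(label: str) -> dict:
    """Apply the price, stops, and airline patterns to one label."""
    price = find(r"From (\d[\d,]*) \w+ dollars", label)
    stops = find(r"(Nonstop|\d+ stops?)", label)
    return {
        "airline": find(r"flight with (.+?)\.", label),
        "price": price,
        # Optional normalization for arithmetic on the matched string.
        "price_int": int(price.replace(",", "")) if price else None,
        "stops": 0 if not stops or stops.lower() == "nonstop" else int(stops.split()[0]),
    }


for label in LABELS:
    print(summarize(label))
```

The price pattern only matches labels that spell out "dollars", so a search in another currency would need its own pattern.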

Save the scraper above as `google_flights.py`. A separate `run.py` imports it, scrapes both a round-trip and a one-way search, and saves each result to a JSON file in `./results/`:

```python
import asyncio
import json
from datetime import datetime, timedelta
from pathlib import Path

import google_flights

output = Path(__file__).parent / "results"
output.mkdir(exist_ok=True)

TODAY = datetime.now().strftime('%Y-%m-%d')
WEEK_FROM_NOW = (datetime.now() + timedelta(days=7)).strftime('%Y-%m-%d')


async def run():
    google_flights.BASE_CONFIG["cache"] = False
    google_flights.BASE_CONFIG["debug"] = False

    print("running Google Flights scrape and saving results to ./results directory")

    roundtrip = await google_flights.scrape_flights(
        origin="JFK",
        destination="CDG",
        depart=TODAY,
        ret=WEEK_FROM_NOW,
        currency="USD",
    )
    with open(output / "roundtrip.json", "w", encoding="utf-8") as f:
        json.dump(roundtrip, f, indent=2, ensure_ascii=False)

    oneway = await google_flights.scrape_flights(
        origin="JFK",
        destination="LHR",
        depart=TODAY,
        currency="USD",
    )
    with open(output / "oneway.json", "w", encoding="utf-8") as f:
        json.dump(oneway, f, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    asyncio.run(run())
```



`run.py` toggles off `cache` and `debug`, then saves each search to `./results/`. An illustrative `roundtrip.json` (a `FlightSearch` object) for a JFK to CDG route, trimmed to the first two flights (live fares and aircraft vary by query):

```json
{
  "search_date": "2026-08-01",
  "route": "JFK-CDG",
  "departure_date": "2026-09-15",
  "return_date": "2026-09-22",
  "flights": [
    {
      "airline": "Air France",
      "flight_number": "AF 023",
      "departure_time": "19:30",
      "departure_airport": "JFK",
      "arrival_time": "08:55+1",
      "arrival_airport": "CDG",
      "duration": "Total duration 7 hr 25 min",
      "stops": 0,
      "layovers": [],
      "price": "612",
      "currency": "USD",
      "cabin_class": "economy",
      "plane_model": "Boeing 777",
      "extensions": ["wifi", "power outlets", "in-flight media"],
      "legroom": "31 in"
    },
    {
      "airline": "Delta",
      "flight_number": "DL 404",
      "departure_time": "21:05",
      "departure_airport": "JFK",
      "arrival_time": "10:30+1",
      "arrival_airport": "CDG",
      "duration": "Total duration 7 hr 25 min",
      "stops": 0,
      "layovers": [],
      "price": "638",
      "currency": "USD",
      "cabin_class": "economy",
      "plane_model": "Airbus A330",
      "extensions": ["power outlets"],
      "legroom": "32 in"
    }
  ]
}
```
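
For price tracking, records of this shape can be flattened into SQLite with one row per flight. The sketch below uses only the standard-library `sqlite3` module. The `fares` table layout and the `save_search` helper are hypothetical choices for illustration, and the demo runs against an in-memory database with a trimmed record so it works without scraping.

```python
import sqlite3
from typing import Any, Dict

SCHEMA = """
CREATE TABLE IF NOT EXISTS fares (
    search_date TEXT, route TEXT, departure_date TEXT, return_date TEXT,
    airline TEXT, flight_number TEXT, departure_time TEXT, arrival_time TEXT,
    stops INTEGER, price TEXT, currency TEXT
)
"""


def save_search(conn: sqlite3.Connection, search: Dict[str, Any]) -> int:
    """Insert one row per flight in a FlightSearch-shaped dict; return the row count."""
    conn.execute(SCHEMA)
    rows = [
        (
            search["search_date"], search["route"],
            search["departure_date"], search.get("return_date"),
            f.get("airline"), f.get("flight_number"),
            f.get("departure_time"), f.get("arrival_time"),
            f.get("stops", 0), f.get("price"), f.get("currency"),
        )
        for f in search["flights"]
    ]
    conn.executemany("INSERT INTO fares VALUES (?,?,?,?,?,?,?,?,?,?,?)", rows)
    conn.commit()
    return len(rows)


# In-memory database and a trimmed record, so the sketch runs without scraping.
conn = sqlite3.connect(":memory:")
saved = save_search(conn, {
    "search_date": "2026-08-01", "route": "JFK-CDG",
    "departure_date": "2026-09-15", "return_date": "2026-09-22",
    "flights": [{"airline": "Air France", "flight_number": "AF 023",
                 "departure_time": "19:30", "arrival_time": "08:55+1",
                 "stops": 0, "price": "612", "currency": "USD"}],
})
print(saved, conn.execute("SELECT airline, price FROM fares").fetchall())
```

Swapping `":memory:"` for a file path keeps the history between runs. Nested fields such as `layovers` and `extensions` are left out here and would need a JSON column or a second table.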



Google rotates CSS classes roughly once a month. Keep selectors in a config dictionary so a class change means updating one file, not refactoring the whole parser.
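
A minimal sketch of such a config dictionary follows. The selector values are the ones used in this guide, while the logical key names and the `sel` lookup helper are hypothetical and can be organized however suits the project.

```python
# selectors.py: one place to update when Google rotates class names.
# The values are the selectors used in this guide; they will go stale over time.
SELECTORS = {
    "card": "li.pIav2d",
    "card_label": "div[role='link']::attr(aria-label)",
    "detail_panel": "div[jsname='XxAJue']",
    "leg": "div[jsname='lVbzR']",
    "layover": "div.tvtJdb",
    "extension_item": "li.WtSsrd",
    "leg_duration": "div.P102Lb::text",
    "travel_class": "span[jsname='Pvlywd']::text",
}


def sel(name: str) -> str:
    """Look up a selector by logical name, failing loudly on typos."""
    try:
        return SELECTORS[name]
    except KeyError:
        raise KeyError(f"unknown selector name: {name!r}") from None


# Parser code would then call, for example:
#   for card in response.selector.css(sel("card")): ...
print(sel("card"))
```

With this layout the parser functions never contain a literal class name, so a rotation touches only the dictionary values.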

[Find Web Elements with ChatGPT and XPath or CSS selectors](https://scrapfly.io/blog/posts/finding-web-selectors-with-chatgpt): ChatGPT is becoming a popular assistant in web scraper development. This article looks at how to use it to generate XPath and CSS selectors from HTML.

With a working extractor in place, the next concern is keeping the scraper running. Google's anti-bot stack is what blocks vanilla scrapers long before selector churn becomes the problem.



## Bypass Google Flights Blocking with Scrapfly

Google checks three signals on every request: IP address, browser fingerprint, and behavior pattern. Vanilla scrapers fail all three. Scrapfly fixes all three with `asp=True`.



Scrapfly's [Web Scraping API](https://scrapfly.io/web-scraping-api) is a single HTTP endpoint for collecting web data at scale, with a **99.99% success rate** across **130M+ proxies in 190+ countries**.

- [Anti-Scraping Protection bypass](https://scrapfly.io/docs/scrape-api/anti-scraping-protection) - automatically defeats Cloudflare, DataDome, PerimeterX, Akamai, and 90+ other bot systems.
- [Smart proxy rotation](https://scrapfly.io/docs/scrape-api/proxy) - residential and datacenter pools with country and ASN level geo-targeting.
- [JavaScript rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering) - render SPAs and dynamic pages through real cloud browsers.
- [Browser automation scenarios](https://scrapfly.io/docs/scrape-api/javascript-scenario) - scroll, click, fill forms, and wait for elements without managing a browser fleet.
- [Format conversion](https://scrapfly.io/docs/scrape-api/getting-started#api_param_format) - return pages as HTML, JSON, clean text, or LLM-ready Markdown.
- [Session management](https://scrapfly.io/docs/scrape-api/session) - keep cookies, headers, and IPs consistent across multi-step flows.
- [Smart caching](https://scrapfly.io/docs/scrape-api/getting-started#api_param_cache) - cache successful responses to cut cost on repeat scraping jobs.
- [Python](https://scrapfly.io/docs/sdk/python), [TypeScript](https://scrapfly.io/docs/sdk/typescript), [Scrapy](https://scrapfly.io/docs/sdk/scrapy), and [no-code integrations](https://scrapfly.io/docs/integration/getting-started) including Make, n8n, Zapier, LangChain, and LlamaIndex.

The `BASE_CONFIG` combined with the JS scenario handles every one of those in a single call:

```python
response = await SCRAPFLY.async_scrape(
    ScrapeConfig(url, **BASE_CONFIG, js_scenario=_SHOW_MORE_SCENARIO)
)
```



Each flag in `BASE_CONFIG` maps to one of the detection layers above:

- **`asp=True`** is the umbrella flag. It enables residential proxy rotation, fingerprint spoofing, TLS normalization, and automatic retries. One flag, months of scraper uptime.
- **`country="US"`** anchors the proxy to a US residential exit. Change the code to target a different fare display: `"gb"` for UK pricing, `"de"` for euro pricing on a German residential IP, and so on.
- **`render_js=True`** spins up a managed browser on Scrapfly's side and runs Google's JavaScript with a fingerprint that does not trigger detection.
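
Switching markets is easier to keep tidy with a per-market copy of the config rather than edits to the shared dictionary. The sketch below assumes the `BASE_CONFIG` values from the setup section. The `MARKETS` table and `config_for_market` helper are hypothetical, and the country codes are the ones quoted in the list above.

```python
from typing import Any, Dict

BASE_CONFIG: Dict[str, Any] = {"asp": True, "country": "US", "render_js": True}

# Hypothetical market table: proxy country paired with the currency to request.
MARKETS = {
    "us": {"country": "US", "currency": "USD"},
    "uk": {"country": "gb", "currency": "GBP"},
    "de": {"country": "de", "currency": "EUR"},
}


def config_for_market(market: str) -> Dict[str, Any]:
    """Return a copy of BASE_CONFIG with the proxy country swapped for one market."""
    return {**BASE_CONFIG, "country": MARKETS[market]["country"]}


uk_config = config_for_market("uk")
print(uk_config, MARKETS["uk"]["currency"])
```

The returned dictionary can be unpacked into `ScrapeConfig` the same way `BASE_CONFIG` is, with the paired currency passed to `build_search_url`. The price pattern in `parse_flights` matches the word "dollars", so non-dollar markets also need a matching price pattern.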

If a particular route or geography starts returning empty pages even with the flags above, the Scrapfly anti-bot guide covers the deeper mechanics of how the detection layers stack and which secondary flags help in specific blocking patterns.

[How to Bypass Anti-Bot Protection When Web Scraping](https://scrapfly.io/blog/posts/how-to-bypass-anti-bot-protection-when-web-scraping): Learn how anti-bot systems detect scrapers and 5 universal bypass techniques including proxy rotation, fingerprinting, and fortified headless browsers.

Selector churn is separate from anti-bot. Google rotates CSS classes regularly, so isolate selectors in a config layer. With both handled, the scraper is ready to run on a schedule.

The remaining questions tend to be specific and tactical, which is what the FAQ covers next.



## FAQ

**Is There an Official Google Flights API?**

No. Google discontinued the public Flights API, called QPX Express, in April 2018. Since then, scraping the public Google Flights interface has been the standard programmatic route to fare and itinerary data, and a handful of third-party fare APIs have filled parts of the gap at a cost.







**How Often Do Google Flights Selectors Change?**

Roughly once a month during active periods, sometimes more often near peak travel seasons. Plan for selector churn from the start by keeping every CSS query in a small `selectors.py` config dictionary so updates take seconds, not hours of refactoring.







**Can I Use the Scrapfly SDK Without Playwright?**

Yes. Setting `render_js=True` on the `ScrapeConfig` tells Scrapfly to render JavaScript on its side and return fully loaded HTML. The local Python script never installs Playwright, never launches a browser, and never holds a Chromium process in memory.







**Why Do My Searches Return Empty Result Pages?**

Two patterns explain almost every empty page. Missing `asp=True` makes Google silently serve an empty shell to a request that looks like automation. An incomplete `js_scenario` that doesn't expand the detail panels leaves cards unrendered. The full scenario in this guide resolves both.







**Can I Scrape Google Flights Without Render JS?**

Not reliably. The result cards are injected by client-side JavaScript after the initial HTML loads, and the raw response from a plain HTTP request contains no flight data. JavaScript rendering is non-negotiable on this target.









## Summary

This guide built a working Google Flights scraper with the Scrapfly Python SDK. The four core steps are URL construction via `build_search_url`, a JS scenario that expands flight detail panels, a two-layer parser that extracts per-leg data, and `TypedDict` models that hold all 15 output fields.

Selector churn is the main maintenance cost. Keep CSS selectors in a config dictionary so a Google class rotation is a one-file fix.

For a production pipeline, plug `scrape_flights` into a scheduler and a sqlite store to get a price tracker that runs continuously.
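
The alerting half of such a tracker can be a small pure function over the `FlightSearch` output. The sketch below is one possible shape: `cheapest_fare` and `should_alert` are hypothetical helpers, and the demo record is made up. Prices are treated as strings that may contain thousands separators, and entries with no parsed price are skipped.

```python
from typing import Any, Dict, Optional


def cheapest_fare(search: Dict[str, Any]) -> Optional[int]:
    """Lowest price in a FlightSearch-shaped dict, or None if no price parsed."""
    prices = []
    for flight in search.get("flights", []):
        raw = str(flight.get("price") or "").replace(",", "")
        if raw.isdigit():
            prices.append(int(raw))
    return min(prices) if prices else None


def should_alert(search: Dict[str, Any], threshold: int) -> bool:
    """True when the cheapest parsed fare is at or below the threshold."""
    fare = cheapest_fare(search)
    return fare is not None and fare <= threshold


# Made-up record: two priced flights and one card where no price was parsed.
demo = {"flights": [{"price": "612"}, {"price": "1,038"}, {"price": None}]}
print(cheapest_fare(demo), should_alert(demo, 600), should_alert(demo, 620))
```

A scheduler would call `scrape_flights` on its chosen cadence, pass the result to `should_alert`, and send the notification through whatever channel the project uses.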

The Scrapfly SDK handles JavaScript rendering, residential proxies, and fingerprint spoofing in one `async_scrape()` call. The [free tier](https://scrapfly.io/register) is enough to run the full pattern end to end.



**Legal Disclaimer and Precautions**

This tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect:

- Do not scrape at rates that could damage the website.
- Do not scrape data that's not available publicly.
- Do not store PII of EU citizens protected by GDPR.
- Do not repurpose *entire* public datasets which can be illegal in some countries.

Scrapfly does not offer legal advice but these are good general rules to follow. For more you should consult a lawyer.

 
