     [Blog](https://scrapfly.io/blog)   /  [python](https://scrapfly.io/blog/tag/python)   /  [How to Scrape LinkedIn Profiles, Companies, and Jobs in 2026](https://scrapfly.io/blog/posts/how-to-scrape-linkedin)   # How to Scrape LinkedIn Profiles, Companies, and Jobs in 2026

 by [Mazen Ramadan](https://scrapfly.io/blog/author/mazen) Jun 05, 2026 30 min read [\#python](https://scrapfly.io/blog/tag/python) [\#scrapeguide](https://scrapfly.io/blog/tag/scrapeguide) 

 [  ](https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin "Share on LinkedIn")    

 

 

   

Ever wondered why your LinkedIn scraper gets stopped dead in its tracks, even when you’re only accessing public data? You get away with viewing a few profiles then suddenly, You've been hit by a login wall, while your regular browser can still browse LinkedIn fine.

That’s not just bad luck. LinkedIn’s defenses go far beyond simple IP blocks; they combine advanced fingerprinting, behavioral detection, and real-time fraud scoring to permanently target bots and automated scripts.

In this guide, you’ll learn how LinkedIn blocks scrapers and how ScrapFly’s tools bypass those blocks, with step-by-step Python code for scraping profiles, companies, jobs, and searches.

Let’s break through LinkedIn’s walls safely, efficiently, and at scale.

**Want to skip the technical details?**

Clone ScrapFly's production-ready LinkedIn scraper:

bash```bash
git clone https://github.com/scrapfly/scrapfly-scrapers.git
cd scrapfly-scrapers/linkedin-scraper
```



[**View Source Code**github.com/scrapfly/scrapfly-scrapers/tree/main/linkedin-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/linkedin-scraper)



## Why Scrape LinkedIn?

consider what makes LinkedIn data uniquely valuable. open LinkedIn profiles and company data fuel smarter business strategies and give organizations a real edge.

Here’s what you can achieve by accessing professional data effectively:

**Recruitment Intelligence:** Gain insights into job market trends, salary benchmarks, and analyze skill demand.

**Sales Prospecting:** Identify decision-makers, understand company org charts, and improve contact details.

**Market Research:** Discover industry growth patterns, track competitor hiring, and study talent migration.

**Competitive Analysis:** Spot company expansion signals, examine team structures, and monitor strategic moves.

LinkedIn holds professional data that's publicly visible but defensively protected. The challenge isn't accessing data, it's accessing it at scale without detection. Below we break down the specific defenses you’ll hit when you try to scale a scraper.

## Why LinkedIn Scraping Fails Without Proper Tools

LinkedIn has an advanced security system in place to stop web scrapers who try to access its public data. It rapidly creates authentication constraints requiring users to log in after only a few profile views.

In addition to behavioral analysis, LinkedIn uses advanced request fingerprinting techniques. This involves evaluating factors like the quality and origin of IP addresses, browser specific headers and cookies, and device attributes. LinkedIn synthesizes all of these signals to generate a fraud score for each visitor.

Understanding the nature of these constraints will give context for the hands on strategies and technical workarounds covered in the following sections.

### 1. Authentication Wall

LinkedIn aggressively limits access to its vast trove of data unless users are authenticated. Most profile, company, and job information is locked behind a login barrier after just a few page views.

anonymous visitors are hit with sign-in prompts and blocked from seeing further details. This strict authentication wall is LinkedIn’s primary line of defense against unauthorized scraping. here are some common examples:

- **Profile views:** After viewing just 3–5 profiles, LinkedIn requires you to log in.
- **Company data:** Much of it is hidden unless you’re logged in.
- **Search:** Public users are completely blocked from using search.

**About Logging In:** Scraping while logged into an account might seem like an easy workaround. However, this goes against LinkedIn’s Terms of Service and can quickly lead to your accounts being permanently banned.

### 2. Behavioral Tracking

LinkedIn watches how people behave on the site and compares that to each incoming request. Scrapers usually stand out because their behavior looks different from a real user. Common differences include:

- **Request timing:** Human users don't view 100 profiles per minute, while bots do
- **Navigation patterns:** Real users click on buttons, scroll down pages, pause on random timing; bots don't
- **Mouse movements:** Real users have a natural mouse activity, for highlighting text, hovering into elements, etc. Bots don't have mouse activity.
- **Referrer and navigation flow:** Actual users move through internal links, such as search pages. On the other hand, bots request the URL directly without having common navigation profiles.

If your scraper does not mimic these behaviors, LinkedIn will notice. To avoid detection you need realistic timing, navigation, and interaction patterns. see our headless browser guide that shows how to add natural clicks, scrolls, and pauses.

[How to Scrape Dynamic Websites Using Headless Web BrowsersIntroduction to using web automation tools such as Puppeteer, Playwright, Selenium and ScrapFly to render dynamic websites for web scraping](https://scrapfly.io/blog/posts/scraping-using-browsers)

### 3. Request Fingerprinting

Beyond behavior, LinkedIn checks the technical fingerprint of each request. This includes several signals that are hard for simple scripts to fake:

- **IP address quality:** Whether the requested IP address is residential coming from a real home internet provider, or a datacenter associated with a hosting network.
- **TLS analysis:** When a request is sent to the application's web server, it establishes a TLS handshake. This handshake leads to creating a fingerprint called JA3. LinkedIn uses this fingerprint and compares it with those of normal users.
- **Headers and Cookies:** Whether the request uses the standard browser headers or it has a cookie a normal user might have. Such cookie chains are often obtained through natural navigation through the web app.
- **Device fingerprints:** Whether the used browser's attributes meet real browser requirements, these include the web browser metadata, device hardware capabilities, operating system and version, and other related specifications that couldn't meet real browser values.

**Fraud Scoring Logic**

LinkedIn combines all the signals above to calculate a **fraud score** for each incoming client. This score is then compared against the average patterns of real users. Based on that comparison, LinkedIn decides whether to:

- **Approve** the request as legitimate and allow it through.
- **Flag or block** the request if it appears automated or suspicious.

In short, every request to LinkedIn is evaluated for authenticity before any data is served.

If your score looks different from typical users, you will be forced to log in, or blocked. For technical deep dives and defensive techniques, check our guide on browser fingerprint impersonation.

[Bypass Proxy Detection with Browser Fingerprint ImpersonationStop proxy blocks with browser fingerprint impersonation using this guide for Playwright, Selenium, curl-impersonate &amp; Scrapfly](https://scrapfly.io/blog/posts/bypass-proxy-detection-with-browser-fingerprint-impersonation)

## How LinkedIn Data is Loaded and How to Scrape it?

LinkedIn is a single-page application that loads its data mainly through a couple of methods. Our LinkedIn scraper will extract LinkedIn data through multiple approaches. So let's briefly explain each, how it works under the hood, how to inspect it in the browser, and how to extract it in code!

### Rendering an HTML Template

This is the oldest and simplest way websites load data. In this method, the server sends a full HTML page that already includes all the data you need. When your browser opens a page, it just receives and displays that pre-built HTML template.

To scrape data from a page like this on LinkedIn, you only need to mimic what a browser does, then extract the information you care about. Here’s the general flow:

1. **Send a request** to the page URL and wait for a successful response.
2. **Read the response body**, which contains the full HTML of the page.
3. **Parse the HTML** and extract the fields you want using **CSS selectors** or **XPath**.

This approach is straightforward because the data is already there in the HTML so no need to load extra scripts or wait for background requests.

### Hydrating the Page Using Script Tags

Many modern websites load data inside `<script>` tags instead of directly in the HTML. This is common in web apps that use **server-side** or **client-side rendering**. In these cases, the page may include a script tag that holds all the data in JSON format, which the browser later uses to build the visible content.



To scrape LinkedIn data from pages like this, you can:

1. **Request the page URL** and get the full HTML response.
2. **Parse the HTML** and find the `<script>` tag that contains the JSON data.
3. **Extract and load** that JSON into an object for further processing.

This technique is often called **hidden data scraping** because the useful information is buried inside script tags that most users and basic scrapers ignore.

### Loading The Data from XHR Calls

Modern websites don’t always load everything in one go. Instead, they often use small background requests called **XHR calls** or **fetch requests** to get extra data as you browse.

For example, when you open a LinkedIn search page, the site first loads the layout, then sends hidden requests to get the actual profile or job results in JSON format. Once the data arrives, the page updates automatically.

To collect this kind of data, you can do the following:

1. **Start a headless browser**
2. **Open the target page** and watch its network activity
3. **Wait for or trigger** the XHR
4. **Capture and read** the response from the browser’s network logs

These XHR calls are often called hidden APIs because they work behind the scenes. In web scraping, getting data from them is known as hidden API scraping, since you’re using the same background requests that a browser already makes.

## ScrapFly's Solution

We know that managing LinkedIn scraping can be a real overhead. That's why we've built [ScrapFly's LinkedIn-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/linkedin-scraper) an open source tool ready to use! It's configured to use ScrapFly's web scraping API, which manages:



ScrapFly provides [web scraping](https://scrapfly.io/docs/scrape-api/getting-started), [screenshot](https://scrapfly.io/docs/screenshot-api/getting-started), and [extraction](https://scrapfly.io/docs/extraction-api/getting-started) APIs for data collection at scale.

- [Anti-bot protection bypass](https://scrapfly.io/docs/scrape-api/anti-scraping-protection) - scrape web pages without blocking!
- [Rotating residential proxies](https://scrapfly.io/docs/scrape-api/proxy) - prevent IP address and geographic blocks.
- [JavaScript rendering](https://scrapfly.io/docs/scrape-api/javascript-rendering) - scrape JavaScript-heavy web pages through cloud browsers.
- [Full browser automation](https://scrapfly.io/docs/scrape-api/javascript-scenario) - control browsers to scroll, input, and click on objects.
- [Format conversion](https://scrapfly.io/docs/scrape-api/getting-started#api_param_format) - scrape as HTML, JSON, Text, or Markdown.
- [Python](https://scrapfly.io/docs/sdk/python) and [Typescript](https://scrapfly.io/docs/sdk/typescript) SDKs, as well as [Scrapy](https://scrapfly.io/docs/sdk/scrapy) and [no-code tool integrations](https://scrapfly.io/docs/integration/getting-started).

Learn more about ScrapFly's anti-bot capabilities on our [Web Scraping API page](https://scrapfly.io/web-scraping-api).

For pricing details, visit our [pricing page](https://scrapfly.io/pricing).

### What LinkedIn Data Can You Scrape?

Web scraping is about collecting raw web data, avoiding detection, and turning it into clean and structured datasets. Our **LinkedIn scraper** handles this entire process automatically, from fetching the data to parsing it, and delivers the output directly in **JSON format** that you can use right away.

With ScrapFly, you can collect different types of LinkedIn data such as:

- **Profiles:** Name, headline, current position, work history, education, skills, number of connections (if visible), and profile picture URL
- **Companies:** Name, industry, size, headquarters, specialties, follower count, recent posts, and related pages
- **Jobs:** Title, location, description, number of applicants, posted date, seniority level, employment type, and required skills
- **Search results:** People search (name, headline, location) and job search (title, company, location, posted date)

Scraping LinkedIn data with ScrapFly is simple and efficient. Here is an example:

python```python
from scrapfly import ScrapflyClient, ScrapeConfig

client = ScrapflyClient(key="YOUR_API_KEY")
result = client.scrape(ScrapeConfig(
    "https://linkedin.com/in/username",
    asp=True,        # Enable anti-scraping protection
    render_js=True,  # Handle dynamic content
    country="US"
))
```



This setup takes care of the complex parts such as JavaScript rendering, location targeting, and anti-bot protection, so you can focus on analyzing the data instead of fighting detection systems.

## Scraping LinkedIn Code Examples

So far, we have covered all the key details needed to scrape LinkedIn data. Now it is time to move on to some practical examples. In the following sections, we will show ready-to-use Python code snippets for extracting data from different parts of LinkedIn.

These examples use **ScrapFly's Python SDK**, which takes care of all the complex parts for you. If you prefer using libraries like [HTTPX](https://pypi.org/project/httpx/) or [requests](https://pypi.org/project/requests/) directly, you will need to manage proxy rotation, fingerprint randomization, and session handling on your own. Without these measures, your scraper will likely get detected after just a few requests.

To follow along with the examples, install the ScrapFly SDK with this command:

bash```bash
$ pip install scrapfly-sdk
```



### Scraping LinkedIn Profile Data

In this section, we'll extract data from publicly available data on LinkedIn user profiles. If we take a look at one of the public LinkedIn profiles (like the one for [Bill Gates](https://www.linkedin.com/in/williamhgates)) we can see loads of valuable public data:



LinkedIn public profile pageBefore we start scraping LinkedIn profiles, let's identify the HTML parsing approach. We can manually parse each data point from the HTML or **extract data from hidden script tags**.

[How to Scrape Hidden Web DataThe visible HTML doesn't always represent the whole dataset available on the page. In this article, we'll be taking a look at scraping of hidden web data. What is it and how can we scrape it using Python?](https://scrapfly.io/blog/posts/how-to-scrape-hidden-web-data)

To locate this hidden data, we can follow these steps:

- Open the [browser developer tools](https://scrapfly.io/blog/answers/browser-developer-tools-in-web-scraping) by pressing the `F12` key.
- Search for the selector: `//script[@type='application/ld+json']`.

This will lead to a `script` tag with the following details:



This gets us the core details available on the page, though a few fields like the job title are missing, as the page is viewed publicly. To scrape it, we'll extract the script and parse it:

python```python
import json
from typing import Dict, List
from parsel import Selector
from loguru import logger as log
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

BASE_CONFIG = {
    # bypass linkedin.com web scraping blocking
    "asp": True,
    # set the proxy country to US
    "country": "US",
    "headers": {
        "Accept-Language": "en-US,en;q=0.5"
    }
}

def refine_profile(data: Dict) -> Dict: 
    """refine and clean the parsed profile data"""
    parsed_data = {}
    profile_data = [key for key in data["@graph"] if key["@type"]=="Person"][0]
    profile_data["worksFor"] = [profile_data["worksFor"][0]]
    articles = [key for key in data["@graph"] if key["@type"]=="Article"]
    for article in articles:
        selector = Selector(article["articleBody"])
        article["articleBody"] = "".join(selector.xpath("//p/text()").getall())
    parsed_data["profile"] = profile_data
    parsed_data["posts"] = articles
    return parsed_data


def parse_profile(response: ScrapeApiResponse) -> Dict:
    """parse profile data from hidden script tags"""
    selector = response.selector
    data = json.loads(selector.xpath("//script[@type='application/ld+json']/text()").get())
    refined_data = refine_profile(data)
    return refined_data


async def scrape_profile(urls: List[str]) -> List[Dict]:
    """scrape public linkedin profile pages"""
    to_scrape = [ScrapeConfig(url, **BASE_CONFIG) for url in urls]
    data = []
    # scrape the URLs concurrently
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        profile_data = parse_profile(response)
        data.append(profile_data)
    log.success(f"scraped {len(data)} profiles from Linkedin")
    return data
```



Run the codepython```python
async def run():
    profile_data = await scrape_profile(
        urls=[
            "https://www.linkedin.com/in/williamhgates"
        ]
    )
    # save the data to a JSON file
    with open("profile.json", "w", encoding="utf-8") as file:
        json.dump(profile_data, file, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    asyncio.run(run())
```



In the above LinkedIn profile scraper, we define three functions. Let's break them down:

- `scrape_profile()`: To request LinkedIn account URLs concurrently and use the parsing logic to extract each profile data.
- `parse_profile()`: To parse the `script` tag containing the profile data.
- `refine_profile()`: To refine and organize the extracted data.

Here's a sample output of the LinkedIn profile data retrieved



With this LinkedIn lead scraper, we can successfully gather detailed information on potential leads, given their job titles, companies, industries, and contact information from LinkedIn profiles. This contact data allows for more personalized and strategic outreach efforts.

For the full retrieved JSON output, you can view the example on [our example dataset on ScrapFly's LinkedIn scraper.](https://github.com/scrapfly/scrapfly-scrapers/blob/main/linkedin-scraper/results/profile.json)

Next, let's explore how to scrape **company data**!



### Scraping LinkedIn Company Data

LinkedIn company profiles include various valuable data points like the company's industry, addresses, number of employees, jobs, and related company businesses. Also, the company profiles are public, meaning that we can scrape their full details!

Let's start by taking a look at a company profile page on LinkedIn such as [Microsoft](https://www.linkedin.com/company/microsoft):



LinkedIn company pageJust like with people pages, the LinkedIn company page data can also be found in hidden `script` tags:



From the above image, we can see that the `script` tag doesn't contain the full company details. So to extract the entire company dataset we'll use a bit of HTML parsing as well:

python```python
import json
import jmespath
from typing import Dict, List
from loguru import logger as log
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

BASE_CONFIG = {
    "asp": True,
    "country": "US",
    "headers": {
        "Accept-Language": "en-US,en;q=0.5"
    }
}

def strip_text(text):
    """remove extra spaces while handling None values"""
    return text.strip() if text != None else text


def parse_company(response: ScrapeApiResponse) -> Dict:
    """parse company main overview page"""
    selector = response.selector
    script_data = json.loads(selector.xpath("//script[@type='application/ld+json']/text()").get())
    script_data = jmespath.search(
        """{
        name: name,
        url: url,
        mainAddress: address,
        description: description,
        numberOfEmployees: numberOfEmployees.value,
        logo: logo
        }""",
        script_data
    )
    data = {}
    for element in selector.xpath("//div[contains(@data-test-id, 'about-us')]"):
        name = element.xpath(".//dt/text()").get().strip()
        value = element.xpath(".//dd/text()").get().strip()
        data[name] = value
    addresses = []
    for element in selector.xpath("//div[contains(@id, 'address') and @id != 'address-0']"):
        address_lines = element.xpath(".//p/text()").getall()
        address = ", ".join(line.replace("\n", "").strip() for line in address_lines)
        addresses.append(address)
    affiliated_pages = []
    for element in selector.xpath("//section[@data-test-id='affiliated-pages']/div/div/ul/li"):
        affiliated_pages.append({
            "name": element.xpath(".//a/div/h3/text()").get().strip(),
            "industry": strip_text(element.xpath(".//a/div/p[1]/text()").get()),
            "address": strip_text(element.xpath(".//a/div/p[2]/text()").get()),
            "linkeinUrl": element.xpath(".//a/@href").get().split("?")[0]
        })
    similar_pages = []
    for element in selector.xpath("//section[@data-test-id='similar-pages']/div/div/ul/li"):
        similar_pages.append({
            "name": element.xpath(".//a/div/h3/text()").get().strip(),
            "industry": strip_text(element.xpath(".//a/div/p[1]/text()").get()),
            "address": strip_text(element.xpath(".//a/div/p[2]/text()").get()),
            "linkeinUrl": element.xpath(".//a/@href").get().split("?")[0]
        })
    data = {**script_data, **data}
    data["addresses"] = addresses    
    data["affiliatedPages"] = affiliated_pages
    data["similarPages"] = similar_pages
    return data


async def scrape_company(urls: List[str]) -> List[Dict]:
    """scrape prublic linkedin company pages"""
    to_scrape = [ScrapeConfig(url, **BASE_CONFIG) for url in urls]
    data = []
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        data.append(parse_company(response))
    log.success(f"scraped {len(data)} companies from Linkedin")
    return data
```



Run the codepython```python
async def run():
    profile_data = await scrape_company(
        urls=[
            "https://linkedin.com/company/microsoft"
        ]
    )
    # save the data to a JSON file
    with open("company.json", "w", encoding="utf-8") as file:
        json.dump(profile_data, file, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    asyncio.run(run())
```



In the above LinkedIn scraping code, we define two functions. Let's break them down:

- `parse_company()`: To parse the company data from `script` tags while using [Quick Intro to Parsing JSON with JMESPath in Python](https://scrapfly.io/blog/posts/parse-json-jmespath-python) to refine it and parse other HTML elements using XPath selectors.
- `scrape_company()`: To request the company page URLs while utilizing the parsing logic.

Here's a sample output of the extracted company information:



For the full retrieved JSON output, you can view the example on [our example dataset on ScrapFly's LinkedIn scraper.](https://github.com/scrapfly/scrapfly-scrapers/blob/main/linkedin-scraper/results/company.json)

The above data covers the company overview. Next, let's move on to LinkedIn's job data and the scrapers built for it.



## Scraping LinkedIn Jobs

You scrape [LinkedIn jobs data](https://scrapfly.io/use-case/jobs-web-scraping) from three public surfaces: job search results, individual job pages, and a single company's listings. Search and company listings come from LinkedIn's hidden `seeMoreJobPostings` API, while each job page keeps its full detail in an `application/ld+json` script tag you read straight from the HTML.

### Scraping LinkedIn Job Search Results

LinkedIn's job search covers millions of listings across industries and locations. You build a search by adding keyword and location parameters to the search URL, like this:

shell```shell
https://www.linkedin.com/jobs/search?keywords=python%2Bdeveloper&location=United%2BStates
```



The URL above uses basic filters. It also accepts extra parameters to narrow results, such as date, experience level, or city.

LinkedIn returns the first batch of results with the page, then pulls more through a hidden API as you scroll. We'll request the first page to read the total result count, then page through the rest with that API.

[How to Scrape Hidden APIsIn this tutorial we'll be taking a look at scraping hidden APIs which are becoming more and more common in modern dynamic websites - what's the best way to scrape them?](https://scrapfly.io/blog/posts/how-to-scrape-hidden-apis)

To find this hidden API, open your browser developer tools, select the network tab, filter by `Fetch/XHR`, and scroll the page. You'll see calls to the `seeMoreJobPostings/search` endpoint, paged with the `start` query parameter:

shell```shell
https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords=python+developer&location=United+States&start=25
```



Here is the job search scraper:

python```python
import json
import asyncio
from typing import Dict, List
from loguru import logger as log
from urllib.parse import urlencode, quote_plus
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

BASE_CONFIG = {
    "asp": True,
    "country": "US",
    "headers": {
        "Accept-Language": "en-US,en;q=0.5"
    },
    "render_js": True,
    "proxy_pool": "public_residential_pool"
}

def parse_job_search(response: ScrapeApiResponse) -> List[Dict]:
    """parse job data from job search pages"""
    selector = response.selector
    total_results = selector.xpath("//span[contains(@class, 'job-count')]/text()").get()
    total_results = int(total_results.replace(",", "").replace("+", "")) if total_results else None
    data = []
    search_elements = selector.xpath("//section[contains(@class, 'results-list')]/ul/li")
    if len(search_elements) == 0:  # pagination pages have a different structure
        search_elements = selector.xpath("//li")
    for element in search_elements:
        data.append({
            "title": element.xpath(".//div/a/span/text()").get().strip(),
            "company": element.xpath(".//div/div[contains(@class, 'info')]/h4/a/text()").get().strip(),
            "address": element.xpath(".//div/div[contains(@class, 'info')]/div/span/text()").get().strip(),
            "timeAdded": element.xpath(".//div/div[contains(@class, 'info')]/div/time/@datetime").get(),
            "jobUrl": element.xpath(".//div/a/@href").get().split("?")[0],
            "companyUrl": element.xpath(".//div/div[contains(@class, 'info')]/h4/a/@href").get().split("?")[0],
        })
    return {"data": data, "total_results": total_results}

async def scrape_job_search(keyword: str, location: str, max_pages: int = None) -> List[Dict]:
    """scrape Linkedin job search"""

    def form_urls_params(keyword, location):
        """form the job search URL params"""
        params = {
            "keywords": quote_plus(keyword),
            "location": location,
        }
        return urlencode(params)

    first_page_url = "https://www.linkedin.com/jobs/search?" + form_urls_params(keyword, location)
    first_page = await SCRAPFLY.async_scrape(ScrapeConfig(first_page_url, **BASE_CONFIG))
    data = parse_job_search(first_page)["data"]
    total_results = parse_job_search(first_page)["total_results"]
    # get the total number of pages to scrape, each page contains 25 results
    if max_pages and max_pages * 25 < total_results:
        total_results = max_pages * 25

    log.info(f"scraped the first job page, {total_results // 25 - 1} more pages")
    # scrape the remaining pages concurrently
    other_pages_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?"
    to_scrape = [
        ScrapeConfig(other_pages_url + form_urls_params(keyword, location) + f"&start={index}", **BASE_CONFIG)
        for index in range(25, total_results + 25, 25)
    ]
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        try:
            page_data = parse_job_search(response)["data"]
            data.extend(page_data)
        except Exception as e:
            log.error("An occured while scraping search pagination", e)
            pass

    log.success(f"scraped {len(data)} jobs from Linkedin job search")
    return data
```



Run the codepython```python
async def run():
    job_search_data = await scrape_job_search(
        keyword="Python Developer",
        location="United States",
        max_pages=3
    )
    # save the data to a JSON file
    with open("job_search.json", "w", encoding="utf-8") as file:
        json.dump(job_search_data, file, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    asyncio.run(run())
```



We start by building the search URL from the keyword and location. Then we request the first page, read the result count, and page through the rest with the hidden API.

Here is an example output of the LinkedIn job search scraper:



The search results give you a list of jobs, but not the full text of each one. Let's pull that detail from the individual job pages.

### Scraping Individual Job Pages

Each LinkedIn job has its own page that holds the complete description, seniority level, employment type, and more. LinkedIn keeps this data in an `application/ld+json` script tag, so we use the hidden web data approach again.

Search for the selector `//script[@type='application/ld+json']` in the page source, and you'll find data like the below:

The `description` field holds the job description as encoded HTML. So we read the script tag and parse that field to get the full job details:

python```python
import json
import asyncio
from typing import Dict, List
from loguru import logger as log
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

BASE_CONFIG = {
    "asp": True,
    "country": "US",
    "headers": {
        "Accept-Language": "en-US,en;q=0.5"
    },
    "render_js": True,
    "proxy_pool": "public_residential_pool"
}

def parse_job_page(response: ScrapeApiResponse):
    """parse individual job data from Linkedin job pages"""
    selector = response.selector
    script_data = json.loads(selector.xpath("//script[@type='application/ld+json']/text()").get())
    description = []
    for element in selector.xpath("//div[contains(@class, 'show-more')]/ul/li/text()").getall():
        text = element.replace("\n", "").strip()
        if len(text) != 0:
            description.append(text)
    script_data["jobDescription"] = description
    script_data.pop("description")  # remove the key with the encoded HTML
    return script_data

async def scrape_jobs(urls: List[str]) -> List[Dict]:
    """scrape Linkedin job pages"""
    to_scrape = [ScrapeConfig(url, **BASE_CONFIG) for url in urls]
    data = []
    # scrape the URLs concurrently
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        try:
            job_data = parse_job_page(response)
            if job_data is None:
                raise Exception("Job page is expired, no hidden json data found")
            data.append(job_data)
        except:
            log.debug(f"Job page with {response.context['url']} URL is expired")
    log.success(f"scraped {len(data)} jobs from Linkedin")
    return data
```



Run the codepython```python
async def run():
    job_data = await scrape_jobs(
        urls=[
            "https://www.linkedin.com/jobs/view/junior-software-developer-fresh-graduate-at-smart-is-4423536207",
            "https://www.linkedin.com/jobs/view/entry-level-python-developer-remote-at-synergisticit-4420671223",
            "https://www.linkedin.com/jobs/view/software-engineer-ai-team-at-sprig-4424454823"
        ]
    )
    # save the data to a JSON file
    with open("jobs.json", "w", encoding="utf-8") as file:
        json.dump(job_data, file, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    asyncio.run(run())
```



We add the job page URLs to a scraping list and request them concurrently. The `parse_job_page()` function then reads the hidden script tag and pulls the HTML inside the description field. Job postings expire often, so swap in current URLs when you run this.

Here is what the job page scraper output looks like:

Job search covers listings across all companies. Next, let's narrow the same approach to a single company's openings.

### Scraping Jobs from a Specific Company

Every company page has its own jobs section under the `/jobs` path of the company URL:

LinkedIn loads this list on scroll through the same hidden API you saw in the search section, but with a company-specific endpoint:



Scrapfly

#### Scale your web scraping effortlessly

Scrapfly handles proxies, browsers, and anti-bot bypass — so you can focus on data.

[Try Free →](https://scrapfly.io/register)The results page with the `start` URL query parameter:

shell```shell
https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/microsoft-jobs-worldwide?start=25
```



We request the first company jobs page to read the result count, then page through the rest with that endpoint:

python```python
import json
import asyncio
from typing import Dict, List
from loguru import logger as log
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

SCRAPFLY = ScrapflyClient(key="Your ScrapFly API key")

BASE_CONFIG = {
    "asp": True,
    "country": "US",
    "headers": {
        "Accept-Language": "en-US,en;q=0.5"
    },
    "render_js": True,
    "proxy_pool": "public_residential_pool"
}

def parse_company_jobs(response: ScrapeApiResponse) -> List[Dict]:
    """parse job data from a company's jobs section"""
    selector = response.selector
    total_results = selector.xpath("//span[contains(@class, 'job-count')]/text()").get()
    total_results = int(total_results.replace(",", "").replace("+", "")) if total_results else None
    data = []
    search_elements = selector.xpath("//section[contains(@class, 'results-list')]/ul/li")
    if len(search_elements) == 0:  # pagination pages have a different structure
        search_elements = selector.xpath("//li")
    for element in search_elements:
        data.append({
            "title": element.xpath(".//div/a/span/text()").get().strip(),
            "company": element.xpath(".//div/div[contains(@class, 'info')]/h4/a/text()").get().strip(),
            "address": element.xpath(".//div/div[contains(@class, 'info')]/div/span/text()").get().strip(),
            "timeAdded": element.xpath(".//div/div[contains(@class, 'info')]/div/time/@datetime").get(),
            "jobUrl": element.xpath(".//div/a/@href").get().split("?")[0],
            "companyUrl": element.xpath(".//div/div[contains(@class, 'info')]/h4/a/@href").get().split("?")[0],
        })
    return {"data": data, "total_results": total_results}

async def scrape_company_jobs(url: str, max_pages: int = None) -> List[Dict]:
    """scrape a company's job listings"""
    first_page = await SCRAPFLY.async_scrape(ScrapeConfig(url, **BASE_CONFIG))
    data = parse_company_jobs(first_page)["data"]
    total_results = parse_company_jobs(first_page)["total_results"]
    # get the total number of pages to scrape, each page contains 25 results
    if max_pages and max_pages * 25 < total_results:
        total_results = max_pages * 25

    log.info(f"scraped the first job page, {total_results // 25 - 1} more pages")
    # scrape the remaining pages using the company jobs API
    company_slug = url.split("jobs/")[-1]
    jobs_api_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/" + company_slug
    to_scrape = [
        ScrapeConfig(jobs_api_url + f"?start={index}", **BASE_CONFIG)
        for index in range(25, total_results + 25, 25)
    ]
    async for response in SCRAPFLY.concurrent_scrape(to_scrape):
        try:
            page_data = parse_company_jobs(response)["data"]
            data.extend(page_data)
        except Exception as e:
            log.error("An occured while scraping company jobs pagination", e)
            pass

    log.success(f"scraped {len(data)} jobs from Linkedin company job pages")
    return data
```



Run the codepython```python
async def run():
    company_jobs_data = await scrape_company_jobs(
        url="https://www.linkedin.com/jobs/microsoft-jobs-worldwide",
        max_pages=3
    )
    # save the data to a JSON file
    with open("company_jobs.json", "w", encoding="utf-8") as file:
        json.dump(company_jobs_data, file, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    asyncio.run(run())
```



The `parse_company_jobs()` function reads the job cards with XPath selectors. The `scrape_company_jobs()` function requests the company jobs page, then pages through the rest with the company jobs API.

Here is an example output of the company jobs scraper:

LinkedIn is one of several job boards worth scraping. For more job data, see our guides on scraping [Indeed](https://scrapfly.io/blog/posts/how-to-scrape-indeedcom) and [Glassdoor](https://scrapfly.io/blog/posts/how-to-scrape-glassdoor). You can also pair this job page code with crawling logic to follow links from the search and company pages.

For more on web crawling with Python, see our dedicated tutorial.

[How to Crawl the Web with PythonIntroduction to web crawling with Python. What is web crawling? How it differs from web scraping? And a deep dive into code, building our own crawler and an example project crawling Shopify-powered websites.](https://scrapfly.io/blog/posts/crawling-with-python)

With this last feature, our LinkedIn scrapers are complete. They can scrape LinkedIn profiles, company, and job data. But pushing the scraping rate too high will lead the website to detect and block your requests.

So make sure to rotate high-quality residential proxies.



## FAQ

How long before LinkedIn detects and bans scraping attempts?LinkedIn continuously monitors user behavior, including request timing, scrolling activity, TLS fingerprints, and IP reputation. Its detection models are trained on millions of real user sessions, so avoiding detection requires mimicking human-like interaction, not just rotating proxies.







Can I scrape LinkedIn without an account?You can scrape limited public data such as basic profiles and company overviews, but access usually stops after a few page views. Job listings and search results require authentication, while most company data is only partially visible. In short, unauthenticated scraping is very limited.







Why do LinkedIn accounts get banned?LinkedIn bans accounts that show automated or unnatural behavior, such as viewing dozens of profiles in a few minutes, sending uniform requests, using datacenter IPs, or sharing sessions across devices. Once flagged, bans are permanent and cannot be appealed.







Is scraping LinkedIn legal?The hiQ Labs vs. LinkedIn case (2017–2022) ruled that scraping publicly accessible data is legal under U.S. law. However, LinkedIn’s Terms of Service prohibit automated access. While scraping public data carries minimal legal risk, it can still lead to IP or account bans. Always seek legal advice for commercial projects.







How much does ScrapFly cost for LinkedIn scraping?Pricing depends on your request volume. As a general estimate: 1,000 profiles usually cost between $15 and $30. 10,000 job listings cost around $80 to $150.

You can check ScrapFly’s pricing page for the most up-to-date rates. Most users find it far more cost-effective than maintaining their own proxy or account infrastructure.







How often does LinkedIn change its structure?LinkedIn’s internal Voyager API and frontend code change frequently. Voyager endpoints update every four to eight weeks, while frontend elements change every two to four weeks. JSON-LD structures usually remain stable for several months. ScrapFly’s monitoring system detects these changes and updates its LinkedIn scrapers within 48 hours.







Why can't I just use Puppeteer/Playwright for LinkedIn?While Puppeteer and Playwright can render LinkedIn pages, they lack the full anti-detection setup required. You would still need to handle residential proxy rotation, account authentication, session persistence, human-like browsing patterns, and browser fingerprint rotation. ScrapFly manages all of this automatically, making scraping much more reliable.







Can I scrape LinkedIn Sales Navigator?Sales Navigator is a paid product that requires a logged-in LinkedIn account. Scraping it violates LinkedIn’s Terms of Service, so it’s not recommended.







Can I Scrape LinkedIn Company Employees?Employee listings are only visible to logged-in users. Scraping this data would require authentication, which breaches LinkedIn’s Terms of Service. However, public company information can still be accessed safely using ScrapFly.







What's the difference between scraping LinkedIn and using the LinkedIn API?LinkedIn's official API provides limited access, mostly restricted to your own profile data and basic company information. Web scraping can access all publicly visible data including full profiles, job listings, and company details. you can also explore [scraping Glassdoor](https://scrapfly.io/blog/posts/how-to-scrape-glassdoor) for company reviews and salary data.







Can you scrape LinkedIn Jobs without an account?Job search results and individual job pages are public, so you can scrape them without logging in. Features like Easy Apply, candidate matching, and the recruiter view need an account and stay out of reach.







What's the difference between scraping LinkedIn Jobs and using the LinkedIn Jobs API?The official Jobs API works through partner agreements and ATS integrations, so it fits approved hiring tools but stays closed to most developers. Scraping the public jobs surface is the practical path for everyone else; see our [LinkedIn API guide](https://scrapfly.io/blog/posts/guide-to-linkedin-api-and-alternatives) for the full API breakdown.







Does LinkedIn detect job scraping differently from profile scraping?Job pages are built to rank in Google, so the public jobs surface has a lower detection bar than the login-walled profile surface. You still need anti-bot protection at volume, but jobs are easier to collect than profiles.









[**Latest LinkedIn Scraper Code**github.com/scrapfly/scrapfly-scrapers/tree/main/linkedin-scraper](https://github.com/scrapfly/scrapfly-scrapers/tree/main/linkedin-scraper)

## Summary

LinkedIn protects professional data through multi-layered blocking: authentication walls, behavioral tracking, and request fingerprinting. DIY scraping requires managing session persistence, residential proxy rotation, TLS fingerprint randomization, and behavior emulation - a very time consuming maintenance burden!

ScrapFly eliminates this complexity:

- **Automated anti-bot bypass:** Handles fingerprinting, behavioral tracking, and fraud score evasion
- **Maintained scrapers:** Updated within 48 hours when LinkedIn changes structure
- **Production-ready code:** Extract profiles, companies, jobs, and search results without managing infrastructure
- **Residential proxy pools:** Rotate IPs automatically to prevent detection

When LinkedIn updates its defenses, ScrapFly updates the scraper. You focus on data, not anti-bot engineering.

**Ready to start?**

bash```bash
git clone https://github.com/scrapfly/scrapfly-scrapers.git
cd scrapfly-scrapers/linkedin-scraper
```



Legal Disclaimer and PrecautionsThis tutorial covers popular web scraping techniques for education. Interacting with public servers requires diligence and respect:

- Do not scrape at rates that could damage the website.
- Do not scrape data that's not available publicly.
- Do not store PII of EU citizens protected by GDPR.
- Do not repurpose *entire* public datasets which can be illegal in some countries.

Scrapfly does not offer legal advice but these are good general rules to follow. For more you should consult a lawyer.



 

   Table of Contents















 

  Table of Contents- [Why Scrape LinkedIn?](#why-scrape-linkedin)
- [Why LinkedIn Scraping Fails Without Proper Tools](#why-linkedin-scraping-fails-without-proper-tools)
- [1. Authentication Wall](#1-authentication-wall)
- [2. Behavioral Tracking](#2-behavioral-tracking)
- [3. Request Fingerprinting](#3-request-fingerprinting)
- [How LinkedIn Data is Loaded and How to Scrape it?](#how-linkedin-data-is-loaded-and-how-to-scrape-it)
- [Rendering an HTML Template](#rendering-an-html-template)
- [Hydrating the Page Using Script Tags](#hydrating-the-page-using-script-tags)
- [Loading The Data from XHR Calls](#loading-the-data-from-xhr-calls)
- [ScrapFly's Solution](#scrapfly-s-solution)
- [What LinkedIn Data Can You Scrape?](#what-linkedin-data-can-you-scrape)
- [Scraping LinkedIn Code Examples](#scraping-linkedin-code-examples)
- [Scraping LinkedIn Profile Data](#scraping-linkedin-profile-data)
- [Scraping LinkedIn Company Data](#scraping-linkedin-company-data)
- [Scraping LinkedIn Jobs](#scraping-linkedin-jobs)
- [Scraping LinkedIn Job Search Results](#scraping-linkedin-job-search-results)
- [Scraping Individual Job Pages](#scraping-individual-job-pages)
- [Scraping Jobs from a Specific Company](#scraping-jobs-from-a-specific-company)
- [FAQ](#faq)
- [Summary](#summary)
 
    Join the Newsletter  Get monthly web scraping insights 

 

  



Scale Your Web Scraping

Anti-bot bypass, browser rendering, and rotating proxies, all in one API. Start with 1,000 free credits.

  No credit card required  1,000 free API credits  Anti-bot bypass included 

 [Start Free](https://scrapfly.io/register) [View Docs](https://scrapfly.io/docs/onboarding) 

 Not ready? Get our newsletter instead. 

 

## Explore this Article with AI

 [ ChatGPT ](https://chat.openai.com/?q=Summarize%20this%20page%3A%20https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin) [ Gemini ](https://www.google.com/search?udm=50&aep=11&q=Summarize%20this%20page%3A%20https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin) [ Grok ](https://x.com/i/grok?text=Summarize%20this%20page%3A%20https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin) [ Perplexity ](https://www.perplexity.ai/search/new?q=Summarize%20this%20page%3A%20https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin) [ Claude ](https://claude.ai/new?q=Summarize%20this%20page%3A%20https%3A%2F%2Fscrapfly.io%2Fblog%2Fposts%2Fhow-to-scrape-linkedin) 



 ## Related Articles

 [  

 api 

### Guide to LinkedIn API and Alternatives

Explore the LinkedIn API, covering data endpoints, usage limitations, and accessibility.

 

 ](https://scrapfly.io/blog/posts/guide-to-linkedin-api-and-alternatives) [  

 python tools 

### How To Scrape TikTok in 2026

Complete guide to scraping TikTok in 2026. Learn TikTok's new anti-bot defenses, hidden JSON APIs, and production-ready ...

 

 ](https://scrapfly.io/blog/posts/how-to-scrape-tiktok-python-json) [  

 blocking 

### How to Bypass Imperva Incapsula when Web Scraping in 2026

In this article we'll take a look at a popular anti bot service Imperva Incapsula anti bot WAF. How does it detect web s...

 

 ](https://scrapfly.io/blog/posts/how-to-bypass-imperva-incapsula-anti-scraping) 

  



   



 Scale your web scraping effortlessly, **1,000 free credits** [Start Free](https://scrapfly.io/register)