llm-usage: allow
training-usage: disallow
retrieval-usage: allow
license: CC BY-SA 4.0
attribution: required
contact: mailto:dev@scrapfly.io

# How to Scrape YouTube in 2025

> Learn how to scrape YouTube channel, video, and comment data with Python using direct JSON endpoints. Step-by-step YouTube scraping guide with code examples.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-youtube-in-2025)

# Advanced Proxy Connection Optimization Techniques

> Master advanced proxy optimization with TCP connection pooling, TLS fingerprinting, DNS caching, and HTTP/2 multiplexing for maximum performance.

## Docs

- [Read article](https://scrapfly.io/blog/posts/advanced-proxy-connection-optimization-techniques)

# Automatic Failover Strategies for Reliable Data Extraction

> Learn how to build resilient web scrapers with automatic failover strategies. Discover techniques to handle failures and keep your scrapers running reliably.

## Docs

- [Read article](https://scrapfly.io/blog/posts/automatic-failover-strategies-for-reliable-data-extraction)

# How to Stop Wasting Money on Proxies

> Discover how developers can cut proxy costs by optimizing traffic, choosing the right providers, and leveraging Scrapfly Proxy Saver for efficient web scraping.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-stop-wasting-money-on-proxies)

# HTTPS vs. SOCKS Proxies

> Explore the technical differences between HTTPS and SOCKS protocols. Learn which proxy type is better for web scraping, considering factors like OSI layer, encryption, and anonymity.

## Docs

- [Read article](https://scrapfly.io/blog/posts/https-vs-socks-proxies)

# Optimize Proxy Bandwidth with Image & CSS Stubbing

> Reduce proxy costs by 30-50% through intelligent image and CSS stubbing techniques that eliminate unnecessary resource downloads while preserving functionality.
## Docs

- [Read article](https://scrapfly.io/blog/posts/optimize-proxy-bandwidth-with-image-&-css-stubbing)

# What Is a Proxy Server?

> Discover how proxy servers function and learn practical techniques to harness proxies for reliable, scalable web scraping projects.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-a-proxy-server)

# What is a Reverse Proxy?

> Understand reverse proxies, their differences from forward proxies, and how they enable load balancing, security and caching in modern web infrastructure.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-a-reverse-proxy)

# Bypass Proxy Detection with Browser Fingerprint Impersonation

> Stop proxy blocks with browser fingerprint impersonation using this guide for Playwright, Selenium, curl-impersonate & Scrapfly

## Docs

- [Read article](https://scrapfly.io/blog/posts/bypass-proxy-detection-with-browser-fingerprint-impersonation)

# How Caching Can Cut Your Proxy Bill by 70%

> Learn how intelligent caching strategies can reduce proxy costs by 40-70%. Complete guide to bandwidth optimization and proxy management.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-caching-can-cut-your-proxy-bill)

# How to Optimize NetNut Proxies

> Learn how to set up and optimize NetNut proxies for web scraping, including bandwidth reduction techniques and integration with Scrapfly Proxy Saver.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-optimize-netnut-proxies)

# How to Optimize Webshare Proxies

> Webshare is a fast-growing proxy provider offering affordable proxy solutions for various web scraping and automation tasks. Here's how to make the best of it.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-optimize-webshare-proxies)

# How to Optimize Oxylabs Proxies

> Learn how to optimize Oxylabs proxies for efficient web scraping using Python and Scrapfly Proxy Saver. Reduce bandwidth, improve speed, and cut costs.
## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-optimize-oxylabs-proxies)

# How to Reduce Your Bright Data Bandwidth Usage

> Learn how to reduce Bright Data proxy bandwidth usage using Python optimizations and Scrapfly Proxy Saver to cut data costs by up to 30%

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-reduce-your-bright-data-bandwidth-usage)

# What is Rate Limiting? Everything You Need to Know

> Discover what rate limiting is, why it matters, how it works, and how developers can implement it to build stable, scalable applications.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-rate-limiting-everything-you-need-to-know)

# How to Optimize Proxies

> Learn how to optimize proxies for speed, anonymity, and cost. Includes comparisons of proxy vs VPN, and tips for developers using Scrapfly.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-optimize-proxies)

# How to Build an MCP Server in Python: A Complete Guide

> Build an MCP server in Python with tools, resources, and prompts. A beginner's guide to the model context protocol using a simple calculator example.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-build-an-mcp-server-in-python-a-complete-guide)

# What Is MCP? Understanding the Model Context Protocol

> What is MCP? Learn how the Model Context Protocol powers tools like Copilot Studio by giving AI models access to real-time, structured context.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-mcp-understanding-the-model-context-protocol)

# Build a Proxy API: Rotate Proxies and Save Bandwidth

> Learn to build a proxy API with Python and mitmproxy. Rotate proxies on each request, cache responses to avoid refetching, and save bandwidth.
## Docs

- [Read article](https://scrapfly.io/blog/posts/build-a-proxy-api-rotate-proxies-and-save-bandwidth)

# The Best Datacenter Proxies in 2025: A Complete Guide

> Explore the best datacenter proxies for 2025 including IPRoyal, shared vs dedicated options, and how to buy unlimited bandwidth proxies.

## Docs

- [Read article](https://scrapfly.io/blog/posts/the-best-datacenter-proxies-in-2025-a-complete-guide)

# GPT Crawler: The AI Training Data Collection Guide

> Learn how to use GPT Crawler to collect web data for AI training. A developer's guide with setup tips, configuration steps, and best practices.

## Docs

- [Read article](https://scrapfly.io/blog/posts/gpt-crawler-a-complete-guide-to-automated-web-data-collection-for-ai-training)

# How to Choose the Best Proxy Unblocker?

> Learn how to choose the best proxy unblocker to access blocked websites. Explore proxies, VPNs, and Scrapfly for bypassing restrictions safely.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-choose-the-best-proxy-unblocker)

# Guide To Google Image Search API and Alternatives

> Learn about Google Image Search API alternatives, including Bing API and scraping techniques. Implement image search functionality in your applications.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-google-image-search-api-and-alternatives)

# Guide to List Crawling: Everything You Need to Know

> Master list crawling: extract data from catalogs, infinite scrolls, articles, and tables, and learn how to resolve common list crawling challenges.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-list-crawling)

# Guide to Google Scholar API and Alternatives

> Learn how to access Google Scholar data without an official API. Explore alternatives and the best methods for scientific data automation.
## Docs

- [Read article](https://scrapfly.io/blog/posts/google-scholar-api-and-alternatives)

# Guide to using JSON with cURL

> Learn how to handle and send JSON with cURL using files, inline data, environment variables, and jq. Real examples for Slack & Google Translate

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-curl-json)

# Official Google SERP API? And Alternatives

> Why isn't there an official Google SERP API? Explore SERP APIs and alternatives like Bing, DuckDuckGo, Yandex, and Brave, as well as web scraping SERPs

## Docs

- [Read article](https://scrapfly.io/blog/posts/google-serp-api-and-alternatives)

# Proxy vs VPN: In-Depth Comparison

> Explore the proxy vs VPN debate with insights on key differences, benefits, limitations and alternatives. Discover when to choose a proxy or VPN.

## Docs

- [Read article](https://scrapfly.io/blog/posts/proxy-vs-vpn)

# 10 Ways to Automate Chrome Screenshots

> Learn how to automate Chrome screenshots with Playwright, Selenium, Puppeteer, browser commands, extensions, and APIs for efficient workflows.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-automate-chrome-screenshots)

# Guide to LLM Training, Fine-Tuning, and RAG

> Differences between LLM training, fine-tuning, and RAG. Learn how to use pre-trained models for custom tasks and real-time knowledge retrieval.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-llm-training-fine-tuning-and-rag)

# Guide to Understanding and Developing LLM Agents

> Learn about LLM agents: what they are, the components they are made of, and a real-life example of how to develop your own LLM agent with LangChain

## Docs

- [Read article](https://scrapfly.io/blog/posts/practical-guide-to-llm-agents)

# Guide to Google Jobs API and Alternatives

> Explore Google Jobs API alternatives like structured data, web scraping, and third-party job APIs to integrate job listings.
## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-google-jobs-api-and-alternatives)

# How to Find All URLs on a Domain

> Learn how to efficiently find all URLs on a domain using Python and web crawling. A guide on how to crawl an entire domain to collect all website data

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-find-all-urls-on-a-domain)

# What is Googlebot User Agent String?

> Learn about Googlebot user agents, how to verify them, block unwanted crawlers, and optimize your site for better indexing and SEO performance.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-are-googlebot-user-agent-strings)

# Alternatives to Cloudscraper to Bypass Cloudflare

> Learn why Cloudscraper is outdated and explore modern alternatives for bypassing Cloudflare protections effectively and ethically.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-cloudscraper-and-new-alternatives)

# How to Capture and Convert a Screenshot to PDF

> Convert screenshots to PDF with Python and Node.js using tools like Pillow, pdfkit, and Puppeteer for easy documentation and professional reports

## Docs

- [Read article](https://scrapfly.io/blog/posts/screenshot-to-pdf)

# Playwright Examples for Web Scraping and Automation

> Learn Playwright with Python and JavaScript (Node.js, Bun, Deno) examples for scraping and for automating browsers like Chromium, WebKit, and Firefox

## Docs

- [Read article](https://scrapfly.io/blog/posts/playwright-examples-javascript-and-python)

# Web Scraping with Playwright and JavaScript

> Learn about web scraping using Playwright, a browser automation library for server-side JavaScript runtimes like Node.js, Deno or Bun. Includes an example project

## Docs

- [Read article](https://scrapfly.io/blog/posts/web-scraping-with-playwright-and-javascript)

# How to Retry in Axios

> Learn how to enhance Axios with retry logic using interceptors or `axios-retry` to automatically retry failed requests effectively.
## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-retry-in-axios)

# How to use wget in Python

> Learn how to use wget in Python through subprocess calls and the best wget alternatives for downloading files using Python.

## Docs

- [Read article](https://scrapfly.io/blog/posts/python-wget-guide)

# Ultimate Guide to JSON Parsing in Python

> Learn JSON parsing in Python with this ultimate guide. Explore basic and advanced techniques using json, and tools like ijson and nested-lookup.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-use-python-to-parse-json)

# Guide to Axios Headers

> Learn about JavaScript's Axios headers: how to configure, update, and inspect headers in requests and responses, how to set defaults, and useful tips

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-javascript-axios-headers)

# Guide to Parsel - the Best HTML Parsing in Python

> Learn to extract data from websites with Python and Parsel, a Python library for HTML parsing through CSS selectors and XPath.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-html-parsing-with-parsel-python)

# A Comprehensive Guide to TikTok API

> Discover TikTok's powerful APIs for developers, businesses, and researchers. Learn about their features, use cases, and access requirements.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-tiktok-api)

# What is HTTP 401 Error and How to Fix it

> Learn about HTTP 401 status code meaning, causes, and solutions in this comprehensive guide. How to handle 401 unauthorized errors effectively

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-401-error-and-how-to-fix-it)

# JSONL vs JSON

> Learn the differences between JSON and JSONLines, their use cases, and efficiency. Why JSONLines excels in web scraping and real-time processing

## Docs

- [Read article](https://scrapfly.io/blog/posts/jsonl-vs-json)

# Guide to Local LLMs

> Explore Local LLMs for secure, efficient AI solutions.
> Learn about top open-source models, hardware needs, and building advanced AI applications.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-local-llm)

# Guide to SeleniumBase — A Better & Easier Selenium

> Learn about SeleniumBase browser automation with simple syntax, cross-browser support, and robust features, perfect for testing and web scraping.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-seleniumbase-better-selenium)

# Web Scraping and HTML Parsing with Jsoup and Java

> Master web scraping with jsoup, a Java library for scraping and parsing HTML. Learn how to extract and manipulate data and handle limitations.

## Docs

- [Read article](https://scrapfly.io/blog/posts/web-scraping-java-jsoup-html-parsing)

# Guide to PHP 8.4 new DOM Selector Feature

> Learn about PHP 8.4’s new DOM Selector feature. Simplify DOM manipulation using intuitive CSS selectors for cleaner, more efficient code.

## Docs

- [Read article](https://scrapfly.io/blog/posts/php-84-new-dom-selector)

# How to Ignore cURL SSL Errors

> Learn to handle SSL errors in cURL, including self-signed certificates and ignore options. Explore common issues and safe cURL practices.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-curl-ignore-ssl-errors)

# Comprehensive Guide to OkHttp for Java and Kotlin

> Discover OkHttp, a powerful HTTP client for Java and Kotlin. Explore its features, setup, advanced usage, and solutions to common errors.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-okhttp-java-kotlin)

# Instant Data Scraper Guide - Web Scraping with No Code

> Instant Data Scraping guide: Learn to effortlessly scrape web data without coding using tools like Google Sheets, Make.com, Scrapfly, and more.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-make-instant-data-scraper)

# What is HTTP 407 Status Code and How to Fix it

> Learn everything about the HTTP 407 Proxy Authentication Required error.
> Learn how proxies are authenticated and which common proxy auth pitfalls to avoid.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-407-status-code-and-how-to-fix-it)

# Everything to Know to Start Web Scraping in Python Today

> Ultimate modern intro to web scraping using Python. How to scrape data using HTTP or headless browsers, parse it using AI, and scale and deploy.

## Docs

- [Read article](https://scrapfly.io/blog/posts/everything-to-know-about-web-scraping-python)

# Guide to Cloudflare's Error Code 520 and How to Fix it

> Learn about Cloudflare's infamous error code 520: what it means and how it can be addressed from the client or server side.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-error-code-520-cloudflare-and-fixes)

# What is HTTP 499 Status Code and How to Fix it?

> The 499 status code, unique to Nginx, signals client-side request cancellations. It can be mitigated with retry mechanisms and proper timeouts.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-499-status-code-client-closed-request)

# Guide to Google News API and Alternatives

> Discover how to access Google News after the discontinuation of the Google News API. Explore alternative APIs for scraping insights from news.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-google-news-api-and-alternatives)

# How to Use cURL to Download Files

> Learn how to use curl for file downloads with features like resume, authentication, proxies, and more, plus tools for bypassing restrictions.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-curl-download-file)

# Guide to SSL Errors: What do they mean and how to fix them

> Guide to SSL error issues: how to resolve SSL connection errors on your server, with the meaning and resolution of each SSL error

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-ssl-error-meaning-and-fixes)

# What is Error 1015 (Cloudflare) and How to Fix it?
> An in-depth exploration of error 1015 to understand why you're being rate-limited by Cloudflare and discover effective strategies to resolve it.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-cloudflare-1015-error-and-how-to-fix-it)

# Guide to Google Finance API and Alternatives

> Learn about Google Finance data, the officially discontinued Google Finance API, Google Finance data alternatives, and secret access.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-google-finance-api)

# Guide to LinkedIn API and Alternatives

> Explore the LinkedIn API in this comprehensive guide: how to apply for the LinkedIn API, which APIs and data points are available, and what the alternatives are.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-linkedin-api-and-alternatives)

# What HTTP Error 412 Precondition Failed and How to Fix it?

> Discover the meaning of the HTTP 412 Precondition Failed error, its common causes in development and scraping, and learn how to troubleshoot it.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-412-error-precondition-failed)

# JSON vs XML: Key Differences and Modern Uses

> JSON and XML are two major data formats encountered in web development. Here's how they differ and which one is better for your use case.

## Docs

- [Read article](https://scrapfly.io/blog/posts/json-vs-xml)

# Guide to Yahoo Finance API

> Learn about Yahoo Finance data, what it is used for, and the potential for a web API through techniques like data scraping and API servers.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-yahoo-finance-api)

# How to Use Yelp API to Extract Business and Review Data

> Take an extensive look into Yelp API, its key features, pricing, and limitations. How to start with Yelp API and alternatives for Yelp data.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-yelp-api)

# HTTP Error 503 Service Unavailable and How to Fix it?
> Learn about HTTP 503 errors, their common causes, potential signs of blocking, and practical ways to handle them effectively.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-error-503-service-unavailable)

# In-Depth Guide to the Walmart API

> Discover Walmart's robust API ecosystem, designed to streamline operations for sellers, suppliers, and partners and how to access these tools.

## Docs

- [Read article](https://scrapfly.io/blog/posts/guide-to-walmart-api)

# What is Charles Proxy and How to Use it?

> Learn about one of the most popular web debugging proxies, Charles Proxy, and how to use it to intercept and analyze web traffic and web data.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-charles-proxy-and-how-to-use-it)

# Guide to Python requests POST method

> Discover how to use Python's requests library for POST requests, including JSON, form data, and file uploads, along with response handling tips.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-python-requests-post)

# What is HTTP Error 429 Too Many Request and How to Fix it

> Learn about HTTP status code 429 which is all about request throttling or distribution. See how to fix it with proxy and fingerprint rotation.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-error-429-too-many-requests)

# Axios vs Fetch: Which HTTP Client to Choose in JS?

> Discover the key differences between Fetch and Axios for making HTTP requests in JavaScript, determining which best suits your project’s needs.

## Docs

- [Read article](https://scrapfly.io/blog/posts/axios-vs-fetch)

# Guide to Python Requests Headers

> Learn how to use Python Requests headers to customize HTTP requests and handle responses effectively in your API and web scraping applications.
## Docs

- [Read article](https://scrapfly.io/blog/posts/python-requests-headers-guide)

# What is Status Code 403 Forbidden and How to Fix it

> Learn about the causes of the HTTP 403 Forbidden error, why it's being returned, how to replicate it on a server, and how to fix it in your HTTP client

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-status-code-403-forbidden-how-to-fix-it)

# cURL vs Wget: Key Differences Explained

> Discover the key differences between curl and wget and learn which command-line tool suits your use case, like scraping, downloading or API testing

## Docs

- [Read article](https://scrapfly.io/blog/posts/curl-vs-wget)

# What is HTTP 415 Error? (Unsupported Media Type)

> 415 Unsupported Media Type error is usually caused by a misconfigured Content-Type header. Learn how to fix the 415 HTTP code with correct requests

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-415-error-unsupported-media-type)

# What is HTTP 422 Error? (Unprocessable Entity)

> 422 Unprocessable Entity error is usually caused by a semantically invalid request. Learn HTTP error 422 causes and how to fix your requests.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-422-error-unprocessable-entity)

# What is HTTP 409 Error? (Conflict)

> 409 Conflict error is usually caused by conflicts between request data and the current state of a resource. Learn how to prevent 409 HTTP errors.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-409-status-code-conflict)

# What is HTTP 413 Error? (Payload Too Large)

> Learn about HTTP 413, the error status code indicating that a POST or PUT request contains too much data. See how to fix it or bypass code 413.

## Docs

- [Read article](https://scrapfly.io/blog/posts/http-error-413-payload-too-large)

# Playwright vs Selenium

> Explore the key differences between Playwright and Selenium headless browser automation libraries.
> Which one is better for scraping and testing?

## Docs

- [Read article](https://scrapfly.io/blog/posts/playwright-vs-selenium)

# What is HTTP 406 Error? (Not Acceptable)

> 406 Not Acceptable error is usually caused by a misconfigured Accept type header. Learn how to prevent the 406 HTTP code with correct request config

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-error-406-not-acceptable)

# What is HTTP 405 Error? (Method Not Allowed)

> HTTP 405 Method Not Allowed is a frequently encountered status code. Learn what causes this error and how to prevent it in popular HTTP clients

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-http-405-error)

# What is Parsing? From Raw Data to Insights

> Learn what parsing is, explore Python JSON, HTML, and PDF parsers, and discover how to parse data efficiently using modern tools and techniques.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-parsing-turning-data-into-insights)

# Concurrency vs Parallelism

> Comparison of parallelism versus concurrency in programming. What's the difference and how and when to use each technique to scale up programs.

## Docs

- [Read article](https://scrapfly.io/blog/posts/concurrency-vs-parallelism)

# How to Use cURL GET Requests

> Complete intro to cURL GET requests with cURL headers and cURL authentication. How to use cURL to retrieve any page and handle URL params

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-use-curl-get-requests)

# What is CreepJS Browser Fingerprint and How to Bypass It

> See how CreepJS generates its browser fingerprint test and how to create an anti-fingerprint browser of your own through browser fingerprint spoofing.

## Docs

- [Read article](https://scrapfly.io/blog/posts/browser-fingerprinting-with-creepjs)

# How to Track Web Page Changes with Automated Screenshots

> In-depth project on creating a screenshot monitoring tool for web page change detection.
> Compare page screenshots and detect website changes.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-track-web-page-changes-using-automated-screenshots)

# What is a Headless Browser? Top 5 Headless Browser Tools

> Overview of automation using real web browsers through top headless browser tools. Turn real browsers into web automation machines and scrapers.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-a-headless-browser-top-5-headless-browser-tools)

# How to take screenshots in NodeJS?

> Learn how to screenshot in Node.js using Playwright & Puppeteer. Includes installation, concepts, and customization tips.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-take-screenshots-nodejs)

# How To Take Screenshots In Python?

> Learn how to take Python screenshots through Selenium and Playwright, including common browser tips and tricks for customizing web page captures.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-take-screenshots-in-python)

# What is the best Screenshot API in 2025?

> Learn everything about the best screenshot API, from the features to consider to a list of the best services available and how to benchmark them.

## Docs

- [Read article](https://scrapfly.io/blog/posts/what-is-the-best-screenshot-api)

# Web Scraping with Go

> Learn web scraping with Golang, from native HTTP requests and HTML parsing to a step-by-step guide to using Colly, the Go web crawling package.

## Docs

- [Read article](https://scrapfly.io/blog/posts/web-scraping-with-go)

# How to Power-Up LLMs with Web Scraping and RAG

> How to use LLM and web scraping for RAG applications using either LlamaIndex or LangChain. In-depth step-by-step Python tutorial.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-use-web-scaping-for-rag-applications)

# Web Scraping With Cloud Browsers

> Intro to cloud browsers, their benefits, a step-by-step setup with self-hosted Selenium-grid cloud browsers.
> How to bypass cloud browser blocking

## Docs

- [Read article](https://scrapfly.io/blog/posts/web-scraping-with-cloud-browsers)

# How to Scrape Forms

> Introduction to web scraping forms through a step-by-step guide using HTTP clients and headless browsers like Playwright, Selenium and Puppeteer.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-forms)

# How to Build Minimum Advertised Price (MAP) Monitoring Tool

> Tutorial on minimum advertised price monitoring through web scraping using Python and free tools for collecting price data and its analysis.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-build-minimum-advertised-price-monitoring-tool)

# How to Scrape Reddit Posts, Subreddits and Profiles

> Learn how to scrape Reddit for social data types from subreddits, posts, and user pages using plain HTTP requests and bypass scraper blocking.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-reddit-social-data)

# How to Scrape With Headless Firefox

> Tutorial on how to use headless Firefox with Selenium, Playwright, and Puppeteer for web scraping, including practical examples for each library.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-with-headless-firefox)

# How to use CSS Selectors in Nim ?

> How to parse HTML using CSS selectors in the Nim programming language using either the CSS3Selectors or nimquery libraries.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-use-css-selectors-in-nim)

# How to Use Tor For Web Scraping

> Learn about web scraping using Tor as a proxy and rotating proxy server by randomly changing the IP address with HTTP or SOCKS.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-use-tor-for-web-scraping)

# How to Know What Anti-Bot Service a Website is Using?

> Learn how to use WhatWaf and Wafw00f to identify which WAF service is used on a website and how to avoid these services' detection when scraping.
## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-know-what-anti-bot-website-uses)

# How to Scrape LinkedIn in 2025

> LinkedIn web scraping tutorial. How to scrape LinkedIn people profiles, company profiles, job listings and job search using Python for free.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-linkedin-person-profile-company-job-data)

# Selenium Wire Tutorial: Intercept Background Requests

> Learn web scraping with Selenium Wire. We'll define what it is and how to install and use it to inspect and manipulate background requests.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-intercept-background-requests-with-selenium-wire)

# How to Scrape SimilarWeb Website Traffic Analytics

> Learn how to scrape SimilarWeb. We'll scrape comprehensive domain traffic insights, website comparison data, sitemaps, and trending domains.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-similarweb)

# How to Scrape BestBuy Product, Offer and Review Data

> Learn how to scrape BestBuy for different data types from product, search, review, and sitemap pages using different web scraping techniques.

## Docs

- [Read article](https://scrapfly.io/blog/posts/how-to-scrape-bestbuy-product-offer-and-review-data)

# Sending HTTP Requests With Curlie: A better cURL

> Explore Curlie, a better version of cURL. You will learn how to use and configure it for sending HTTP requests through a step-by-step guide.

## Docs

- [Read article](https://scrapfly.io/blog/posts/sending-http-requests-with-curlie-a-better-curl)

# How to Solve the cURL (60) Error When Using Proxy?

> The cURL (60) error is a common error encountered when using proxies with cURL. Learn the exact cause of this error and how to solve it.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-solve-the-curl-60-error-when-proxy)

# How To Use Proxy With cURL?
> Proxies are essential for avoiding IP address blocking and accessing web pages restricted to a specific location. Learn how to use proxies with cURL.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-use-proxy-with-curl)

# What is The cURL (28) Error, Couldn't connect to server?

> The cURL (28) error indicates a proxy connection error. This error arises when the cURL request can't connect to the proxy server.

## Docs

- [Read article](https://scrapfly.io/blog/answers/what-is-the-curl-28-error)

# How To Download a File With cURL?

> cURL allows for downloading binary files using the cURL -O option. Here's how to use it effectively, along with common errors related to file downloads.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-download-file-curl)

# How to Follow Redirects In cURL?

> Redirects are caused by HTTP pages moving to a different location. They can be handled automatically or explicitly - here's how to do it in cURL.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-follow-redirects-in-curl)

# How To Send cURL POST Requests?

> POST requests send data to the web server and are a popular HTTP method for web interactions like search. Here's how to POST in cURL.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-send-a-post-request-using-curl)

# How to Send a HEAD Request With cURL?

> The HEAD HTTP method is used to gather information and metadata about a specific resource. Learn how to send HEAD requests with cURL.

## Docs

- [Read article](https://scrapfly.io/blog/answers/how-to-send-curl-head-requests)

# How To Send Multiple cURL Requests in Parallel?

> To send requests in parallel using the cURL command line client, the -Z or --parallel option can be used and combined with other config options.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-send-multiple-curl-requests-in-parallel) # How to Set cURL Authentication - Full Examples Guide > Learn how to set basic authentication, bearer tokens, and cookie authentication with cURL through a step-by-step guide. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-set-authorization-with-curl-full-examples-guide) # How to Use cURL Config Files? > cURL can be configured using config.txt files which can define each cURL option. Then, the "-K" option can be used to provide your config. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-set-curl-config-file) # How to Set User Agent With cURL? > The User-Agent header is one of the essential headers which identifies the request sender's device. Learn how to set User-Agent with cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-set-curl-user-agent) # How to Use cURL For Web Scraping > Learn how to send HTTP requests with cURL and how to use cURL for web scraping, such as scraping dynamic pages and avoiding getting blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-use-curl-for-web-scraping) # How To Scrape TikTok in 2025 > Learn how to scrape TikTok for profiles, posts, comments and search data through hidden TikTok APIs or hidden JSON datasets. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-tiktok-python-json) # How to Copy as cURL With Brave? > Brave allows for capturing HTTP requests on web pages. Learn how to use Brave's developer tools to copy the requests as cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-copy-as-curl-with-brave) # How To Copy as cURL With Google Chrome? > Google Chrome allows for capturing HTTP requests on web pages. Learn how to use Chrome's developer tools to copy the requests as cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-copy-as-curl-with-chrome) # How to Copy as cURL With Edge?
> Edge allows for capturing HTTP requests on web pages. Learn how to use Edge's developer tools to copy requests as cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-copy-as-curl-with-edge) # How to Copy as cURL With Firefox? > Firefox allows for capturing HTTP requests on web pages. Learn how to use Firefox's developer tools to copy the requests as cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-copy-as-curl-with-firefox) # How to Copy as cURL With Safari? > Safari allows for capturing HTTP requests on web pages. Learn how to use Safari's developer tools to copy requests as cURL. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-copy-as-curl-with-safari) # Web Scraping Dynamic Websites With Scrapy Playwright > Learn how to web scrape dynamic web pages with Scrapy Playwright through an example project and how to use it for common web scraping use cases. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-dynamic-websites-with-scrapy-playwright) # Web Scraping Dynamic Web Pages With Scrapy Selenium > Learn how to scrape dynamic web pages with Scrapy Selenium and how to use it for waiting for elements, clicking buttons and scrolling. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-dynamic-web-pages-with-scrapy-selenium) # Scrapy Splash Guide: Scrape Dynamic Websites With Scrapy > tutorial on scraping dynamic web pages with Scrapy Splash. Learn installation, navigation and step-by-step guide for using Scrapy Splash. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-scrapy-splash) # How to Track Competitor Prices Using Web Scraping > Learn how to create a tool for tracking competitor prices using Python by scraping products from different providers and comparing their prices. 
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-track-competitor-pricing-using-web-scraping) # Intro to Using Web Scraping For Sentiment Analysis > Learn how to perform sentiment analysis with web scraping using TextBlob and Huggingface on web-scraped data through a step-by-step guide. ## Docs - [Read article](https://scrapfly.io/blog/posts/intro-to-using-web-scraping-for-sentiment-analysis) # Using API Clients For Web Scraping: Postman > Learn to use API clients for web scraping. Locate hidden requests, manipulate them with Postman, and build efficient API-based web scrapers. ## Docs - [Read article](https://scrapfly.io/blog/posts/using-api-clients-for-web-scraping-postman) # Intro to Parsing HTML and XML with Python and lxml > Learn what lxml is, how to install it and how to use it for parsing HTML and XML files. Includes an lxml example of web scraping and extracting product data. ## Docs - [Read article](https://scrapfly.io/blog/posts/intro-to-parsing-html-xml-python-lxml) # Use Curl Impersonate to scrape as Chrome or Firefox > Learn what Curl Impersonate is, how it works, how to install and use it. Finally, we'll explore using it with Python to avoid scraping blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/curl-impersonate-scrape-chrome-firefox-tls-http2-fingerprint) # FlareSolverr Guide: Bypass Cloudflare While Scraping > Explore the FlareSolverr tool for bypassing Cloudflare. We'll start by explaining what FlareSolverr is, how it works, how to install and use it. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-cloudflare-with-flaresolverr) # Web Scraping with CloudProxy > Introduction to using the CloudProxy self-hosted proxy service in web scraping. How to use DigitalOcean, AWS, Google Cloud and other IPs as proxies.
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-use-cloudproxy-for-web-scraping) # How to use Headless Chrome Extensions for Web Scraping > Learn how to use browser extensions with headless browser libraries. You will also learn about useful Chrome extensions for web scraping. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-use-browser-extensions-with-playwright-puppeteer-and-selenium) # How to Use Cache In Web Scraping for Major Performance Boost > Intro to web scraping caches. How caching can significantly reduce scraping costs and drastically improve performance, with a Redis cache example. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-use-cache-in-web-scraping) # How to Parse XML > How to parse XML using CSS selectors, XPath and language native tools in Python, PHP, JavaScript and other languages. Complete XML parsing guide. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-parse-xml) # How to Build a Price Tracker Using Python > Learn how to create a price scraper using Python. It will crawl over pages, extract product data and record historical price changes for tracking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-build-a-price-tracker-using-python-web-scraping) # How to Scrape Bing Search with Python > Learn how to scrape Bing using Python. You will also learn how to overcome its scraping challenges, such as the complex HTML structure and blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-bing-search-using-python) # How to Bypass CAPTCHA While Web Scraping in 2025 > How to bypass CAPTCHA by improving and securing connection details. How to avoid CAPTCHA while web scraping and what the different CAPTCHA types are. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-captcha-while-web-scraping-in-2024) # How to Bypass Kasada Anti-Bot When Web Scraping in 2025 > Learn what Kasada is and how it's used to block bots such as web scrapers.
You will also learn how to bypass Kasada blocking while scraping. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-kasada-anti-scraping-waf) # How to Scrape G2 Company Data and Reviews > Tutorial on how to scrape G2.com using Python. Scrape reviews, company data, search pages, product data and alternatives without being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-g2-company-data-and-reviews) # What are Honeypots and How to Avoid Them in Web Scraping > Introduction to web honeypots, their types and functions, how they are used to identify and block web scrapers and bots, and how to avoid them. ## Docs - [Read article](https://scrapfly.io/blog/posts/what-are-honeypots-and-how-to-avoid-them) # How to Scrape Etsy.com Product, Shop and Search Data > Tutorial on how to scrape Etsy data from product, shop and search pages using Python. How to bypass Etsy blocking and parse HTML and JSON data. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-etsy-com-product-review-data) # How to Hide Your IP Address > Learn about IP addresses, what they are and why to hide them. We'll also explore four essential alternatives you can use to hide your IP address. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-hide-your-ip-address-while-scraping) # How to Scrape Trustpilot.com Reviews and Company Data > Learn how to scrape trustpilot.com company details and reviews without getting blocked. You will also learn to use the trustpilot.com private API. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-trustpilot-com-reviews) # Web Scraping to Google Sheets > Learn about web scraping to Google Sheets, how to access and store data on Google Sheets using Python, in a real-life project example.
## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-to-google-sheets) # How to Scrape Domain.com.au Real Estate Property Data > Tutorial for creating a Python scraper that collects domain.com.au real estate property data. How to find real estate properties and scrape them. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-domain-com-au-real-estate-property-data) # How to Scrape Realestate.com.au Property Listing Data > Learn how to scrape realestate.com.au for real estate data from property and search pages and how to avoid realestate.com.au scraping blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-realestate-com-au-property-listing-data) # How to Scrape Immowelt.de Real Estate Data > Tutorial for scraping immowelt.de. Step by step guide for creating Immowelt web scraper without the need of an Immowelt API for free. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-immowelt-de-real-estate-properties) # How to Scrape Homegate.ch Real Estate Property Data > Tutorial for scraping Homegate.ch real estate property website from Switzerland. Homegate scraper example code using Python for free. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-homegate-ch-real-estate-property-data) # How to Scrape Immobilienscout24.de Real Estate Data > How to scrape immobilienscout24.de without blocking for real estate data in Germany. Web scraping guide for Python through HTTP and HTML parsing. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-immobillienscout24-real-estate-property-data) # How to Scrape Immoscout24.ch Real Estate Property Data > Tutorial for web scraping immoscout24 using Python and private API and hidden JSON data techniques.
How to bypass blocking and scrape real estate ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-immoscout24-ch-real-estate-property-data) # How to Handle Cookies in Web Scraping > Tutorial for cookies in web scraping. What they are and how to take advantage of the cookie process to authenticate or set website preferences. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-handle-cookies-in-web-scraping) # How to Scrape Seloger.com - Real Estate Listing Data > Tutorial on seloger.com web scraping and how to avoid scraper blocking. Use Python to scrape seloger.com real estate property listings and search ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-seloger-com-listing-real-estate-ads) # How to Web Scrape Leboncoin.fr using Python > How to web scrape leboncoin.fr using Python. Scraping leboncoin ad listing search and individual ad listings using Python without leboncoin API. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-leboncoin-marketplace-real-estate) # Intro to Web Scraping Using Selenium Grid > Intro to using Selenium Grid server for web scraping. How to set it up using Docker and scrape with it concurrently. Real-life example and FAQ. ## Docs - [Read article](https://scrapfly.io/blog/posts/intro-to-web-scraping-using-selenium-grid) # How to Scrape Hidden APIs > Tutorial on web scraping hidden APIs. How dynamic websites load content through background requests and how to see and replicate them in Python. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-hidden-apis) # Web Scraping Without Blocking With Undetected ChromeDriver > Intro to Undetected ChromeDriver - a Selenium extension for bypassing many scraper blocking extensions. Hands-on tutorial and real-life example. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-without-blocking-using-undetected-chromedriver) # Web Scraping Emails using Python > How to create an email scraping tool using Python.
Intro to email address crawling and how to solve common challenges like email obfuscation. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-emails-using-python) # Web Scraping Phone Numbers with Python > Deep dive into phone number crawling. We'll explore an example object and cover common phone number scraping challenges like obfuscation. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-phone-numbers-with-python) # How to Scrape Google Trends using Python > Intro to web scraping Google Trends data using Python. What is Google Trends and what makes it a valuable web scraping target using Python. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-google-trends) # How to Avoid Scraper Blocking when Scraping Images > Introduction to scraper blocking when it comes to image scraping. What are some popular scraper blocking techniques and how to avoid them. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-avoid-blocking-scraping-images) # Intro to Web Scraping Images with Python > Image web scraping tutorial with Python. How to scrape images using python and common challenges like hidden image data and dynamic js images. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-web-scrape-images-from-websites-python) # How to Scrape Google SEO Keyword Data and Rankings > Learn how to use SEO web scraping for SEO keyword optimization. You will also learn how to scrape Google search rankings and suggested keywords. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-google-seo-keywords) # Ultimate XPath Cheatsheet for HTML Parsing in Web Scraping > Complete cheatsheet for all XPath selector functions for HTML parsing in web scraping with real-life interactive examples and explanations. ## Docs - [Read article](https://scrapfly.io/blog/posts/xpath-cheatsheet) # How to Effectively Use User Agents for Web Scraping > A guide on using User-Agent headers for web scraping. 
How to set and rotate user agent headers in web scraping to avoid web scraping blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/user-agent-header-in-web-scraping) # How to Observe E-Commerce Trends using Web Scraping > Web scraping project for scraping e-commerce data and observing market trends using visualization graphs and plots for free. ## Docs - [Read article](https://scrapfly.io/blog/posts/observing-ecommerce-market-trends-with-web-scraping) # Ultimate CSS Selector Cheatsheet for HTML Parsing > Complete cheatsheet for all CSS selector functions for HTML parsing in web scraping with real-life interactive examples and explanations. ## Docs - [Read article](https://scrapfly.io/blog/posts/css-selector-cheatsheet) # How to Scrape in Another Language, Currency or Location > In-depth look at scraping websites in specific languages and currencies using Playwright or HTTPX and Python through example projects. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-in-another-language-or-currency) # JSON Parsing Made Easy with ChatGPT in Web Scraping > Intro to reshaping web scraped JSON datasets into a cleaner, easier format using ChatGPT, the JMESPath JSON parsing language and Python. ## Docs - [Read article](https://scrapfly.io/blog/posts/refining-json-datasets-with-chatgpt-web-scraping) # Complete Guide to Web Scraping using Typescript > Complete guide for web scraping using Typescript and Javascript. How to use axios and cheerio to scrape any page and solve common scraping challenges ## Docs - [Read article](https://scrapfly.io/blog/posts/ultimate-intro-to-web-scraping-with-typescript) # Finding Hidden Web Data with ChatGPT Web Scraping > Learn how to scrape hidden web data using ChatGPT. Prompt LLMs for extracting data hidden in page HTML, such as data embedded in JavaScript.
## Docs - [Read article](https://scrapfly.io/blog/posts/finding-hidden-web-data-with-chatgpt) # How to edit Local Storage data using browser Devtools > To edit Local Storage, use the browser's developer tools: Application tab -> Storage -> Local Storage, where each value is represented in key-value format. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-edit-local-storage-using-devtools) # How to scrape HTML table to Excel Spreadsheet (.xlsx)? > To scrape tables to an Excel spreadsheet we can use the bs4, requests and xlsxwriter packages for Python. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/html-table-to-xlsx-python-beautifulsoup) # Python httpx vs requests vs aiohttp - key differences > These 3 popular HTTP client packages have different strengths. Here's how to choose the right fit. ## Docs - [Read article](https://scrapfly.io/blog/answers/httpx-vs-requests-vs-aiohttp) # Mobile vs Residential Proxies - which to choose for scraping? > For web scraping, mobile and residential proxies are the best options, though they fill different niches. Here's how to choose. ## Docs - [Read article](https://scrapfly.io/blog/answers/mobile-vs-residential-proxies-whats-the-difference) # What are private proxies and how are they used in scraping? > Private proxies mean the proxy is owned by a single user (opposite to shared proxies) which can significantly improve scraping performance. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-private-proxies-compared-to-shared) # What are some PhantomJS alternatives for automating browsers? > PhantomJS is a popular web browser control and automation tool - here are 3 better modern alternatives. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-some-phantomjs-alternatives) # What are SOCKS5 proxies and how they compare to HTTP proxies? > SOCKS5 is the latest protocol version of the SOCKS network routing protocol. Here's how it differs from HTTP.
## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-socks5-proxies-in-web-scraping) # What case should HTTP headers be in? Lowercase or Pascal-Case? > HTTP header names can be either in lowercase or Pascal-Case and it's important to choose the right case to prevent scraper blocking. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-case-should-http-headers-be) # What Python libraries support HTTP2? > HTTP2 is still a relatively new protocol version that is not yet widely supported. Here are the options for HTTP2 clients in Python. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-python-libraries-support-http2) # Find Web Elements with ChatGPT and XPath or CSS selectors > Tutorial on parsing HTML with ChatGPT, finding XPath and CSS selectors using ChatGPT. We'll do web scraping with ChatGPT and BeautifulSoup. ## Docs - [Read article](https://scrapfly.io/blog/posts/finding-web-selectors-with-chatgpt) # Crafting Web Scrapers using ChatGPT Code Interpreter is Easy > The new ChatGPT code interpreter feature is an ideal assistant for crafting web scrapers. Here's how it can be used to help with HTML parsing. ## Docs - [Read article](https://scrapfly.io/blog/posts/parsing-html-with-chatgpt-code-interpreter) # How to scrape Local Storage using Headless Browsers > Introduction to scraping local storage - a key value store available in all browsers and used in many modern SPAs - all using headless browsers like Playwright. ## Docs - [Read article](https://scrapfly.io/blog/posts/what-is-local-storage-and-how-to-scrape-it) # How to handle popup dialogs in Playwright? > To handle alert-type pop-ups in Playwright the "dialog" event can be captured and interacted with in both Python and NodeJS Playwright clients ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-alert-dialog-in-playwright) # How to handle popup dialogs in Puppeteer?
> To click on a popup dialog in Puppeteer the dialog event can be captured and interacted with using the page.on("dialog") method. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-alert-dialog-in-puppeteer) # How to handle popup dialogs in Selenium? > To click on a pop-up alert using Selenium the alert_is_present method can be used to wait for and interact with alerts. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-alert-dialog-in-selenium) # How to click on cookie popups and modal alerts in Playwright? > To click on modal popups like the infamous cookie consent alert we can either find and click the agree button or remove it entirely. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-modal-alerts-like-cookie-pop-up-in-playwright) # How to click on cookie popups and modal alerts in Puppeteer? > To handle modal popups like cookie consents in Puppeteer the popup can be closed through a button click or removed entirely. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-modal-alerts-like-cookie-pop-up-in-puppeteer) # How to click on cookie popups and modal alerts in Selenium? > To click on modal alerts like cookie popups in Selenium we can either find the button and click it or remove the modal elements. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-click-on-modal-alerts-like-cookie-pop-up-in-selenium) # How to edit cookies in Chrome devtools? > To edit cookies in Chrome's devtools suite the application->cookies section can be used. Here's how.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-edit-cookies-using-chrome-devtools) # How to scrape Threads by Meta using Python (2025 Update) > Guide how to scrape Threads - new social media network by Meta and Instagram - using Python and popular libraries like Playwright and XHR capture ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-threads) # Web Scraping Background Requests with Headless Browsers > Intro to web scraping background requests of dynamic websites using a headless browser and request/response capture with Python and Playwright ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-background-requests-with-headless-browsers-and-python) # How to block resources in Selenium and Python? > To block http resources in selenium we need an external proxy. Here's how to setup mitmproxy to block requests and responses in Selenium. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-block-resources-in-selenium) # How to capture background requests and responses in Selenium? > To capture background requests and response selenium needs to be extended with Selenium-wire. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-capture-xhr-requests-selenium) # How to install mitmproxy certificate on Chrome and Chromium? > Here are 5 easy steps to install SSL certificates to enable HTTPS traffic capture in mitmproxy tool used for intercepting and analyzing HTTP. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-install-mitmproxy-certificate) # How to scroll to the bottom of the page with Playwright? > Learn how to scroll to the bottom of the page with Playwright using three distinct approaches for both Python and NodeJS clients. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-scroll-to-the-bottom-with-playwright) # How to scroll to the bottom of the page with Puppeteer? 
> To scroll to the very bottom of the page with Puppeteer the javascript evaluation feature can be used within a while loop. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-scroll-to-the-bottom-with-puppeteer) # How to scroll to the bottom of the page with Selenium? > To scroll to the very bottom of the page the javascript evaluation feature can be used within a while loop. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-scroll-to-the-bottom-with-selenium) # How to Parse Datetime Strings with Python and Dateparser > Dateparser is a popular Python package for parsing datetime strings. Here's how it can be used in web scraping and how to avoid common problems. ## Docs - [Read article](https://scrapfly.io/blog/posts/parsing-datetime-strings-with-python-and-dateparser) # Top 10 Web Scraping Packages for Python > These are the 10 most popular and commonly used Python packages in web scraping, covering HTTP connections, browser automation and data validation. ## Docs - [Read article](https://scrapfly.io/blog/posts/top-10-web-scraping-libraries-in-python) # How to use proxies with NodeJS axios? > To use proxies with axios and nodejs the proxy parameter of get and post methods can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-proxies-nodejs-axios) # How to use proxies with PHP Guzzle? > To use proxies with PHP Guzzle library the proxy parameter can be used which mirrors standard configuration patterns of cURL library. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-proxies-php-guzzle) # How to use proxies with Python httpx? > To use proxies with Python's httpx library the proxies parameter can be used for http, https and socks5 proxies. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-proxies-python-httpx) # How to scrape images from a website?
> To scrape all images from a given website, Python with beautifulsoup and httpx can be used. Here's an example. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-scrape-images-from-website) # How to select elements by attribute value in XPath? > To select HTML elements by attribute value the @ syntax can be used together with = or contains() functions. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-attribute-value) # What are scrapy middlewares and how to use them? > Scrapy downloader middlewares can be used to intercept and update outgoing requests and incoming responses. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-scrapy-middlewares-and-how-to-use-them) # How to Web Scrape with HTTPX and Python > Intro to using Python's httpx library for web scraping. Proxy and user agent rotation and common web scraping challenges, tips and tricks. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-python-httpx) # Getting started with Puppeteer Stealth > Puppeteer-stealth is a popular plugin for the Puppeteer browser automation library. It patches browsers to be less detectable. Here's how to get started. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-puppeteer-stealth-what-does-it-do) # What are scrapy pipelines and how to use them? > Scrapy pipelines can be used to extend scraped result data with new fields or validate the whole datasets. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-scrapy-pipelines-and-how-to-use-them) # How to add headers to every or some scrapy requests? > To add headers to scrapy's request the `DEFAULT_REQUEST_HEADERS` setting or a custom request middleware can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-add-headers-to-every-or-some-scrapy-requests) # How to pass custom parameters to scrapy spiders?
> To pass custom parameters to a scrapy spider the CLI argument -a can be used. Here's how and why it is such a useful feature. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-pass-parameters-to-scrapy-spiders-cli) # How to rotate proxies in scrapy spiders? > To rotate proxies in scrapy spiders a request middleware can be used to randomly or smartly select the most viable proxy. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-rotate-proxies-in-scrapy-spiders) # How to use headless browsers with scrapy? > To use a headless browser with scrapy a plugin like scrapy-playwright can be used. Here's how to use it and what some other alternatives are. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-headless-browsers-with-scrapy) # How to pass data between scrapy callbacks in Scrapy? > To pass data between scrapy callbacks when scraping multiple pages the Request.item can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-pass-data-between-scrapy-callbacks) # How to pass data from start_requests to parse callbacks in scrapy? > To pass data between scrapy callbacks like start_request and parse the Request.meta attribute can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-pass-data-from-start-request-to-callbacks-scrapy) # How to select elements by attribute using CSS selectors? > To select elements by attribute the powerful attribute selector can be used which has several selection options. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-attribute-containing-value-css-selectors) # How to select elements by class using CSS selectors? > To select elements by class the .class selector can be used. To select by exact class value the [class="exact value"] can be used instead. Here's how.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-class-css-selectors) # How to select elements by ID using CSS selectors? > To select elements that contain an ID the #id selector can be used. To select elements by exact ID the [id="some value"] can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-id-css-selectors) # How to select following siblings using CSS selectors? > To select following sibling elements using CSS selectors the + and ~ operators can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-following-sibling-element-css-selectors) # Is it possible to select preceding siblings using CSS selectors? > It's not possible to select preceding siblings directly but there are easy alternatives that can be implemented to select preceding siblings. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-preceding-sibling-element-css-selectors) # What are scrapy Item and ItemLoader objects and how to use them? > Scrapy's Item and ItemLoader classes are a great way to structure dataset parsing logic. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-scrapy-items-and-itemloaders) # How to count selections in XPath and why? > To count the number of elements selected by an XPath selector the count() function can be used. Here's how to do it and why it's useful. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-count-selectors-in-xpath-and-why) # How to get the name of an HTML element in XPath? > To find the name of a selected HTML element with XPath the name() function can be used. Here's how and why this is useful. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-get-name-of-selected-element-in-xpath) # How to join values using XPath concat? > To join values in XPath the concat() function can be used to concatenate strings into one string. Here's how.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-join-values-in-xpath) # How to reverse expressions in XPath? > To reverse expressions and predicates in XPath the not() function can be used. Here's how and why it's so useful. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-reverse-expression-in-xpath) # How to select element with one of many names in XPath? > To select an element with name matching one from an array of names the name() method can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-attribute-value-in-xpath) # How to select elements by ID in XPath? > To select elements by ID attribute in XPath we can directly match it using = operator in a predicate or contains() function. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-id-in-xpath) # How to select any element using wildcard in XPath? > To select any element the wildcard "*" axis selector can be used which will select any HTML element of any name within the current context. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-of-any-name-using-wildcards-in-xpath) # How to select elements of a specific position in XPath? > To select elements of a specific position the position() function can be used in a selection predicate. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-of-specific-position-in-xpath) # How to select last element in XPath? > To select last element in XPath we cannot use indexing as -1 index is not supported. Instead, last() function can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-last-element-in-xpath) # How to select sibling elements in XPath? > To select sibling elements in XPath the preceding-sibling and following-sibling axis can be used. Here's how and why it's so useful. 
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-sibling-elements-using-xpath) # How to check if element exists in Playwright? > To check whether an HTML element is present on the page using Playwright the page.locator() method can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-check-for-element-in-playwright) # How to select all elements between two elements in XPath? > To select all elements between two different elements preceding-sibling or following-sibling axis selectors can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-all-elements-between-two-known-elements-in-xpath) # How to select dictionary key recursively in Python? > To select dictionary keys recursively in Python the "nested-lookup" package implements the most popular nested key selection algorithms. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-dictionary-key-recursively-in-python) # What are some ways to parse JSON datasets in Python? > There are several popular options when it comes to JSON dataset parsing in Python. The most popular packages are Jmespath and Jsonpath. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-some-ways-to-parse-json-datasets-in-python) # Stepping into Footwear Market with Web Scraping > Introduction to data analytics with web scraped data. Tracking luxury footwear market using web scraping, python and basic data analytics. ## Docs - [Read article](https://scrapfly.io/blog/posts/stepping-into-footwear-market-with-web-scraping) # What is Asynchronous Web Scraping? > Asynchronous programming is an accessible way to scale around IO blocking which is especially powerful in web scraping. Here's why. 
## Docs - [Read article](https://scrapfly.io/blog/answers/what-is-asynchronous-web-scraping) # How to Scrape Goat.com for Fashion Apparel Data in Python > How to scrape Goat.com for new and second-hand apparel product data using Python and how to avoid blocking when scaling up goat.com crawlers. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-goat-com-fashion-apparel) # How to Scrape Fashionphile for Second Hand Fashion Data > Tutorial on scraping a major fashion e-commerce retailer Fashionphile.com. How to scrape it using Python and avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-fashionphile) # How to Scrape Sitemaps to Discover Scraping Targets > Scraping sitemaps can be an easy way to discover scrape targets and scrape all pages of the website. This tutorial covers how to scrape them. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-sitemaps) # How to Scrape Vestiaire Collective for Fashion Product Data > Guide on how to scrape the biggest 2nd hand luxury fashion marketplace Vestiaire Collective using Python and hidden web data scraping approach. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-vestiairecollective) # How to Scrape Nordstrom Fashion Product Data > Tutorial for web scraping Nordstrom.com using Python, httpx and parsel. Extracting hidden product data and how to parse it using jmespath. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-nordstrom) # What are devtools and how they're used in web scraping? > Developer tools suite is used in web development but can also be used in web scraping to understand how target websites work. Here's how to use it. ## Docs - [Read article](https://scrapfly.io/blog/answers/browser-developer-tools-in-web-scraping) # How to Scrape StockX e-commerce Data with Python > Intro to scraping stockX.com product data using Python for free. 
How to collect stockx data using web scraping and avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-stockx) # How to use cURL in Python? > cURL through libcurl is a popular library used in HTTP connections and can be used with Python through wrapper libraries like pycurl. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-curl-in-python) # What is HTTP cookies role in web scraping? > HTTP cookies play a big role in web scraping. They can be used to configure website preferences and play an important role in scraper detection. ## Docs - [Read article](https://scrapfly.io/blog/answers/http-cookies-in-web-scraping) # HTTP vs HTTPS in web scraping? > HTTPS is a secure version of the HTTP protocol which can complicate the web scraping process in many different ways. Here's what it means. ## Docs - [Read article](https://scrapfly.io/blog/answers/http-vs-https-in-web-scraping) # What is the difference between IPv4 vs IPv6 in web scraping? > IPv4 and IPv6 are two competing Internet Protocol versions that have different advantages when it comes to web scraping. Here's what they are. ## Docs - [Read article](https://scrapfly.io/blog/answers/ipv4-vs-ipv6-in-web-scraping) # How to use VPNs as proxies for web scraping > VPNs can be used as IP proxies in web scraping. Here's how and what to keep an eye on when using this approach. ## Docs - [Read article](https://scrapfly.io/blog/answers/vpn-as-proxies-in-web-scraping) # What is cURL and how is it used in web scraping? > cURL is the most popular HTTP client and library (libcurl) that implements most HTTP features, meaning it's a powerful web scraping tool too. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-is-curl-and-how-is-it-used-in-web-scraping) # What is MITM and how is it used in web scraping? > MITM tools can be used to intercept and modify the HTTP traffic of various applications like web browsers or phone apps in web scraper development.
## Docs - [Read article](https://scrapfly.io/blog/answers/what-is-mitm-proxy-and-how-is-it-used-web-scraping) # How to Bypass Imperva Incapsula when Web Scraping in 2025 > Deep look at how to add Imperva Incapsula bypass to web scrapers. How does it detect web scraping and how to avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-imperva-incapsula-anti-scraping) # How to Bypass Datadome Anti Scraping in 2025 > In-depth look at how to add Datadome bypass to web scrapers. How does it detect web scraping and how to avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-datadome-anti-scraping) # How to Bypass Akamai when Web Scraping in 2025 > Deep look at how to add Akamai Bot Manager bypass to web scrapers. How does it detect web scrapers and best practices for avoiding being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-akamai-anti-scraping) # How to Bypass PerimeterX when Web Scraping in 2025 > In-depth look at how PerimeterX is detecting web scrapers and bots. How to bypass PerimeterX when web scraping and avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-perimeterx-human-anti-scraping) # How to Bypass Cloudflare When Web Scraping in 2025 > Overview of how Cloudflare bot management service blocks web scrapers and how to bypass Cloudflare when web scraping by managing fingerprinting. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-bypass-cloudflare-anti-scraping) # Web Scraping Simplified - Scraping Microformats > Introduction to web scraping microformats - special data markup powered by schema.org. How to scrape microformats in Python and an example project.
## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-microformats) # How to Scrape X.com (Twitter) using Python (2025 Update) > Tutorial for web scraping X.com (Twitter) post and user data using Python, playwright and background request capture technique. Tweet scraping. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-twitter) # How to Scrape RightMove Real Estate Property Data > Tutorial on scraping RightMove.co.uk real estate listing data using Python and popular community packages. How to not get blocked and find data. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-rightmove) # How to Scrape Google Search Results in 2025 > Tutorial on scraping Google search using Python with a few community packages. How to parse dynamic SERP HTML and how to not get blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-google) # Introduction to Parsing JSON with Python JSONPath > Intro to using Python with the JSONPath library and query language for parsing JSON datasets. How to set up jsonpath and real life examples. ## Docs - [Read article](https://scrapfly.io/blog/posts/parse-json-jsonpath-python) # How to Scrape Ebay Using Python (2025 Update) > Tutorial on scraping Ebay.com using Python for free. Hands-on full scraper code, tips and tricks and how to avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-ebay) # How to open Python http responses in a web browser? > To preview Python http responses we can use temporary files and the built-in webbrowser module. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-open-python-responses-in-browser) # How to run Playwright in Jupyter notebooks? > Learn why the synchronous execution of Playwright is blocked on Jupyter notebooks and how to solve it using asyncio.
## Docs - [Read article](https://scrapfly.io/blog/answers/playwright-in-ipython) # How to fix python requests ConnectTimeout error? > Python's ConnectTimeout exception is caused when connection can't be established fast enough. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-exception-connecttimeout) # How to fix Python requests MissingSchema error? > Python "requests.MissingSchema" exception is usually caused by a missing protocol part in the URL. Most commonly when relative URL is used. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-exception-missingschema) # How to fix Python requests ReadTimeout error? > Python requests.ReadTimeout is caused when resources cannot be read fast enough. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-exception-readtimeout) # How to fix Python requests SSLError? > Python's requests.SSLError is caused when encryption certificates mismatch for HTTPS type of URLs. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-exception-sllerror) # How to fix Python requests TooManyRedirects error? > Python's requests.TooManyRedirects exception is raised when server continues to redirect >30 times. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-exception-toomanyredirects) # How to configure Python requests to use a proxy? > Python requests supports many proxy types and options. Here's how to configure most proxy options for web scraping. ## Docs - [Read article](https://scrapfly.io/blog/answers/python-requests-proxy-intro) # Selenium: chromedriver executable needs to be in PATH? > selenium error "chromedriver executable needs to be in PATH" means that chrome driver is not installed or reachable - here's how to fix it. 
## Docs - [Read article](https://scrapfly.io/blog/answers/selenium-chromedriver-in-path) # Selenium: geckodriver executable needs to be in PATH? > selenium error "geckodriver executable needs to be in PATH" means that gecko driver is not installed or reachable - here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/selenium-geckodriver-in-path) # How to Rate Limit Async Requests in Python > Tutorial on how to throttle asynchronous Python requests when web scraping. How to control web scraping speed to avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-rate-limit-asynchronous-python-requests) # Web scraping - what is HTTP 403 status code? > HTTP error 403 means the client is being forbidden from accessing the requested resources. This could mean blocking or invalid request details. ## Docs - [Read article](https://scrapfly.io/blog/answers/403-status-code) # Web scraping - what is HTTP 429 status code? > Response error code 429 means the client is making too many requests in a given time span and should slow down. Here's how to avoid it. ## Docs - [Read article](https://scrapfly.io/blog/answers/429-status-code) # What is 444 status code and how to avoid it? > Response error code 444 means the server has unexpectedly closed connection. This could mean the web scraper is being blocked. ## Docs - [Read article](https://scrapfly.io/blog/answers/444-status-code) # Web scraping - what is HTTP 499 status code? > Response error 499 generally means the server has closed the connection unexpectedly. This could mean the client is being blocked. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/499-status-code) # Web scraping - what is HTTP 503 status code? > Response error 503 generally means the server is temporarily unavailable however it could also mean blocking. Here's how to fix it.
## Docs - [Read article](https://scrapfly.io/blog/answers/503-status-code) # Web scraping - what is HTTP 520 status code? > Response error 520 generally means the server cannot create a valid response. This could also mean the client is being blocked. Here's how to fix it. ## Docs - [Read article](https://scrapfly.io/blog/answers/520-status-code) # What are Cloudflare Errors 1006, 1007, 1008? > Cloudflare is a popular anti web scraping service and errors 1006, 1007 and 1008 are popular web scraping blocking errors. Here's how to avoid them. ## Docs - [Read article](https://scrapfly.io/blog/answers/cloudflare-error-1006-1007-1008-access-denied) # What is Cloudflare Error 1009? > Cloudflare is a popular web scraping blocking service and error 1009 access denied is a popular error for web scraper blocking. Here's how to avoid it. ## Docs - [Read article](https://scrapfly.io/blog/answers/cloudflare-error-1009-access-denied-country-or-region-banned) # What is Cloudflare Error 1010? > Cloudflare is a popular web scraping blocking service and error 1010 access denied is a popular error for web scraper blocking. Here's how to avoid it. ## Docs - [Read article](https://scrapfly.io/blog/answers/cloudflare-error-1010-browser-signature) # What is Cloudflare Error 1015? > Cloudflare is a popular web scraping blocking service and error 1015 "you are being limited" is a popular error for web scraper blocking. ## Docs - [Read article](https://scrapfly.io/blog/answers/cloudflare-error-1015-rate-limited) # What is Cloudflare Error 1020? > Cloudflare error 1020 access denied is a common web error when web scraping caused by Cloudflare anti scraping service. Here's how to avoid it. ## Docs - [Read article](https://scrapfly.io/blog/answers/cloudflare-error-1020-access-denied) # 3 ways to install Python Requests library > Python requests library is a popular HTTP client and here's how to install it using pip, poetry and pipenv.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-install-requests-python) # How to scrape Perimeter X: Please verify you are human? > Perimeter X is a popular anti-scraping protection service - here's how to avoid it when web scraping. ## Docs - [Read article](https://scrapfly.io/blog/answers/perimeterx-verify-press-and-hold) # XPath vs CSS selectors: what's the difference? > CSS selectors and XPath are both path languages for HTML parsing. XPath is more powerful but CSS is more approachable - which one is better? ## Docs - [Read article](https://scrapfly.io/blog/answers/xpath-vs-css-selectors) # How to Scrape Zoopla Real Estate Property Data in Python > Tutorial for scraping Zoopla.com property data: houses for rent, sale, agent contact details and how to crawl search and sitemap pages. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-zoopla) # Quick Intro to Parsing JSON with JMESPath in Python > Tutorial on JMESPath - a JSON query language that is often used in web scraping to parse JSON datasets for scraped data. Plus an example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/parse-json-jmespath-python) # How to save and load cookies in Python requests? > To save session between script runs we can save and load requests session cookies to disk. Here's how to do it in Python requests. ## Docs - [Read article](https://scrapfly.io/blog/answers/save-and-load-cookies-in-requests-python) # How to Scrape Redfin Real Estate Property Data in Python > Tutorial on how to scrape Redfin.com for real estate property data using Python. How to extract property datasets for free and avoid blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-redfin) # How to download a file with Playwright and Python? > To download files using Playwright we can either simulate the button click or extract the url and download it using HTTP. Here's how.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-download-file-with-playwright) # How to get file type of an URL in Python? > There are 2 ways to determine URL file type: guess by url extension using mimetypes module or do a HTTP HEAD request. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-get-url-filetype-in-python) # How to load local files in Playwright? > To load local files as page URLs in Playwright we can use the file:// protocol. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-load-local-files-in-playwright) # How to save and load cookies in Playwright? > To persist playwright connection session between program runs we can save and load cookies to/from disk. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-save-and-load-cookies-in-playwright) # How to take a screenshot with Playwright? > To take page screenshots in playwright we can use page.screenshot() method. Here's how to select areas and how to screenshot them in playwright. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-take-screenshot-with-playwright) # How to Scrape Real Estate Property Data using Python > Intro to web scraping real estate property data with Python. How to scrape popular targets like Zillow, Realtor and dozens of others for free. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-real-estate-property-data-using-python) # How to block image loading in Selenium? > To increase Selenium's performance we can block images. To do that with Chrome browser "prefs" launch option can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-block-image-loading-in-selenium) # Scrapy vs Beautifulsoup - what's the difference? > Scrapy and BeautifulSoup are two popular web scraping libraries though very different.
Scrapy is a framework while BeautifulSoup is an HTML parser. ## Docs - [Read article](https://scrapfly.io/blog/answers/scrapy-vs-beautifulsoup) # How to scroll to an element in Selenium? > In Selenium, the scrollIntoView JavaScript function can be used to scroll to a specific HTML element. Here's how to use it in Selenium. ## Docs - [Read article](https://scrapfly.io/blog/answers/scroll-to-element-selenium) # How to Scrape Idealista.com > Intro to scraping Idealista.com real estate property data using Python. Hands on idealista scraper, parser and how to avoid blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-idealista) # How to Scrape Realtor.com - Real Estate Property Data > Hands on tutorial on scraping Realtor.com using Python. How to scrape property information, pricing and track real-time updates and changes. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-realtorcom) # How to find elements by XPath selectors in Playwright? > To execute XPath selectors in playwright the page.locator() method can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-xpath-in-playwright) # How to block resources in Playwright and Python? > Blocking non-vital resources can drastically speed up Playwright. To do that page interception feature can be used. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-block-resources-in-playwright) # How to capture background requests and responses in Playwright? > To capture background requests and responses in Playwright we can use request/response interception feature through page.on() method. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-capture-xhr-requests-playwright) # How to find elements by CSS selectors in Playwright? > To execute CSS selectors on current HTML data in Playwright the page.locator() method can be used. Here's how.
## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-css-selectors-in-playwright) # How to parse dynamic CSS classes when web scraping? > Dynamic CSS can be very difficult to scrape. There are a few tricks and common idioms to approach this though. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-parse-dynamic-classes) # How to wait for page to load in Playwright? > To wait for all content to load in playwright we can use several different options but page.wait_for_selector() is the most reliable one. Here's how to use it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-wait-for-page-to-load-in-playwright) # How to capture background requests and responses in Puppeteer? > To capture background requests and responses in Puppeteer we can use page.on() method to intercept every request/response. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-capture-xhr-requests-puppeteer) # How to find HTML elements by text with Cheerio and NodeJS? > To find HTML elements by text in NodeJS we can use cheerio library and special ":contains()" selectors. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-text-with-cheerio) # How to ignore non HTML URLs when web crawling? > When web crawling to avoid non-html pages we can test for page extensions or content types using HEAD requests. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-ignore-non-html-urls-when-web-crawling) # How to select HTML elements by text using CSS Selectors? > It's not possible to select HTML elements by text in original CSS selectors specification but here are some alternative ways to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-text-using-css-selectors) # How to turn HTML to text in Python?
> To turn HTML data to text in Python we can use BeautifulSoup's get_text() method which strips away HTML data and leaves text as is. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-turn-html-to-text-in-python) # How to use CSS selectors in NodeJS when web scraping? > There are many ways to execute CSS selectors on HTML text in NodeJS but cheerio and osmosis libraries are the most popular ones. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-css-selectors-in-nodejs) # How to use CSS Selectors in Python? > To parse HTML using CSS selectors in Python we can use either BeautifulSoup or Parsel packages. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-css-selectors-in-python) # How to use XPath selectors in NodeJS when web scraping? > To parse HTML using XPath in Nodejs we can use one of two popular libraries like osmosis or xmldom. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-xpath-selectors-in-nodejs) # How to use XPath selectors in Python? > Python has several options for executing XPath selectors against HTML. The most popular ones are lxml and parsel. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-use-xpath-selectors-in-python) # Scraper doesn't see the data I see in the browser - why? > This means that the scraper is not rendering the JavaScript that is changing the page contents. To verify this, disable JavaScript in your browser. ## Docs - [Read article](https://scrapfly.io/blog/answers/why-cant-scraper-see-content) # How to find elements by CSS selector in Puppeteer? > To find HTML elements using CSS selectors in Puppeteer the $ and $eval methods can be used. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-css-selectors-in-puppeteer) # How to find elements by XPath in Puppeteer?
> To find elements by XPath using Puppeteer the "$x()" method can be used which will execute XPath selection on the current page DOM. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-xpath-in-puppeteer) # How to get page source in Puppeteer? > To retrieve page source in Puppeteer the page.content() method can be used. Here's how to use it and what the possible options are. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-get-page-source-in-puppeteer) # How to load local files in Puppeteer? > To load local files in Puppeteer the file:// protocol can be used as the URL prefix, which will load the file from the given file path. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-load-local-files-in-puppeteer) # How to save and load cookies in Puppeteer? > To save and load cookies in Puppeteer page.setCookies() and page.cookies() methods can be used. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-save-and-load-cookies-in-puppeteer) # How to select elements by class in XPath? > To select HTML elements by class name in XPath we can use the @ attribute selector and comparison function contains(). Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-class-in-xpath) # How to select elements by text in XPath? > To select elements by text using XPath, the contains() function can be used or re:test for selecting based on regular expression patterns. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-elements-by-text-in-xpath) # How to take a screenshot with Puppeteer? > Learn how to take Puppeteer screenshots in NodeJS. You will also learn how to customize it through resolution and viewport customization. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-take-screenshot-with-puppeteer) # How to wait for a page to load in Puppeteer?
> To wait for a page to load in Puppeteer the best approach is to wait for a specific element to appear using page.waitForSelector() method. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-wait-for-page-to-load-in-puppeteer) # How to find elements by CSS selector in Selenium > To select HTML elements by CSS selectors in Selenium the driver.find_element() method can be used with the By.CSS_SELECTOR option. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-css-selectors-in-selenium) # How to find elements by XPath in Selenium > To select HTML elements by XPath selectors in Selenium the driver.find_element() method can be used with the By.XPATH option. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-by-xpath-in-selenium) # How to find elements without a specific attribute in BeautifulSoup? > To find HTML elements that do NOT contain a specific attribute we can use regular expression matching or lambda functions. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-elements-without-attribute-in-beautifulsoup) # How to find HTML elements by multiple tags with BeautifulSoup? > To find HTML elements by one of many different element names we can use list of tags in find() methods or CSS selectors. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-multiple-tags-with-beautifulsoup) # How to find HTML elements by text value with BeautifulSoup > To find HTML elements by text value using Beautifulsoup and Python, regular expression patterns can be used in the text parameter of find functions. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-text-with-beautifulsoup) # How to find sibling HTML nodes using BeautifulSoup and Python?
> To find sibling HTML element nodes using BeautifulSoup the find_next_sibling() method can be used or CSS selector ~. Here's how to do it in Python. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-siblings-nodes-with-beautifulsoup) # How to get page source in Selenium? > To get full web page source in Selenium the driver.page_source property can be used. Here's how to do it in Python and Selenium. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-get-page-source-in-selenium) # How to save and load cookies in Selenium? > To save and load cookies of a Selenium browser we can use driver.get_cookies() and driver.add_cookies() methods. Here's how to use them. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-save-and-load-cookies-in-selenium) # How to select values between two nodes in BeautifulSoup and Python? > To select HTML elements located between two HTML elements using BeautifulSoup the find_next_sibling() method can be used. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-select-values-between-two-elements-in-beautifulsoup) # How to take a screenshot with Selenium? > To take a web page screenshot using Selenium the driver.save_screenshot() method can be used or element.screenshot() for specific element. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-take-screenshot-with-selenium) # How to wait for page to load in Selenium? > To wait for specific HTML element to load in Selenium the WebDriverWait() object can be used with presence_of_element_located parameters. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-wait-for-page-to-load-in-selenium) # Can I use XPath selectors in BeautifulSoup? > BeautifulSoup for Python doesn't support XPath selectors but there are popular alternatives to fill in this niche. Here are some.
## Docs - [Read article](https://scrapfly.io/blog/answers/can-i-use-xpath-selectors-in-beautifulsoup) # How to find all links using BeautifulSoup and Python? > To find all links in the HTML pages using BeautifulSoup and Python the find_all() method can be used. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-all-links-using-beautifulsoup) # How to find HTML elements by attribute using BeautifulSoup? > To find HTML node by a specific attribute value in BeautifulSoup the attribute match parameter can be used in the find() methods. Here's how. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-attribute-with-beautifulsoup) # How to find HTML elements by class? > To find HTML nodes by class name, CSS selectors or XPath can be used: either the .class CSS selector or an XPath match on the @class attribute. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-class) # How to find HTML element by class with BeautifulSoup? > To find HTML node by class name using BeautifulSoup the class match parameter can be used using the find() methods. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-find-html-elements-by-class-with-beautifulsoup) # How to scrape tables with BeautifulSoup? > To scrape HTML tables using BeautifulSoup and Python the find_all() method can be used with common table parsing algorithms. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-scrape-tables-with-beautifulsoup) # What are some BeautifulSoup alternatives in Python? > BeautifulSoup is a popular HTML parsing library for Python. Its most popular alternatives are lxml, parsel and html5lib. Here's how they differ from bs4. ## Docs - [Read article](https://scrapfly.io/blog/answers/what-are-some-beautifulsoup-alternatives) # What's the difference between Web Scraping and Crawling?
> Web Scraping and Web Crawling are similar but not quite the same. Crawling is a form of web scraping and here are some major differences. ## Docs - [Read article](https://scrapfly.io/blog/answers/whats-the-difference-between-scraping-and-crawling) # How to block resources in Puppeteer? > Blocking non-critical resources in Puppeteer can drastically speed up the program. Here's how to do it in Puppeteer and NodeJS. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-block-resources-in-puppeteer) # How to Scrape Hidden Web Data > Introduction to scraping data that is not visible in the HTML of the page. What is hidden web data and how to scrape it using Python. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-hidden-web-data) # How to Ensure Web Scraped Data Quality > Introduction to two tools used in web scraped data quality validation - Cerberus and Pydantic. Why do we need to validate scraped data and how? ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-ensure-web-scrapped-data-quality) # How to download a file with Puppeteer? > To download a file using Puppeteer and NodeJS we can either simulate the click on the download button or use an HTTP client. Here's how to do it. ## Docs - [Read article](https://scrapfly.io/blog/answers/how-to-download-file-with-puppeteer) # How to Turn Web Scrapers into Data APIs > Tutorial on how to create data APIs that scrape data on demand using Python and FastAPI. Real example project, best practices and tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-turn-web-scrapers-into-data-apis) # How to Scrape Glassdoor (2025 update) > Practical Python web scraping tutorial for Glassdoor job listings, company data and reviews, salary information and other public data fields.
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-glassdoor) # Web Scraping with Playwright and Python > Practical introduction to scraping dynamic websites and web apps with Playwright. Real life example project and common idioms and challenges. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-playwright-and-python) # How to Rotate Proxies in Web Scraping > Introduction to proxy rotation in web scraping. What's the best way to rotate proxies for highest success and lowest blocking rates. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-rotate-proxies-in-web-scraping) # Web Scraping Speed: Processes, Threads and Async > How can we speed up web scraping using native Python technologies: processes, threads and asyncio. What's the best way to scale a web scraper? ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-speed) # How to Scrape Indeed.com (2025 Update) > Tutorial on web scraping job listing data from Indeed.com using Python. How to collect and parse recruitment data without being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-indeedcom) # How to Scrape Algolia Search > Tutorial on web scraping Algolia search API which is used by many websites as a content search system. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-algolia-search) # How to Crawl the Web with Python > Introduction to web crawling with Python. What is crawling, how it differs from scraping, deep dive into code and an example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/crawling-with-python) # How to Scrape Zoominfo Company Data (2025 Update) > Practical tutorial on how to web scrape public company and people data from Zoominfo.com using Python and how to avoid being blocked.
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-zoominfo) # How to Scrape Google Maps > Tutorial on how to scrape Google Maps using browsers through Selenium, Playwright or ScrapFly's browser automation toolkits. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-google-maps) # How to Scrape Wellfound Company Data and Job Listings > Tutorial for web scraping Wellfound.com (previously AngelList) company and job data using Python and how to scrape without being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-wellfound-aka-angellist) # How to Scrape Crunchbase in 2025 > Tutorial on how to scrape crunchbase.com business and related data using Python. How to avoid blocking to scrape data at scale and other tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-crunchbase) # How to Scrape YellowPages.com in 2025 > Tutorial on how to scrape yellowpages.com business and review data using Python. How to avoid blocking to scrape data at scale and other tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-yellowpages) # How to Scrape Amazon.com Product Data and Reviews > Tutorial on how to scrape Amazon.com's product and review data using Python and how to avoid blocking and solve common Amazon scraping challenges. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-amazon) # How to Scrape Zillow Real Estate Property Data in Python > Tutorial on how to scrape Zillow.com for real estate property data using Python. How to extract datasets for free and avoid blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-zillow) # How to Scrape TripAdvisor.com (2025 Updated) > Step by step tutorial on how to scrape TripAdvisor.com reviews, hotel pricing and information data as well as other public details. 
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-tripadvisor) # How to Scrape Aliexpress.com (2025 Update) > Tutorial on how to scrape Aliexpress.com product, review and pricing data using Python. How to avoid blocking to scrape at scale and other tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-aliexpress) # Creating Search Engine for any Website using Web Scraping > Guide for creating a search engine for any website using web scraping in Python. How to crawl data, index it and display it via js powered GUI. ## Docs - [Read article](https://scrapfly.io/blog/posts/search-engine-using-web-scraping) # How to Scrape Booking.com (2025 Update) > Tutorial on how to scrape booking.com hotel and pricing data using Python. How to avoid blocking to web scrape data at scale and other tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-bookingcom) # Web Scraping With Node-Unblocker > Tutorial on using Node-Unblocker - a nodejs library - to avoid blocking while web scraping and using it to optimize web scraping stacks. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-node-unblocker) # How to Scrape Instagram in 2025 > Tutorial on how to scrape instagram.com user and post data using pure Python. How to scrape instagram without logging in or being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-instagram) # How to Scrape Walmart.com Product Data (2025 Update) > Tutorial on how to scrape walmart.com product and review data using Python. How to avoid blocking to web scrape data at scale and other tips. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-walmartcom) # How to Scrape Yelp.com (2025 update) > Tutorial on how to scrape yelp.com business and review data using Python. How to avoid blocking to web scrape data at scale and other tips.
## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-yelpcom) # How Headers Are Used to Block Web Scrapers and How to Fix It > Introduction to web scraping headers - what do they mean, how to configure them in web scrapers and how to avoid being blocked. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-avoid-web-scraping-blocking-headers) # How to Avoid Web Scraper IP Blocking? > Introduction to how IP and proxy fingerprinting and analysis is used to block web scrapers and how to avoid these blocking techniques. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-avoid-web-scraping-blocking-ip-addresses) # How Javascript is Used to Block Web Scrapers? In-Depth Guide > Introduction to how javascript is used to detect web scrapers. What's in javascript fingerprint and how to correctly spoof it for web scraping. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-avoid-web-scraping-blocking-javascript) # How TLS Fingerprint is Used to Block Web Scrapers? > How TLS fingerprint (JA3) is being used to block web scrapers. In depth introduction and tips how to avoid this type of blocking. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-avoid-web-scraping-blocking-tls) # 5 Tools to Scrape Without Blocking and How it All Works > Intro to how web scraper blocking works: Javascript, TLS and HTTP fingerprinting, IP Addresses and Proxies and how 5 popular tools can help. ## Docs - [Read article](https://scrapfly.io/blog/posts/how-to-scrape-without-getting-blocked-tutorial) # Web Scraping Graphql with Python > Introduction to web scraping graphql powered websites. How to create graphql queries in python and what are some common challenges. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-graphql-with-python) # Web Scraping with Python > Tutorial on scraping in Python. Intro to HTTP clients and HTML parsing, common scraping challenges, idioms and hands-on example project. 
## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-python) # Web Scraping With R Tutorial and Example Project > Tutorial on web scraping with R language. How to handle http connections, parse html files, best practices, tips and an example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-r) # Best Web Scraping Proxy Providers (2025 Update) > Analysis and comparison of some of the most popular proxy providers. What makes a good proxy provider? What features and dangers to look out for? ## Docs - [Read article](https://scrapfly.io/blog/posts/best-proxy-providers-for-web-scraping) # Top 4 Mobile Proxy Providers for Web Scraping > Analysis and comparison of top mobile proxy providers for web scraping and how to choose the right one to avoid web scraper blocking ## Docs - [Read article](https://scrapfly.io/blog/posts/top-4-mobile-proxy-providers-for-web-scraping) # Top 5 Residential Proxy Providers for Web Scraping > Comparison of top residential proxy providers for web scraping. Blocking rates, performance and general overview of what makes a good proxy. ## Docs - [Read article](https://scrapfly.io/blog/posts/top-5-residential-proxy-providers) # The Complete Guide To Using Proxies For Web Scraping > Introduction to proxy usage in web scraping. What types of proxies are there? How to evaluate proxy providers and avoid common issues. ## Docs - [Read article](https://scrapfly.io/blog/posts/introduction-to-proxies-in-web-scraping) # Web Scraping With Ruby > Introduction to web scraping with Ruby. How to handle http connections, parse html files for data, best practices, tips and an example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-ruby) # Web Scraping With NodeJS and Javascript > Introduction to web scraping in Javascript through a real hands-on example. Scraping HTML, parsing it and common challenge overview.
## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-nodejs) # How to Web Scrape with Puppeteer and NodeJS in 2025 > Puppeteer and nodejs tutorial (javascript) for web scraping dynamic web pages and web apps. Tips and tricks, best practices and example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-puppeteer-and-nodejs) # Parsing HTML with CSS Selectors > Introduction to using CSS selectors to parse web-scraped content. Best practices, available tools and common challenges by interactive examples. ## Docs - [Read article](https://scrapfly.io/blog/posts/parsing-html-with-css) # Parsing HTML with Xpath > Introduction to xpath in the context of web-scraping. How to extract data from HTML documents using xpath, best practices and available tools. ## Docs - [Read article](https://scrapfly.io/blog/posts/parsing-html-with-xpath) # Web Scraping With PHP 101 > Introduction to web scraping with PHP. How to handle http connections, parse html files for data, best practices, tips and an example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-php-101) # Web Scraping With Scrapy: The Complete Guide in 2025 > Tutorial on web scraping with scrapy and Python through a real world example project. Best practices, extension highlights and common challenges. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-scrapy) # Web Scraping with Selenium and Python > Selenium and Python tutorial for web scraping dynamic, javascript powered websites using a headless Chrome webdriver. Real life example project. ## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-selenium-and-python) # How to Parse Web Data with Python and Beautifulsoup > Introduction to web scraping with Python and BeautifulSoup HTML parsing library used in scraping. How to find text in scraped web data. 
## Docs - [Read article](https://scrapfly.io/blog/posts/web-scraping-with-python-beautifulsoup) # How to Scrape Dynamic Websites Using Headless Web Browsers > Intro to using headless web browser and libraries like Puppeteer, Playwright and Selenium in web scraping dynamic websites. ## Docs - [Read article](https://scrapfly.io/blog/posts/scraping-using-browsers) # How to Scrape Amazon.com Product Data and Reviews > Tutorial on how to scrape Amazon.com's product and review data using Python and how to avoid blocking and solve common Amazon scraping challenges. ## Docs - [Read article](https://scrapfly.io/blog/answers) # > ## Docs - [Read article](https://scrapfly.io/blog/feed.xml) # Scrapfly Blog - Web Scraping News and Tutorials > Insider knowledge on web scraping, web automation, data extraction and more. Learn how to scrape websites, avoid blocks, use proxies and APIs, and much more. ## Docs - [Read article](https://scrapfly.io/blog/) # > ## Docs - [Read article](https://scrapfly.io/blog/llms.txt) # Join the Newsletter > ## Docs - [Read article](https://scrapfly.io/blog/newsletter) # Newsletter Subscription Confirmed! > ## Docs - [Read article](https://scrapfly.io/blog/newsletter-confirmed) # > ## Docs - [Read article](https://scrapfly.io/blog/robots.txt) # > ## Docs - [Read article](https://scrapfly.io/blog/sitemap.xml) # Scrapfly Tags > Explore our collection of web scraping articles, tutorials, and resources organized by tags. Learn about various topics in web scraping, including tools, techniques, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags) # Learn about AI in Web Scraping and Automation > Articles and knowledgebase related to AI in web scraping and data programming. Learn AI use techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/ai) # Learn about APIs in Web Scraping and Automation > Articles and knowledgebase related to APIs in web scraping and data programming. 
Learn API use techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/api) # Learn about Axios library in Scraping and Automation > Articles and knowledgebase related to axios http client in web scraping and data programming. Learn axios use techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/axios) # Learn about BeautifulSoup in Web Scraping and Automation > Learn about beautifulsoup - the most popular HTML parsing library in Python using real life examples and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/beautifulsoup) # Learn about Scraper Blocking and Bypass It > Web scraping blocking is increasingly common and complex challenge. Learn about types of scraper blocking and how to bypass them effectively. ## Docs - [Read article](https://scrapfly.io/blog/tags/blocking) # Learn about Web Crawling - Tutorials and News > Articles and knowledgebase related to web crawling. Learn how to crawl websites, extract data, and automate web scraping tasks. ## Docs - [Read article](https://scrapfly.io/blog/tags/crawling) # Learn about CSS Selectors in Web Scraping and Automation > Articles and knowledgebase related to CSS Selectors in web scraping and data programming. Learn CSS Selector techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/css-selectors) # Learn about cURL in Web Scraping and Automation > Articles and knowledgebase related to cURL http client in web scraping and data programming. Learn cURL techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/curl) # Learn about Data Parsing in Web Scraping and Automation > Articles and knowledgebase related to data parsing in web scraping and data programming. Learn data parsing techniques, tools, and best practices. 
## Docs - [Read article](https://scrapfly.io/blog/tags/data-parsing) # Learn about E-commerce in Web Scraping and Automation > Articles and knowledgebase related to e-commerce in web scraping and data programming. Learn e-commerce scraping techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/ecommerce) # Learn about Fashion Data in Web Scraping and Automation > Articles and knowledgebase related to fashion data in web scraping and data programming. Learn fashion data scraping techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/fashion) # Learn about Web Scraping Frameworks > Web scraping frameworks are great tools for building complex and dedicated web scraping projects. Learn web scraping frameworks and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/frameworks) # Learn about Go Language in Web Scraping and Automation > Articles and knowledgebase related to Go Language in web scraping and data programming. Learn Go Language techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/golang) # Learn about GraphQL in Web Scraping and Automation > Learn about graphql in web scraping and automation. How to scrape graphql and how it works in data programming and data automation. ## Docs - [Read article](https://scrapfly.io/blog/tags/graphql) # Learn Headless Browsers for Web Scraping > Learn about headless browser automation in web scraping and data programming. Best practices and tools for headless browsers. ## Docs - [Read article](https://scrapfly.io/blog/tags/headless-browser) # Learn about Hidden APIs in Web Scraping and Automation > Articles and knowledgebase related to hidden APIs in web scraping and data programming. Learn hidden API techniques, tools, and best practices.
## Docs - [Read article](https://scrapfly.io/blog/tags/hidden-api) # Learn about HTTP in Web Scraping and Automation > Articles and knowledgebase related to HTTP protocol in web scraping and data programming. Learn HTTP techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/http) # Learn about Httpx in Web Scraping and Automation > Articles and knowledgebase related to python's Httpx library in web scraping and data programming. Learn Httpx techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/httpx) # Learn about Java Language in Web Scraping and Automation > Articles and knowledgebase related to Java Language in web scraping and data programming. Learn Java Language techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/java) # Learn Jupyter for Web Scraping and Data Programming > Articles and knowledgebase related to Jupyter notebooks in web scraping and data programming. Learn Jupyter techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/jupyter) # Learn about NodeJS in Web Scraping and Automation > Articles and knowledgebase related to NodeJS in web scraping and data programming. Learn NodeJS scraping techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/nodejs) # Learn about Parsel in Web Scraping and Automation > Articles and knowledgebase related to Parsel Python library. Learn how to use parsel for web scraping and HTML parsing using CSS Selectors and XPath. ## Docs - [Read article](https://scrapfly.io/blog/tags/parsel) # Learn about PHP in Web Scraping and Data Programming > Articles and knowledgebase related to PHP in web scraping and data programming. Learn PHP scraping techniques, tools, and best practices. 
## Docs - [Read article](https://scrapfly.io/blog/tags/php) # Learn about Playwright in Web Scraping and Automation > Articles and knowledgebase related to Playwright in web scraping and data programming. Learn Playwright scraping techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/playwright) # Learn Real Web Scraping Projects > Example web scraping projects and ideas. Hands on web scraping for learning web scraping and data programming using industry standard tools and techniques. ## Docs - [Read article](https://scrapfly.io/blog/tags/project) # Learn about Proxies in Web Scraping and Automation > Learn about proxies in web scraping and web automation. Learn how to use proxies in crawling, popular proxy tools, and best proxy practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/proxies) # Learn about Puppeteer in Web Scraping and Automation > Learn Puppeteer Node.js library for web scraping using headless browsers and web automation. The best puppeteer techniques, tools, and practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/puppeteer) # Learn about Python in Web Scraping and Data Programming > Python is the most popular language for web scraping and data programming. Learn how to use Python for web scraping, data analysis, and more. ## Docs - [Read article](https://scrapfly.io/blog/tags/python) # Learn about R in Web Scraping and Data Programming > R is a powerful language for data analysis and visualization. Learn how to use R for web scraping, data analysis, and more. ## Docs - [Read article](https://scrapfly.io/blog/tags/r) # Learn about Real Estate Web Scraping > Real estate web scraping is a powerful tool for property data, market trends, and investment opportunities. Learn how to scrape real estate data. 
## Docs - [Read article](https://scrapfly.io/blog/tags/real-estate) # Learn about Requests in Web Scraping and Data Programming > Requests is a popular Python library for making HTTP requests. Learn how to use Requests for web scraping, data analysis, and more. ## Docs - [Read article](https://scrapfly.io/blog/tags/requests) # Learn about Ruby in Web Scraping and Data Programming > Ruby is a versatile programming language that can be used for web scraping. Learn how to use Ruby for web scraping, data analysis, and more. ## Docs - [Read article](https://scrapfly.io/blog/tags/ruby) # Learn about Scaling Web Scraping Operations > Scaling web scraping operations is crucial for handling large datasets and high traffic. Learn how to scale your web scraping projects effectively. ## Docs - [Read article](https://scrapfly.io/blog/tags/scaling) # Learn How to Scrape the Most Popular Websites > Hands-on guides on scraping the most popular websites like Google, Amazon, and more. Learn how to scrape websites using industry leading technology for free. ## Docs - [Read article](https://scrapfly.io/blog/tags/scrapeguide) # Learn about Scrapy in Web Scraping and Data Programming > Scrapy is the most popular web scraping framework in Python. Learn how to use Scrapy for web scraping, data extraction, and more. ## Docs - [Read article](https://scrapfly.io/blog/tags/scrapy) # Learn about Screenshots in Web Scraping and Automation > Articles and knowledgebase related to screenshots in web scraping and web automation. Learn how to take screenshots of web pages, automate screenshot capture ## Docs - [Read article](https://scrapfly.io/blog/tags/screenshots) # Learn about Selenium in Web Scraping and Automation > Articles and knowledgebase related to Selenium headless browser automation. Learn how to use Selenium for web scraping and web automation effectively. 
## Docs - [Read article](https://scrapfly.io/blog/tags/selenium) # Learn about SEO in Web Scraping and Automation > Articles and knowledgebase related to SEO (Search Engine Optimization) in web scraping. Learn how to search engine optimize using web scraping. ## Docs - [Read article](https://scrapfly.io/blog/tags/seo) # Learn about Web Scraping Tools and Technologies > Highlights of web scraping tools and technologies. Learn about the most popular web scraping tools, libraries, and frameworks used in the industry. ## Docs - [Read article](https://scrapfly.io/blog/tags/tools) # Learn about Typescript in Web Scraping and Data Programming > Articles and knowledgebase related to Typescript in web scraping and data programming. Learn Typescript techniques, tools, and best practices. ## Docs - [Read article](https://scrapfly.io/blog/tags/typescript) # Learn about XPath in Web Scraping and Data Programming > Articles and knowledgebase related to XPath in web scraping and data programming. Learn how to use XPath for HTML parsing, web scraping, and data extraction. ## Docs - [Read article](https://scrapfly.io/blog/tags/xpath) # Articles and tutorials by Mazen Ramadan > ## Docs - [Read article](https://scrapfly.io/blog/authors/mazen) # Articles and tutorials by Bernardas Ališauskas > ## Docs - [Read article](https://scrapfly.io/blog/authors/scrapecrow) # Articles and tutorials by Ziad Shamndy > ## Docs - [Read article](https://scrapfly.io/blog/authors/ziad)