Build a Proxy API: Rotate Proxies and Save Bandwidth

APIs can consume significant bandwidth, especially when multiple clients or services are fetching the same resources repeatedly. One way to reduce this overhead is by using a proxy API – an intermediary that sits between your application and external APIs or websites. A proxy API can cache responses and filter out unnecessary data, saving bandwidth and speeding up requests for all clients that use it.

In this tutorial, we'll walk through building a simple API proxy in Python using mitmproxy, a powerful open-source MITM (man-in-the-middle) proxy tool. By rotating proxies on each request and caching responses, our proxy will help avoid IP blocks and reduce duplicate data transfers. We’ll also configure it to drop unwanted resources (like images or styles) to further conserve bandwidth. Let’s dive into the benefits of rotating proxies and how to set up this bandwidth-saving proxy tool.

What Is a Proxy API and Why Is It Useful?

A proxy API is a server that forwards API requests from clients to external services. Instead of contacting the target API or website directly, your application sends requests through the proxy, which may modify the request, handle authentication, caching, or IP rotation, and then return the response to your application.
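
For example, with Python's requests library a client opts into a proxy by passing a proxies mapping; the proxy address below is a placeholder:

import requests

# Route all traffic through a (hypothetical) proxy API at proxy.example.com:8080
proxies = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080",
}

response = requests.get("https://api.example.com/data", proxies=proxies)
print(response.status_code)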

Proxy APIs offer several key benefits:

  • Privacy: Conceal your application's IP address.
  • Centralized control: Simplify logging, rate limiting, and caching.
  • Efficiency: Reduce bandwidth usage and improve response reliability through caching.
  • Flexibility: Easily manage API rate limits and bypass IP restrictions.

Understanding these advantages helps highlight the essential features needed for building an effective proxy API, which we'll explore next.

Key Features of a Good Proxy API

Not all proxies are created equal. A good proxy API for bandwidth saving and web scraping tasks should include a few important features out of the box. Below are some key features and why they matter:

  • Proxy Rotation: Use a pool of IP proxies and rotate them on each request. This prevents any single proxy from being overused and getting blocked, ensuring higher availability and fewer captchas or bans. It also distributes traffic load across multiple IPs.
  • Response Caching: Store responses (e.g., API results or webpage content) and serve them for identical requests. Caching avoids redundant downloads of the same data, significantly saving bandwidth and improving response times for repeated queries.
  • Content Filtering: Drop or ignore unnecessary resource requests like images, CSS, or ads. By filtering out these non-critical assets, the proxy saves bandwidth and focuses on the data your application actually needs (e.g., HTML or API JSON).
  • HTTPS Support: Intercept and handle HTTPS traffic using trusted certificates. Full HTTPS support ensures even secure API calls can be proxied, while still allowing the proxy to inspect and cache their content.

These features combined make a proxy API both efficient and resilient: proxy rotation makes your scraping or API consumption much harder to detect and block, while caching and filtering keep bandwidth usage low. Now that we've identified what we need (rotation, caching, filtering, and HTTPS support), it's time to get our hands dirty and build a proxy API with these capabilities, step by step, using Python and mitmproxy.

Step 1: Set Up mitmproxy in Python

To build our proxy API, we'll use mitmproxy, a Python-based intercepting proxy. Mitmproxy can be scripted with Python addons to modify requests and responses on the fly. First, let's install mitmproxy and create a basic addon script to ensure everything is wired up correctly:

# Install mitmproxy via pip if you haven't already
$ pip install mitmproxy

# (Optional) Verify the installation by checking the version
$ mitmproxy --version

Next, we'll set up a simple mitmproxy addon in Python. Create a file (for example, proxy_tool.py) and add a basic class that will handle proxy events. For now, we'll just log each request to confirm our proxy is intercepting traffic:

from mitmproxy import http

class BandwidthSaver:
    def request(self, flow: http.HTTPFlow):
        # Log each incoming request URL (for debugging purposes)
        print("Request URL:", flow.request.pretty_url)

# Register the addon with mitmproxy
addons = [BandwidthSaver()]

In this snippet, we import mitmproxy's http module and define a class BandwidthSaver with a request method. Mitmproxy will call request() for every HTTP request passing through the proxy. Here we simply print the URL of the request (flow.request.pretty_url) to the console. The last line registers our class as a mitmproxy addon.

Running the proxy: To test this setup, run mitmproxy (or its console-less variant mitmdump) with the addon script:

$ mitmdump -s proxy_tool.py
Example output:

[20:01:58.311] Loading script proxy_tool.py
[20:01:58.311] HTTP(S) proxy listening at *:8080.

By default, mitmproxy listens on localhost:8080 as an HTTP proxy. Configure your application or browser to use localhost:8080 as the HTTP/HTTPS proxy and perform a request (for example, open a webpage or make an API call). You should see the request URLs being printed by our script. This confirms the proxy is intercepting requests successfully.
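
For example, a quick plain-HTTP test with curl (HTTPS needs the certificate setup covered in the next step):

$ curl -x http://localhost:8080 http://example.com/ -s -o /dev/null

The mitmdump console should then show a line like Request URL: http://example.com/ printed by our addon.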

With mitmproxy installed and our basic addon logging requests, we have the foundation ready. Next, we'll ensure HTTPS traffic can be handled by our proxy.

Step 2: Enable HTTPS by Installing mitmproxy’s Certificate

Modern APIs and websites mostly use HTTPS. For our proxy API to inspect and cache those requests, we need to enable HTTPS interception. Mitmproxy does this by acting as a "man-in-the-middle" with its own Certificate Authority (CA). We must install mitmproxy's CA certificate on the client system so that it trusts the proxy for HTTPS connections.

First, start mitmproxy (or mitmdump) to generate the necessary certificates if not already done:

$ mitmproxy  # Start the proxy; it will generate a CA cert on first run

While mitmproxy is running, open a web browser (or use your device) and visit http://mitm.it. This special page provides instructions to download and install the mitmproxy CA certificate for various platforms (Windows, macOS, Linux, Android, iOS). Install the certificate according to your environment. This typically involves trusting a new CA in your system or browser settings.

Once the certificate is installed, your system will treat the mitmproxy as a trusted authority. This means mitmproxy can decrypt HTTPS traffic between clients and servers, allowing our addon to read and modify those requests and responses. HTTPS support is now enabled for our proxy API.
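
You can also verify interception from code. Here is a minimal check with Python's requests, validating against the CA certificate that mitmproxy writes to ~/.mitmproxy by default:

import os

import requests

proxies = {"https": "http://localhost:8080"}
# Default location of mitmproxy's CA certificate
ca_cert = os.path.expanduser("~/.mitmproxy/mitmproxy-ca-cert.pem")

resp = requests.get("https://example.com", proxies=proxies, verify=ca_cert)
print(resp.status_code)  # 200 confirms HTTPS interception works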

Note: Only install the mitmproxy certificate on devices or environments you control, for development or scraping. It gives the proxy the power to intercept secure communications, so use it responsibly.

With the proxy set up and HTTPS enabled, secure traffic can now flow through our tool, and we're ready to implement the core bandwidth-saving features, starting with proxy rotation for better IP diversity.

Step 3: Rotate Proxies Randomly on Each Request

One major benefit of a proxy API is the ability to hide the client's IP address. We can take this further by rotating through a list of upstream proxy servers on every request. By doing so, each request appears to come from a different IP—helping avoid rate limits or bans on the target service. Mitmproxy supports forwarding requests to an upstream proxy, which we can control in our script.

Let's update our addon to choose a random proxy for each request. Suppose we have a list of proxy server addresses (IP:port or host:port). We’ll configure mitmproxy to use one by default and then override it per request in our script:

import random
from mitmproxy import http

class BandwidthSaver:
    # List of upstream proxy servers to rotate through
    upstream_proxies = [
        "203.0.113.10:3128",
        "198.51.100.23:3128",
        "203.0.113.47:3128",
        # ... add as many proxies (IP:port or host:port) as you have
    ]

    def request(self, flow: http.HTTPFlow):
        # Leave CONNECT requests (HTTPS tunnel setup) untouched; rewriting
        # them is more involved, as mitmproxy's own example notes
        if flow.request.method == "CONNECT":
            return

        # Pick a random upstream proxy for this request
        proxy_address = random.choice(self.upstream_proxies)
        host, port = proxy_address.split(":")
        # In upstream mode, tell mitmproxy to use the chosen proxy
        if flow.live:
            flow.live.change_upstream_proxy_server((host, int(port)))

        # (Optional) Log which proxy was chosen for debugging
        print(f"→ Rotating via proxy: {proxy_address} for {flow.request.host}")

In this code, we added an upstream_proxies list to our class containing proxy server addresses (replace these example IPs with proxies you actually have access to). In the request method we first skip CONNECT requests, mirroring mitmproxy's own upstream-rotation example: CONNECT merely sets up an HTTPS tunnel, and rewriting it is more involved. For every other request, random.choice selects a proxy from the list, and flow.live.change_upstream_proxy_server((host, port)) tells mitmproxy to forward the current request through that upstream proxy.

A couple of important notes for this to work:

  • Start mitmproxy in upstream mode: When launching mitmdump or mitmproxy, pass --mode upstream:<proxy-url> with one of your proxies as the default. For example:

    $ mitmdump --mode upstream:http://203.0.113.10:3128 -s proxy_tool.py
    

    This sets the initial upstream proxy; our script then overrides it on a per-request basis. Mitmproxy only honors change_upstream_proxy_server when upstream mode is enabled. Note that this method belongs to older mitmproxy releases; mitmproxy 7+ reworked the addon API, so on a recent release consult mitmproxy's bundled change_upstream_proxy example for the current equivalent.

  • HTTPS requests: With the CA certificate installed, HTTPS traffic flows through the rotation as well: mitmproxy decrypts each request and re-encrypts it on the way to the upstream proxy. Keep in mind that the upstream for an HTTPS tunnel is fixed when the CONNECT is established, so for HTTPS the rotation is effectively per connection rather than per individual request.

With proxy rotation in place, every request through our proxy API will emerge from a random IP address. This helps distribute traffic and avoid IP-based blocking. For example, if you're scraping a website that limits one request per second per IP, using five rotating proxies could effectively allow ~5 requests per second without triggering blocks.
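
To sanity-check the rotation, hit a service that echoes the caller's IP, such as httpbin.org/ip, a few times through the proxy. The reported origin should vary between calls; the output below is illustrative, using the placeholder addresses from our list:

$ curl -x http://localhost:8080 http://httpbin.org/ip
{"origin": "203.0.113.10"}
$ curl -x http://localhost:8080 http://httpbin.org/ip
{"origin": "198.51.100.23"}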

At this stage, our proxy API is forwarding requests through random proxies, enhancing anonymity and reliability. Next, we'll implement response caching to reuse results and save more bandwidth.

Step 4: Cache Responses to Save Bandwidth

Caching is a crucial feature for saving bandwidth. If multiple clients request the same resource through our proxy API, there's no need to fetch it from the origin server every time – we can return a stored copy. Let's add a simple cache to our proxy using a Python dictionary to store responses.

We'll cache responses by URL. When a request comes in, the addon will first check if we have a cached response for that URL. If yes, it will immediately return the cached data without forwarding the request to the internet. If not, it will proceed normally (possibly using a rotated proxy upstream), and then save the response for next time.

Here's how we can integrate caching into our BandwidthSaver addon:

from mitmproxy import http
import random

class BandwidthSaver:
    upstream_proxies = [
        "203.0.113.10:3128",
        "198.51.100.23:3128",
        "203.0.113.47:3128",
        # ... (same proxy list as before)
    ]
    # Initialize an in-memory cache (dictionary)
    cache = {}

    def request(self, flow: http.HTTPFlow):
        # Skip CONNECT requests (HTTPS tunnel setup), as in Step 3
        if flow.request.method == "CONNECT":
            return

        # 1. If this URL was seen before and cached, serve it from cache
        if flow.request.pretty_url in self.cache:
            cached_resp = self.cache[flow.request.pretty_url]
            # Create a response directly from cache without contacting upstream
            flow.response = http.HTTPResponse.make(
                cached_resp["status_code"],       # e.g. 200
                cached_resp["content"],           # cached raw content (bytes)
                cached_resp["headers"]            # cached headers
            )
            return  # respond from cache, no need to forward request

        # 2. Not cached: pick a random proxy as in Step 3
        proxy_address = random.choice(self.upstream_proxies)
        host, port = proxy_address.split(":")
        if flow.live:
            flow.live.change_upstream_proxy_server((host, int(port)))
        # (The request will now be forwarded to the origin through the chosen proxy)

    def response(self, flow: http.HTTPFlow):
        # After receiving a response from the origin, cache it for future requests.
        # Only successful GET responses are cached; replaying cached POST results
        # or error pages would serve wrong data.
        url = flow.request.pretty_url
        if (
            flow.request.method == "GET"
            and flow.response.status_code == 200
            and url not in self.cache
        ):
            self.cache[url] = {
                "status_code": flow.response.status_code,
                "content": flow.response.content,  # raw bytes of the response body
                "headers": dict(flow.response.headers)
            }
            # (Now the next request for the same URL will hit the cache)

Let's break down the caching logic:

  • We added a class attribute cache as a dictionary to store responses by URL. In a real scenario you'd want a more robust cache with size limits or expiration (see the sketch after this list), but this simple dict will do for demonstration.

  • Cache check in request: Before forwarding a request, we check whether flow.request.pretty_url (the full URL as a string) exists in our cache. If it does, we retrieve the cached data and use http.HTTPResponse.make(...) to create a synthetic response from the cached status code, content, and headers (in mitmproxy 7+ this class is named http.Response; we keep the older name for consistency with the rotation API). Setting flow.response during the request phase short-circuits the flow: the client gets the response immediately from our proxy, and mitmproxy never forwards the request to the upstream server.

  • Saving in response: If the request wasn't cached, it went out to the origin (through a proxy). In the response handler we store the newly received response in the cache dict, keyed by the same URL, but only for successful GET responses. We save the status code, the content (a bytes object holding the body), and the headers (converted to a plain dict for simplicity). The next time the same URL is requested, the request method finds it in the cache and returns this data.
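
For reference, here is a minimal sketch of what an expiring cache could look like. The TTLCache name and the 300-second lifetime are our own illustration, not part of mitmproxy:

import time

class TTLCache:
    """A tiny expiring cache: entries older than ttl seconds are dropped."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store = {}  # url -> (timestamp, data)

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        timestamp, data = entry
        if time.time() - timestamp > self.ttl:
            del self._store[url]  # expired, drop it
            return None
        return data

    def set(self, url, data):
        self._store[url] = (time.time(), data)

Swapping this in for the plain dict would mean calling self.cache.get(url) and self.cache.set(url, data) in the addon instead of indexing the dictionary directly.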

With caching enabled, repeated requests for the same resource will be served from the proxy API's memory instead of the network. This saves bandwidth because the data travels only once from the external server; subsequent requests get the data from the local cache. It also reduces latency for those requests since returning data from memory is faster than making a network round-trip.

For example, if client A requests https://api.example.com/data?id=123 and then client B (or even A again) requests the same URL, the second request will get an instant cached response. No outgoing proxy usage or internet bandwidth is needed for the second call.
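
To see the effect, you can time two identical requests through the proxy. Here is a quick sketch using Python's requests and httpbin's deliberately slow /delay endpoint, assuming the proxy runs on localhost:8080:

import time

import requests

proxies = {"http": "http://localhost:8080"}
url = "http://httpbin.org/delay/2"  # origin responds after ~2 seconds

for label in ("first (origin)", "second (cache)"):
    start = time.time()
    resp = requests.get(url, proxies=proxies)
    print(f"{label}: {resp.status_code} in {time.time() - start:.2f}s")

The first call should take a little over two seconds; the second should return almost instantly, because it never leaves the proxy.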

Now our proxy API rotates proxies and caches responses, making it efficient and fast for repeated requests. Next, we'll add a final touch: filtering out unnecessary requests to conserve even more bandwidth.

Step 5: Drop Unnecessary Requests (Stylesheets & Images)

When proxying web content (as opposed to pure API JSON), browsers often try to fetch images, stylesheets, scripts, and other assets. In a scraping context, these usually aren't needed – they just waste bandwidth. Our proxy API can proactively drop such requests. Even for API use cases, there might be certain endpoints or file types you know are extraneous. By filtering them out, the proxy saves the client from downloading useless data.

We'll update the request method in our addon to identify requests for common static asset types (like images and CSS) and short-circuit them with an empty response. This should happen before the caching check or proxy forwarding:

    def request(self, flow: http.HTTPFlow):
        # Skip CONNECT requests (HTTPS tunnel setup), as in Step 3
        if flow.request.method == "CONNECT":
            return

        # 0. Filter out unwanted asset types to save bandwidth
        if flow.request.pretty_url.endswith((".png", ".jpg", ".jpeg", ".gif", ".css", ".js")):
            # Return an empty 204 No Content response for these requests
            flow.response = http.HTTPResponse.make(204, b"", {})
            return

        # 1. Serve from cache if available (as implemented in Step 4)
        if flow.request.pretty_url in self.cache:
            cached_resp = self.cache[flow.request.pretty_url]
            flow.response = http.HTTPResponse.make(
                cached_resp["status_code"],
                cached_resp["content"],
                cached_resp["headers"]
            )
            return

        # 2. Otherwise, rotate proxy and forward (from Step 3)
        proxy_address = random.choice(self.upstream_proxies)
        host, port = proxy_address.split(":")
        if flow.live:
            flow.live.change_upstream_proxy_server((host, int(port)))

The new addition here is the first if block: it checks the URL's suffix against a tuple of file extensions for images (.png, .jpg, .jpeg, .gif), stylesheets (.css), and scripts (.js). You can adjust this list based on what you consider "unnecessary" for your scenario. If a match is found, we immediately set flow.response to an HTTP 204 (No Content) with an empty body. A 204 status tells the client that the request succeeded but there's no content to load. We then return without forwarding the request further. The result is that, for example, if a webpage tries to load a large .png image, our proxy will respond with nothing (saving the bandwidth that would have been used to download the image).
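
One caveat with this approach: matching on the end of the full URL misses assets served with query strings (for example, site.com/style.css?v=2). A more robust check looks only at the URL path; here is a small sketch (the is_blocked_asset helper is our own addition):

from urllib.parse import urlsplit

BLOCKED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".css", ".js")

def is_blocked_asset(url: str) -> bool:
    # Compare against the path only, so "style.css?v=2" is still caught
    return urlsplit(url).path.lower().endswith(BLOCKED_EXTENSIONS)

You could then replace the endswith check in request() with if is_blocked_asset(flow.request.pretty_url):.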

After adding this filter, the rest of the logic remains the same: we check the cache, and if not cached, we forward the request through a rotated proxy. The response handler also remains as implemented in Step 4 (caching any new responses). We typically don't need to cache the dropped items since we never fetch them in the first place.

With this final step, our proxy API is complete: it rotates among multiple upstream proxies, caches responses to reuse data, and blocks superfluous asset requests. Together these measures yield substantial bandwidth savings and can speed up your data-fetching pipelines.

To run the full proxy with all features combined, use the script and start mitmproxy as before. For instance:

$ mitmdump --mode upstream:http://203.0.113.10:3128 -s proxy_tool.py

Remember to update the upstream_proxies list in the script with proxies you have. Also ensure your clients are configured to use the mitmproxy server (e.g., HTTP_PROXY environment variable or browser proxy settings pointing to localhost:8080). Once running, your proxy API will handle incoming requests according to the logic we implemented.
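
For example, on Linux or macOS you can point well-behaved HTTP clients at the proxy through environment variables:

$ export HTTP_PROXY=http://localhost:8080
$ export HTTPS_PROXY=http://localhost:8080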

We have now built a functional proxy API that can be used as a drop-in bandwidth-saving layer for web scraping or API consumption.

Proxies at ScrapFly

ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.


FAQ

What is a Proxy API?

A proxy API is a server that forwards requests from clients to external services, optionally modifying requests and responses (e.g., adding caching or authentication). It helps hide client details, enforce policies, and aggregate data.

Why use rotating proxies in a proxy API?

Rotating proxies distribute requests across different IPs, helping you avoid rate limits and bans when scraping or accessing restricted APIs. This improves reliability and allows a higher sustained request volume.

How does caching in a proxy API save bandwidth?

Caching stores responses locally on the proxy. Subsequent identical requests use cached responses rather than fetching again from external services, significantly reducing bandwidth usage.

Summary

In this article, we built a bandwidth-saving proxy API from scratch using Python and mitmproxy. We started by setting up mitmproxy and enabling HTTPS interception so that we could handle secure traffic. Then we added proxy rotation, allowing each request to exit through a different IP address to avoid rate limits and blocking. Next, we implemented a simple in-memory cache to store responses and serve repeated requests without re-downloading data. We also introduced a filtering mechanism to drop unnecessary resources like images and styles, conserving bandwidth further.
