Proxies & Load Balancers

Proxies and load balancers are the traffic layer of a distributed system — the components that sit between clients and servers and decide how requests are intercepted, inspected, and routed. Understanding each one, and how they differ, is essential for designing systems that are scalable, secure, and observable.

Component	Sits between	Primary job
Forward Proxy	Client → internet	Represents the client; controls outbound traffic
Reverse Proxy	Internet → servers	Represents the server; controls inbound traffic
Load Balancer	Internet → server pool	Distributes requests across multiple server instances
API Gateway	Clients → microservices	Single entry point with auth, routing, rate limiting

Forward Proxy

A forward proxy (often just called a "proxy") sits between a client and the internet and makes requests on behalf of the client. From the destination server's perspective, the request comes from the proxy — the client's real IP address is hidden.

What a forward proxy does

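Anonymity: the origin server sees the proxy's IP address, not the client's. Commercial VPN services and anonymity networks such as Tor rely on the same principle of relaying traffic through an intermediary, although they are not plain forward proxies.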
Content filtering and access control — corporate networks route all outbound traffic through a forward proxy to block specific domains, log activity, or enforce security policies. The proxy can inspect and reject requests that violate policy before they leave the network.
Caching — a forward proxy can cache responses from the internet. If 500 employees all request the same software update, the proxy fetches it once and serves the cached copy to everyone, saving bandwidth.
Geo-unblocking — by routing through a proxy in a different country, clients can access content that is geographically restricted.
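
The filtering and caching behaviour described above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not a real proxy: the class name ForwardProxy, the blocked_domains set, and the injected fetch callable (which stands in for the real outbound request) are all hypothetical.

```python
from typing import Callable, Dict, Set
from urllib.parse import urlsplit


class ForwardProxy:
    """Toy model of a forward proxy: policy check first, then cache, then fetch."""

    def __init__(self, fetch: Callable[[str], bytes], blocked_domains: Set[str]):
        self._fetch = fetch                  # stands in for the real outbound request
        self._blocked = blocked_domains      # hypothetical corporate blocklist
        self._cache: Dict[str, bytes] = {}   # URL -> cached response body
        self.upstream_fetches = 0            # requests that actually left the network

    def get(self, url: str) -> bytes:
        host = urlsplit(url).hostname or ""
        if host in self._blocked:
            # Rejected before anything leaves the network.
            raise PermissionError(f"blocked by policy: {host}")
        if url not in self._cache:
            # First request for this URL: fetch once and remember the body.
            self._cache[url] = self._fetch(url)
            self.upstream_fetches += 1
        return self._cache[url]
```

If 500 clients call get() with the same URL, upstream_fetches stays at 1. A real caching proxy would also honour cache-control headers and expiry, which this sketch leaves out.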

Forward proxy vs VPN

Both hide the client's IP and route traffic through an intermediary. The key difference is scope and encryption. A VPN works at the operating-system level: it tunnels and encrypts traffic from every application and protocol between the device and the VPN server. A forward proxy typically works at the HTTP/HTTPS level, requires explicit configuration per application, and does not by itself encrypt the traffic it relays. A VPN creates a private tunnel; a forward proxy acts as a web intermediary.

Where it appears in system design

Forward proxies are rarely part of the server-side architecture you design. They appear when your service needs to make outbound requests through a controlled egress point. For example, all requests from your backend to third-party APIs can route through a single forward proxy, so the third party can allowlist one IP address instead of your entire fleet's IP range.
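
As a minimal sketch of the egress pattern, a Python backend could send its outbound calls through one proxy using the standard library's urllib.request.ProxyHandler. The helper name build_egress_opener and the proxy address are assumptions made for illustration only.

```python
import urllib.request


def build_egress_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via one forward proxy."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy_handler)


# Hypothetical egress proxy address; nothing is sent until opener.open(url) is called.
opener = build_egress_opener("http://egress-proxy.internal:3128")
```

Every call made with opener.open(...) would then leave the network from the proxy's address, which is the single IP the third party needs to allowlist.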

Reverse Proxy

A reverse proxy sits in front of one or more servers and forwards incoming client requests to them. From the client's perspective, it is talking directly to the service — the origin servers are completely hidden. The client sends a request to api.example.com; the reverse proxy receives it and decides which backend server handles it.

The distinction from a forward proxy is direction: a forward proxy represents the client to the outside world; a reverse proxy represents the server to the outside world.

What a reverse proxy provides

SSL/TLS termination: the reverse proxy performs the TLS handshake and handles encryption and decryption. Backend servers can then communicate over plain HTTP on the private network, which reduces their CPU overhead and centralises certificate management.
Security and DDoS protection — hides origin server IP addresses, making them unreachable directly. Can enforce rate limiting, block malicious IPs, and act as a Web Application Firewall (WAF).
Caching — caches responses from the origin and serves them directly for repeated requests, without forwarding to the backend at all.
Compression — gzips or brotli-compresses responses before sending them to clients, reducing bandwidth without any changes to the application server.
Request routing and virtual hosting — routes different paths or hostnames to different backend services. /api/* goes to the application servers; /static/* goes to object storage or a CDN origin. Multiple domains can share a single reverse proxy.
Observability — centralised access logging, metrics, and tracing across all inbound requests before they reach application code.
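
The routing and virtual-hosting behaviour can be sketched as a lookup from hostname and path prefix to an upstream. The routing table, the upstream names, and the function select_upstream are hypothetical; real reverse proxies express the same idea in their own configuration languages.

```python
from typing import Dict, Optional, Tuple

# Hypothetical routing table: (hostname, path prefix) -> upstream name.
PROXY_ROUTES: Dict[Tuple[str, str], str] = {
    ("api.example.com", "/api/"): "app-servers",
    ("api.example.com", "/static/"): "object-storage",
    ("blog.example.com", "/"): "blog-servers",
}


def select_upstream(host: str, path: str) -> Optional[str]:
    """Pick the upstream whose host matches and whose prefix is the longest match."""
    best_prefix, best_upstream = "", None
    for (route_host, prefix), upstream in PROXY_ROUTES.items():
        if route_host == host and path.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_upstream = prefix, upstream
    return best_upstream
```

Preferring the longest matching prefix is one common design choice; it lets a catch-all route such as "/" coexist with more specific ones on the same hostname.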

Common implementations

Tool	Typical use
Nginx	High-performance reverse proxy and static file server; extremely low memory footprint
HAProxy	Battle-tested L4/L7 proxy and load balancer; preferred for very high connection counts
Cloudflare	Global reverse proxy with CDN, DDoS protection, and WAF built in
AWS ALB / NLB	Managed L7 (ALB) and L4 (NLB) proxies tightly integrated with the AWS ecosystem
Envoy	Service-mesh sidecar proxy; used in Kubernetes with Istio for east-west traffic

Load Balancer

A load balancer is a specialised reverse proxy whose primary job is to distribute incoming requests across a pool of server instances. It is the primary mechanism for horizontal scaling and fault tolerance — if one server fails its health check, the load balancer automatically stops sending it traffic.

Layer 4 vs Layer 7

Layer 4 (transport layer): routes by IP address and TCP/UDP port without inspecting the packet payload. Because it does so little work per connection, it is very fast and can sustain very high connection rates. Suitable when routing decisions don't require knowledge of the request content (e.g. distributing raw TCP connections to a database cluster).
Layer 7 (application layer): inspects the full HTTP request: URL path, headers, cookies, and body. Enables content-based routing, SSL termination, sticky sessions, and request rewriting. It costs more per request than Layer 4 because every request must be parsed, but it is far more flexible. Most web-facing load balancers are L7.
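
The difference comes down to what information each layer can see when it makes a decision. In the sketch below, the Layer 4 function sees only addresses and ports, while the Layer 7 function sees a parsed request. The names pick_l4 and pick_l7, the x-canary header, and the pool names are assumptions chosen for illustration.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Connection(NamedTuple):
    """The only facts a Layer 4 balancer has about a connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def pick_l4(conn: Connection, backends: List[str]) -> str:
    """Layer 4: hash the address/port tuple; the payload is never read."""
    key = f"{conn.src_ip}:{conn.src_port}>{conn.dst_ip}:{conn.dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def pick_l7(path: str, headers: Dict[str, str], pools: Dict[str, str]) -> str:
    """Layer 7: the request has been parsed, so routing can depend on its content."""
    if headers.get("x-canary") == "1":   # hypothetical header for canary traffic
        return pools["canary"]
    if path.startswith("/api/"):
        return pools["api"]
    return pools["default"]
```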

Routing algorithms

Algorithm	How it works	Best for
Round-robin	Cycles through servers in order	Homogeneous servers, uniform request cost
Weighted round-robin	Servers with higher capacity receive a proportionally larger share	Heterogeneous server pools (mixed instance types)
Least connections	Sends to the server with the fewest active connections	Long-lived or variable-duration requests
IP hash	Hashes the client IP to deterministically select a server	Stateful sessions without a shared session store
Least response time	Sends to the server with the lowest average response time	Latency-sensitive APIs with variable backend performance
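
Four of the algorithms in the table can be sketched in a few lines each. These are simplified illustrations with hypothetical function names, not the implementation used by any particular load balancer.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Cycle through servers in order, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighted round-robin: repeat each server `weight` times per cycle."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active: Dict[str, int]) -> str:
    """Return the server with the fewest active connections (first one wins ties)."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server deterministically using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomised
    per process for strings and would break stickiness across restarts.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Two caveats about this sketch: the naive weighted variant sends consecutive requests to the same server, whereas production implementations usually interleave the picks; and hashing modulo the pool size remaps most clients whenever a server is added or removed, which consistent hashing is designed to reduce.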

Health checks

The load balancer continuously monitors each server by sending periodic probes: an HTTP request to a health endpoint such as /health, a TCP connection attempt, or an ICMP ping. A server that fails a configurable number of consecutive checks is removed from rotation automatically. When it recovers and passes checks again, it is added back. This enables:

Zero-downtime deployments — drain a server, update it, wait for health checks to pass, then route traffic back to it. Repeat for each instance.
Automatic failure recovery — a crashing instance is detected within seconds and removed without manual intervention.
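
The removal and re-admission logic can be sketched as a small state machine fed with probe results. The threshold values, and the idea of requiring several consecutive successes before re-adding a server, are assumptions for illustration; the class name BackendHealth is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackendHealth:
    """Tracks one backend's rotation status from a stream of probe results."""
    unhealthy_threshold: int = 3   # consecutive failures before removal (assumed value)
    healthy_threshold: int = 2     # consecutive successes before re-adding (assumed value)
    in_rotation: bool = True
    _failures: int = 0
    _successes: int = 0

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return whether the backend is in rotation."""
        if probe_ok:
            self._successes += 1
            self._failures = 0     # any success resets the failure streak
            if not self.in_rotation and self._successes >= self.healthy_threshold:
                self.in_rotation = True
        else:
            self._failures += 1
            self._successes = 0    # any failure resets the success streak
            if self.in_rotation and self._failures >= self.unhealthy_threshold:
                self.in_rotation = False
        return self.in_rotation
```

Counting consecutive results rather than reacting to a single probe keeps one dropped packet from ejecting a healthy server, at the cost of slightly slower detection.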

Redundant load balancers

A single load balancer is itself a single point of failure. Production systems therefore commonly run load balancers in redundant pairs, often with active-passive failover: a virtual IP (VIP) floats between the two, and if the active one fails, the passive one takes over the VIP within seconds. Cloud-managed load balancers (AWS ALB, Google Cloud Load Balancing) handle this transparently.
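
The active-passive takeover can be sketched as heartbeat monitoring on the standby node. The class FailoverPair, the node names, and the three-second timeout are hypothetical; real deployments typically rely on a protocol such as VRRP rather than hand-written logic, and this sketch deliberately omits failback.

```python
class FailoverPair:
    """Toy model of an active-passive pair sharing one virtual IP (VIP)."""

    def __init__(self, heartbeat_timeout: float = 3.0):
        self.heartbeat_timeout = heartbeat_timeout   # assumed value, in seconds
        self.vip_owner = "lb-primary"                # hypothetical node names
        self._last_heartbeat = 0.0

    def heartbeat(self, now: float) -> None:
        """Called whenever a heartbeat from the primary is received."""
        self._last_heartbeat = now

    def tick(self, now: float) -> str:
        """Run periodically on the standby: claim the VIP if the primary is silent."""
        silent_for = now - self._last_heartbeat
        if self.vip_owner == "lb-primary" and silent_for > self.heartbeat_timeout:
            self.vip_owner = "lb-standby"
        return self.vip_owner
```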

API Gateway

An API gateway is a specialised reverse proxy that acts as the single entry point for all client requests to a microservice backend. Where a basic reverse proxy routes and forwards, an API gateway also enforces cross-cutting concerns that would otherwise be duplicated across every service.

What an API gateway handles

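Authentication and authorisation: validates JWTs or API keys on every incoming request, rejecting unauthenticated calls before they reach any service. Individual services can then trust the identity forwarded by the gateway instead of each re-implementing token validation, although fine-grained, resource-level permission checks often still belong in the service itself.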
Rate limiting and throttling — enforces per-client, per-endpoint, or global request quotas. Prevents abuse and protects backend services from traffic spikes.
Request routing — routes /users/* to the user service, /orders/* to the order service, and so on — based on path, method, or headers.
Protocol translation: exposes a REST or GraphQL API to external clients while communicating with internal services over gRPC, which browsers cannot call natively without a translation layer such as gRPC-Web.
Request/response transformation — reshapes payloads between client format and service format, aggregates responses from multiple services into one (BFF pattern), or strips internal fields before returning to the client.
Observability — centralised logging, distributed tracing correlation, and metrics collection at the system boundary.
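
The first three responsibilities (authentication, rate limiting, routing) can be sketched as a pipeline that each request passes through in order. Everything named here is hypothetical: the API key table, the service names, the bucket size, and the Gateway and TokenBucket classes. A token bucket is used as one common rate-limiting technique, not as the method of any specific gateway product.

```python
import time
from typing import Callable, Dict, Optional, Tuple

API_KEYS = {"key-abc": "client-1"}   # hypothetical API key -> client id
GATEWAY_ROUTES = {"/users/": "user-service", "/orders/": "order-service"}


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, clock: Callable[[], float]):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._tokens = float(capacity)
        self._updated = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        refill = (now - self._updated) * self.refill_per_sec
        self._tokens = min(self.capacity, self._tokens + refill)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class Gateway:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def handle(self, api_key: Optional[str], path: str) -> Tuple[int, str]:
        """Return (HTTP status, target service or error text) for one request."""
        client = API_KEYS.get(api_key or "")
        if client is None:
            return 401, "unauthenticated"          # rejected before any service is hit
        if client not in self._buckets:
            # Assumed quota: burst of 2, then 1 request per second per client.
            self._buckets[client] = TokenBucket(2, 1.0, self._clock)
        if not self._buckets[client].allow():
            return 429, "rate limit exceeded"
        for prefix, service in GATEWAY_ROUTES.items():
            if path.startswith(prefix):
                return 200, service
        return 404, "no route"
```

The order of the checks is a design choice: rejecting unauthenticated calls first means anonymous traffic never consumes a client's quota, and routing happens only for requests that passed both checks.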

API Gateway vs Load Balancer vs Reverse Proxy

	Reverse Proxy	Load Balancer	API Gateway
Primary job	Forward and control inbound traffic	Distribute load across instances	Enforce cross-cutting concerns
Auth / rate limiting	Possible but manual	No	First-class, built-in
Protocol translation	No	No	Yes (REST ↔ gRPC, etc.)
Response aggregation	No	No	Yes
Typical placement	Edge or internal	Edge or internal	Edge, in front of all services
Examples	Nginx, HAProxy, Envoy	AWS ALB/NLB, GCP LB, HAProxy	AWS API Gateway, Kong, Apigee, Traefik

In practice the boundaries blur — Nginx can do light auth, an ALB can do header-based routing, and Kong is both a reverse proxy and a full API gateway. What matters is understanding what responsibility you are assigning to each layer, not what the product is called.