Load Balancing — DevOps Engineer Track

Overview

A load balancer distributes incoming requests across a pool of servers so no single server is overwhelmed. It matters because it enables horizontal scaling and improves availability by routing around failed servers. It's a foundational piece of almost any high-traffic system.

Syntax / Usage

Clients send requests to the load balancer's address, and it forwards each request to a healthy backend server based on a routing algorithm. Health checks let it stop sending traffic to servers that are down.

                       --> [ Server A ]
[ Clients ] --> [ LB ] --> [ Server B ]
                       --> [ Server C ]
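
The forwarding behavior described above can be sketched as a toy backend pool. This is a minimal illustration, not how any particular load balancer is implemented; the `Pool` class, its `mark` and `pick` methods, and the server names are all hypothetical.

```python
class Pool:
    """Toy backend pool: rotates over servers currently marked healthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)  # assume every server starts healthy
        self._next = 0

    def mark(self, server, up):
        # Called by whatever runs the health checks (not modeled here).
        if up:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def pick(self):
        # Walk the rotation once, skipping servers that are marked down.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy backends")


pool = Pool(["A", "B", "C"])
pool.mark("B", up=False)
print([pool.pick() for _ in range(4)])  # ['A', 'C', 'A', 'C']
```

Once "B" is marked up again, it rejoins the rotation on the next pass.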

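Common algorithms: round robin (rotate through the servers in order), least connections (send to the server with the fewest active connections), and IP hash (hash the client's IP address so the same client keeps reaching the same server).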
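
As a rough sketch, each of the three algorithms fits in a few lines. The `SERVERS` list, the function names, and the choice of SHA-256 as the hash are assumptions for illustration; real load balancers use their own hashing and bookkeeping.

```python
import hashlib
from itertools import cycle

# Hypothetical backend pool; the addresses are placeholders.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round robin: hand out servers in a fixed rotation.
_rotation = cycle(SERVERS)


def round_robin() -> str:
    return next(_rotation)


def least_connections(active: dict[str, int]) -> str:
    # Pick the server with the fewest in-flight connections.
    return min(active, key=active.get)


def ip_hash(client_ip: str) -> str:
    # A stable hash keeps a client on one server while the pool is unchanged.
    digest = hashlib.sha256(client_ip.encode()).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]
```

Note that this simple modulo-based IP hash remaps most clients whenever a server is added or removed, which is one reason sticky routing is fragile.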

Examples

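An nginx upstream block spreads traffic across three app servers using round robin, the default. Open-source nginx checks health passively: with default settings (max_fails=1, fail_timeout=10s), one failed attempt takes a server out of rotation for 10 seconds: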

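# Inside the http {} context; round robin is the default algorithm.
upstream app {
  server 10.0.0.1;
  server 10.0.0.2;
  server 10.0.0.3;
}
server {
  listen 80;
  location / { proxy_pass http://app; }  # forward requests to the pool
}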

A cloud provider's load balancer runs health checks every 10 seconds and removes an unhealthy instance from rotation. Once the checks detect the failure, new requests stop reaching the broken server; requests sent before detection can still fail.
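
The removal logic can be sketched as a small state tracker. The thresholds here (3 consecutive failures to remove an instance, 2 consecutive passes to restore it) and the `HealthTracker` name are assumptions for illustration; only the 10-second interval comes from the example above, and actual providers expose their own settings.

```python
CHECK_INTERVAL_S = 10  # from the example above


class HealthTracker:
    """Tracks one instance; thresholds are hypothetical defaults."""

    def __init__(self, unhealthy_after=3, healthy_after=2):
        self.unhealthy_after = unhealthy_after
        self.healthy_after = healthy_after
        self.fails = 0
        self.passes = 0
        self.in_rotation = True

    def record(self, ok: bool) -> bool:
        # Count consecutive results; any opposite result resets the streak.
        if ok:
            self.passes += 1
            self.fails = 0
            if not self.in_rotation and self.passes >= self.healthy_after:
                self.in_rotation = True
        else:
            self.fails += 1
            self.passes = 0
            if self.in_rotation and self.fails >= self.unhealthy_after:
                self.in_rotation = False
        return self.in_rotation


tracker = HealthTracker()
results = [True, False, False, False, True, True]
print([tracker.record(r) for r in results])
# [True, True, True, False, False, True]
```

Under these assumed thresholds, a dead instance keeps receiving traffic until the third failed check, roughly `3 * CHECK_INTERVAL_S` = 30 seconds in the worst case. Lower thresholds shorten that window but make the checks more sensitive to brief hiccups.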

Common Mistakes

Making the load balancer itself a single point of failure (use redundancy)
Requiring sticky sessions because servers hold local state (keep session state in a shared store instead)
Skipping health checks, so traffic still flows to dead servers
Assuming even distribution when requests vary greatly in cost
Ignoring the load balancer as a capacity bottleneck of its own
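
The uneven-distribution mistake is easy to see with a small simulation. This is a simplified sketch: the cost values are made up, all requests are treated as in flight at once, and the `least_loaded` strategy assumes the balancer knows each request's cost, which a real one usually can only approximate (for example, through open connection counts).

```python
def round_robin_load(costs, n):
    # Assign request i to server i mod n, regardless of cost.
    loads = [0] * n
    for i, cost in enumerate(costs):
        loads[i % n] += cost
    return loads


def least_loaded(costs, n):
    # Assign each request to the server with the least accumulated work.
    loads = [0] * n
    for cost in costs:
        target = loads.index(min(loads))
        loads[target] += cost
    return loads


# Every third request is expensive (hypothetical cost units).
costs = [8, 1, 1, 8, 1, 1, 8, 1, 1]
print(round_robin_load(costs, 3))  # [24, 3, 3]
print(least_loaded(costs, 3))      # [10, 9, 11]
```

Round robin gives each server the same number of requests, yet one server ends up with eight times the work of the others.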
