Choosing the Right Load Balancing Algorithm Type

Last updated: August 19, 2025

We recently released the ability to choose how requests are routed for HTTP(S) ALBs on Aptible through the load_balancing_algorithm_type field. The available options are round robin, least outstanding requests, and weighted random, which match the routing algorithms offered by AWS. Depending on the needs of your application, one algorithm may be better suited than the others.

Round Robin / Weighted Random

Round Robin and Weighted Random are grouped together here because they route in a similar fashion. Both algorithms distribute requests evenly across all healthy containers. Round robin does this sequentially, cycling through the containers in order, whereas weighted random routes each request to a randomly chosen target, so the split is even on average rather than in strict rotation. These algorithms are commonly used when incoming requests are similar in complexity or when you need to distribute requests equally among containers. On Aptible, the default setting is round robin.
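
To make the difference concrete, here is a minimal Python sketch of the two selection strategies. It illustrates the idea only and is not how the ALB is implemented; the container names, the equal weights, and the fixed random seed are assumptions made for the example.

```python
import itertools
import random
from collections import Counter

# Hypothetical healthy containers behind the load balancer.
containers = ["web-1", "web-2", "web-3"]

def round_robin(targets):
    """Yield targets in order, wrapping around at the end of the list."""
    return itertools.cycle(targets)

def weighted_random(targets, weights, rng):
    """Yield a randomly chosen target per request, in proportion to its weight."""
    while True:
        yield rng.choices(targets, weights=weights, k=1)[0]

rr = round_robin(containers)
rr_picks = [next(rr) for _ in range(6)]
# rr_picks == ['web-1', 'web-2', 'web-3', 'web-1', 'web-2', 'web-3']

# Equal weights (an assumption here) give an even split on average,
# but not in a predictable order.
rng = random.Random(0)
wr = weighted_random(containers, [1, 1, 1], rng)
wr_counts = Counter(next(wr) for _ in range(3000))
```

Over many requests, each container in wr_counts ends up with roughly a third of the traffic, while rr_picks shows the strict rotation that round robin produces.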

Least Outstanding Requests

As the name implies, the Least Outstanding Requests (LOR) routing algorithm routes requests to the container with the lowest number of in-progress requests. This is great for services that expect requests of varied complexity as it avoids scenarios where simple requests are routed to containers waiting for lengthier requests to finish.
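
The selection rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the load balancer's actual code: the in_flight counters and container names are hypothetical, and ties are broken here by dictionary order, which may differ from what AWS does.

```python
def pick_least_outstanding(in_flight):
    """Return the container with the fewest in-progress requests.

    Ties go to whichever container comes first in the dict (an assumption).
    """
    return min(in_flight, key=in_flight.get)

# Hypothetical snapshot of in-progress requests per container.
in_flight = {"web-1": 4, "web-2": 1, "web-3": 2}

target = pick_least_outstanding(in_flight)  # "web-2" has the fewest
in_flight[target] += 1   # the request starts on the chosen container
count_while_running = in_flight[target]

in_flight[target] -= 1   # the response completes, freeing the slot
```

The important detail is that the counter goes up when a request is dispatched and comes back down only when it finishes, so a container stuck on slow requests stays "busy" in the eyes of the router.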

The LOR algorithm is a good choice for:

Workloads with a mix of fast and slow endpoints or variable payload sizes.
Backends where a small number of outlier tasks temporarily consume much more CPU than typical requests.
Apps that occasionally perform blocking work, such as synchronous file encryption, PDF rendering, image and video transforms, large JSON diffing, or on-the-fly report generation.

Concrete example: consider an API that mostly serves lightweight GETs, but where a small percentage of requests hit a synchronous “export report” endpoint. The export fans out to the database, runs several aggregations, and streams the result while computing checksums. A container that picks up a few of these exports can sit at elevated CPU usage for minutes. With round robin, new traffic keeps arriving at that hot container on schedule. With LOR, new requests are steered toward cooler containers with fewer outstanding requests. This tends to lower P95 and P99 latency during those periods.
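
The effect can be reproduced with a toy simulation. The sketch below makes strong simplifying assumptions that are not part of Aptible or AWS: time moves in discrete ticks, one request arrives per tick, each container works on one request at a time in FIFO order, and the trace (one 10-tick export followed by nine 1-tick GETs) is invented for illustration.

```python
from itertools import cycle

def simulate(service_times, n_containers, policy):
    """Return per-request latency (finish tick minus arrival tick).

    Toy model: one arrival per tick, one request at a time per container.
    """
    free_at = [0] * n_containers                  # tick when each container is idle again
    finishes = [[] for _ in range(n_containers)]  # finish ticks of requests sent to each
    rr = cycle(range(n_containers))
    latencies = []
    for arrival, service in enumerate(service_times):
        if policy == "round_robin":
            c = next(rr)
        else:  # least outstanding requests
            # A request is outstanding if it has not finished by this tick.
            outstanding = [sum(f > arrival for f in fs) for fs in finishes]
            c = outstanding.index(min(outstanding))  # ties go to the lowest index
        start = max(arrival, free_at[c])  # wait if the container is still busy
        free_at[c] = start + service
        finishes[c].append(free_at[c])
        latencies.append(free_at[c] - arrival)
    return latencies

# Hypothetical trace: one slow export, then nine fast GETs.
trace = [10] + [1] * 9
rr_latencies = simulate(trace, 2, "round_robin")
lor_latencies = simulate(trace, 2, "lor")
# rr_latencies  == [10, 1, 9, 1, 8, 1, 7, 1, 6, 1]
# lor_latencies == [10, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

In this model, round robin keeps sending every other GET to the container that is busy with the export, so those requests wait behind it. LOR sends them to the idle container instead, and only the export itself is slow. Real containers handle requests concurrently, so the gap in production will be less stark, but the direction of the effect is the same.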

What to watch out for: If most requests are truly uniform and short, LOR offers little advantage. Round robin or weighted random will perform similarly, with less variance.