Connection Multiplexing
HTTP/2 advantage.
Overview
HTTP/2 multiplexing runs multiple concurrent requests over a single TCP connection. HTTP/1.1 effectively limited each connection to one request in flight at a time (browsers worked around this by opening around six connections per origin); HTTP/2 lifts that limit and adds header compression, stream prioritization, and lower per-request overhead. The discipline lies in deploying HTTP/2 end-to-end (origin, CDN, client) and monitoring for the failure modes that survive even with multiplexing in place.
- Concurrent streams. Many requests share one connection; this replaces the one-request-at-a-time-per-connection model that limited HTTP/1.1.
- Header compression. HPACK reduces header bytes; per-request overhead drops, especially for repetitive headers.
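- Stream prioritization. Clients can signal which requests matter most so the server can favor them on the shared connection; this supports critical-path optimization, though how closely servers honor the hints varies.
- Lower latency plus end-to-end requirement. Requests no longer queue behind one another at the HTTP layer; the savings only land if origin, CDN, and client all support HTTP/2.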
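To make the concurrency difference concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes every request costs exactly one round trip and ignores bandwidth, handshakes, and packet loss. The function names are hypothetical, the default of six connections mirrors the browser workaround described above, and the stream limit of 100 is an illustrative value rather than a measured one.

```python
import math


def http1_round_trips(num_requests: int, connections: int = 6) -> int:
    """Round trips needed when each connection carries one request at a time."""
    return math.ceil(num_requests / connections)


def http2_round_trips(num_requests: int, max_concurrent_streams: int = 100) -> int:
    """Round trips needed when one connection carries many streams at once."""
    return math.ceil(num_requests / max_concurrent_streams)


if __name__ == "__main__":
    page_requests = 30  # assumed number of assets on a page
    print(http1_round_trips(page_requests))  # 5
    print(http2_round_trips(page_requests))  # 1
```

Under these simplifying assumptions a 30-request page needs five request waves over six HTTP/1.1 connections but a single wave over one HTTP/2 connection. Real gains depend on bandwidth, loss, and server behavior.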
The approach
The practical approach is to enable HTTP/2 on the origin (nginx and ALB both support it), use modern HTTP client libraries that negotiate HTTP/2, monitor concurrent streams against the per-connection limit to catch saturation, watch for TCP-level head-of-line blocking (HTTP/3 over QUIC fixes it where it matters), and document the per-tier protocol so the chain from origin to client is reviewable.
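- HTTP/2 on origin. nginx and ALB support HTTP/2 natively; enable it at the origin and verify every hop separately, since a proxy can accept HTTP/2 from clients while still speaking HTTP/1.1 to its backend.
- HTTP/2 client libraries. Modern HTTP clients negotiate HTTP/2 during the TLS handshake (via ALPN); a client that only speaks HTTP/1.1 forfeits multiplexing even when the server supports it.
- Monitor max concurrent streams. Each connection has a stream limit (SETTINGS_MAX_CONCURRENT_STREAMS, advertised by the peer); once it is reached, new requests queue until a stream closes.
- Watch for HOL blocking plus documented chain. TCP-level head-of-line blocking still exists: one lost packet stalls every stream on the connection until it is retransmitted (HTTP/3 over QUIC fixes it). Commit the per-tier protocol to the architecture documentation.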
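One way to keep the chain reviewable is to record each hop as data and check it mechanically. The sketch below is illustrative only: the `Hop` class, the `review_chain` function, the hop names, the stream counts, and the 80% threshold are all assumptions invented for this example, not part of any real tool. It flags hops that fall back to HTTP/1.1 and hops whose peak concurrent streams approach the advertised limit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """One documented tier in the request path (hypothetical schema)."""
    name: str
    protocol: str  # e.g. "h2", "http/1.1", "h3"
    max_concurrent_streams: Optional[int] = None  # limit advertised on this hop
    peak_active_streams: int = 0  # highest value seen in telemetry


def review_chain(hops: List[Hop], saturation_threshold: float = 0.8) -> List[str]:
    """Return human-readable findings for hops that need attention."""
    findings = []
    for hop in hops:
        if hop.protocol == "http/1.1":
            findings.append(f"{hop.name}: falls back to HTTP/1.1, multiplexing is lost here")
            continue
        if hop.max_concurrent_streams:
            utilization = hop.peak_active_streams / hop.max_concurrent_streams
            if utilization >= saturation_threshold:
                findings.append(
                    f"{hop.name}: peak streams at {utilization:.0%} of the limit"
                )
    return findings


if __name__ == "__main__":
    # Assumed example values; replace with the documented chain and real telemetry.
    chain = [
        Hop("client-to-cdn", "h2", max_concurrent_streams=100, peak_active_streams=40),
        Hop("cdn-to-lb", "h2", max_concurrent_streams=128, peak_active_streams=120),
        Hop("lb-to-origin", "http/1.1"),
    ]
    for finding in review_chain(chain):
        print(finding)
```

Keeping the chain in a structured form like this means a protocol downgrade or a saturated hop shows up in review rather than in an incident.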
Why this compounds
HTTP/2 multiplexing compounds across services. Each connection serves many requests; per-request overhead stays low; the team builds intuition for HTTP performance that pays off when HTTP/3 over QUIC becomes the next migration. Without the discipline, services run on HTTP/1.1 long past the point where the concurrency cost matters.
- Lower latency. Requests share a connection instead of waiting in line; user-facing latency drops because a request no longer stalls behind earlier requests on the same connection.
- Lower resource use. Fewer connections per client; the server holds fewer sockets and the load balancer routes fewer flows.
- Better mobile experience. Fewer TLS handshakes; mobile clients pay the handshake cost on every reconnect, and multiplexing reduces the number of connections that must be re-established.
- Institutional knowledge. Stream-level monitoring teaches HTTP/2 patterns; the team learns where multiplexing helps and where TCP HOL blocking still bites.
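The resource and handshake savings can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions: every client holds a fixed number of connections, each connection needs its own TLS handshake after a reconnect, and the client counts and reconnect fraction are made-up illustrative numbers. The `ConnectionFootprint` class and `footprint` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ConnectionFootprint:
    sockets: int  # connections the server side must hold open
    reconnect_handshakes: int  # TLS handshakes when part of the fleet reconnects


def footprint(clients: int, connections_per_client: int,
              reconnecting_fraction: float) -> ConnectionFootprint:
    """Estimate server sockets and handshake load for a client population."""
    sockets = clients * connections_per_client
    reconnecting_clients = round(clients * reconnecting_fraction)
    handshakes = reconnecting_clients * connections_per_client
    return ConnectionFootprint(sockets, handshakes)


if __name__ == "__main__":
    # Assumed fleet: 10,000 clients, a quarter of which reconnect (e.g. a network change).
    http1 = footprint(10_000, connections_per_client=6, reconnecting_fraction=0.25)
    http2 = footprint(10_000, connections_per_client=1, reconnecting_fraction=0.25)
    print(http1)  # ConnectionFootprint(sockets=60000, reconnect_handshakes=15000)
    print(http2)  # ConnectionFootprint(sockets=10000, reconnect_handshakes=2500)
```

In this model, moving from six connections per client to one cuts both held sockets and reconnect handshakes by a factor of six. Actual ratios depend on how many connections clients really open.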
HTTP/2 multiplexing is an operational discipline that pays off over years. Nova AI Ops integrates with HTTP telemetry, surfaces stream patterns, and supports the team’s performance discipline.