The Evolution of HTTP

From a one-line protocol in 1991 to UDP-based multiplexed streams in 2022 — how HTTP grew up, what each version fixed, and what it broke.

DATE:
APR.28.2026
READ:
12 MIN

The timeline

Five versions in thirty-one years. Each one fixed something the previous version got wrong, and each one revealed a new bottleneck hiding underneath.

+----------+------+----------+-----------+--------------------+
| Version  | Year | RFC      | Transport | Key Innovation     |
+----------+------+----------+-----------+--------------------+
| HTTP/0.9 | 1991 | None     | TCP       | The web exists     |
+----------+------+----------+-----------+--------------------+
| HTTP/1.0 | 1996 | RFC 1945 | TCP       | Headers and status |
|          |      |          |           | codes              |
+----------+------+----------+-----------+--------------------+
| HTTP/1.1 | 1997 | RFC 2068 | TCP       | Persistent         |
|          |      |          |           | connections        |
+----------+------+----------+-----------+--------------------+
| HTTP/2   | 2015 | RFC 7540 | TCP       | Multiplexing       |
+----------+------+----------+-----------+--------------------+
| HTTP/3   | 2022 | RFC 9114 | QUIC/UDP  | No head-of-line    |
|          |      |          |           | blocking           |
+----------+------+----------+-----------+--------------------+

HTTP/0.9 was a single line: GET /page.html. No headers, no status codes, no content types. The server returned HTML and closed the connection. That was it. Tim Berners-Lee needed a way to fetch hypertext documents, not a protocol that would one day carry video streams, API calls, and WebSocket upgrades. Everything after 0.9 is the story of a protocol outgrowing its original purpose.

The core problem each version solved

HTTP/1.0: “How do we describe what we’re sending?”

The web grew beyond HTML. Images, CSS, JavaScript. The protocol needed a way to say “this is a JPEG, it’s 45KB, and here’s a status code telling you whether the request worked.” Headers solved that. Without them, a browser couldn’t distinguish a 200 from a 404, or an image from a stylesheet.
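
What that looks like on the wire, as a minimal Python sketch (the host is illustrative; any server still answering plain HTTP on port 80 will do). Under HTTP/0.9 the request would have been the single line GET / and the reply raw HTML; with HTTP/1.0 the request carries headers and the reply starts with a status line:

    import socket

    HOST = "example.com"  # illustrative; any plain-HTTP server works

    request = (
        "GET / HTTP/1.0\r\n"      # request line: method, path, protocol version
        f"Host: {HOST}\r\n"       # headers are new in 1.0 (0.9 had none)
        "Accept: text/html\r\n"
        "\r\n"                    # blank line ends the header block
    )

    with socket.create_connection((HOST, 80), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        response = b""
        while chunk := sock.recv(4096):   # a 1.0 server closes the connection when done
            response += chunk

    head, _, body = response.partition(b"\r\n\r\n")
    print(head.decode("iso-8859-1"))      # status line ("HTTP/1.0 200 OK") plus headers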

HTTP/1.1: “How do we stop opening a new TCP connection for every image?”

A typical 1996 web page had maybe 10 resources. Each one required a full TCP handshake, transfer, and teardown. HTTP/1.1 made connections persistent by default (keep-alive) and added pipelining (sending multiple requests without waiting for responses). Pipelining was a good idea that worked poorly in practice — most servers and proxies couldn’t handle it correctly, so browsers disabled it.
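
A small sketch of what persistence buys you, using Python's standard library (the host is illustrative, and it assumes the server honors keep-alive, which most do): two requests ride over the same TCP and TLS connection, so only the first one pays the handshake cost.

    import http.client

    conn = http.client.HTTPSConnection("example.com", timeout=5)

    for _ in range(2):
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()   # drain the body so the connection can be reused
        # The local port stays the same across iterations: one connection, two exchanges.
        print(resp.status, conn.sock.getsockname())

    conn.close()

With HTTP/1.0 semantics, each iteration would have needed a fresh TCP (and TLS) handshake before the request could even be sent.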

For the full story on what 1.0 introduced and how 1.1 tried to fix its connection model, see the HTTP/1.0 and HTTP/1.1 deep dive.

HTTP/2: “How do we stop waiting for one response before starting the next?”

Even with keep-alive, HTTP/1.1 is fundamentally serial per connection. Response A must complete before response B starts. Browsers worked around this by opening 6 parallel connections per domain, which led to the domain-sharding hack. HTTP/2 replaced this with multiplexing: a single TCP connection carries interleaved streams, each with its own flow control. Google shipped this as SPDY first, then the IETF standardized it.
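
Concretely, a sketch with httpx (an assumption; any HTTP/2-capable client works, and this one needs the h2 extra, installed with pip install "httpx[http2]"): five requests fired concurrently share one connection, and the frames of their responses interleave on the wire.

    import asyncio
    import httpx

    async def main() -> None:
        async with httpx.AsyncClient(http2=True) as client:
            urls = ["https://example.com/"] * 5   # illustrative target
            responses = await asyncio.gather(*(client.get(u) for u in urls))
            for r in responses:
                # http_version reads "HTTP/2" when the server negotiated it via ALPN;
                # otherwise httpx quietly falls back to HTTP/1.1.
                print(r.status_code, r.http_version)

    asyncio.run(main())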

Details on framing, stream priorities, and server push are in the HTTP/2 deep dive.

HTTP/3: “How do we stop TCP from blocking everything when one packet is lost?”

HTTP/2 solved application-level head-of-line blocking but ran straight into transport-level blocking. TCP guarantees in-order delivery. One lost packet stalls every stream on the connection until retransmission completes. Google’s answer was QUIC: a UDP-based transport that gives each stream its own delivery guarantee. Lost packet on stream 4? Streams 1, 2, and 3 keep flowing.

The QUIC handshake, connection migration, and 0-RTT resumption are covered in the HTTP/3 deep dive.

Connection setup comparison

The cost of the first byte keeps dropping. Each version shaves RTTs off the initial handshake.

+----------+---------------+--------------------+--------------------+--------------------+
| Version  | Transport     | TLS Handshake      | Total First        | Subsequent         |
|          | Handshake     |                    | Request            | Requests           |
+----------+---------------+--------------------+--------------------+--------------------+
| HTTP/1.0 | 1 RTT         | None               | 1 RTT              | New connection per |
|          |               |                    |                    | request            |
+----------+---------------+--------------------+--------------------+--------------------+
| HTTP/1.1 | 1 RTT         | 2 RTT (TLS 1.2)    | 3 RTT              | 0 RTT (keep-alive) |
+----------+---------------+--------------------+--------------------+--------------------+
| HTTP/2   | 1 RTT         | 1 RTT (TLS 1.3)    | 2 RTT              | Single connection, |
|          |               |                    |                    | multiplexed        |
+----------+---------------+--------------------+--------------------+--------------------+
| HTTP/3   | 1 RTT         | Combined           | 1 RTT              | 0 RTT on           |
|          |               | (QUIC+TLS)         |                    | resumption         |
+----------+---------------+--------------------+--------------------+--------------------+

HTTP/3’s combined handshake is the key win. QUIC merges the transport and cryptographic handshakes into a single round trip. On session resumption, the client can send application data immediately — zero round trips before the first HTTP request leaves. For mobile users on high-latency cellular connections, this is the difference between a perceptible delay and an imperceptible one.
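
To put rough numbers on it, here is the table's setup cost at an assumed 100 ms round trip, roughly a mid-range cellular link (the RTT is a made-up figure for illustration):

    RTT_MS = 100  # assumed round-trip time; vary to taste

    setup_rtts = {
        "HTTP/1.0, no TLS": 1,          # TCP handshake only
        "HTTP/1.1 + TLS 1.2": 3,        # TCP plus a 2-RTT TLS handshake
        "HTTP/2 + TLS 1.3": 2,          # TCP plus a 1-RTT TLS handshake
        "HTTP/3 (QUIC)": 1,             # combined transport and crypto handshake
        "HTTP/3, 0-RTT resumption": 0,  # the request rides along with the handshake
    }

    for name, rtts in setup_rtts.items():
        print(f"{name:26} {rtts * RTT_MS:4d} ms before the request leaves")

At 100 ms, that is 300 ms of pure waiting for HTTP/1.1 over TLS 1.2 versus 100 ms for HTTP/3, and nothing at all on a resumed QUIC session.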

Head-of-line blocking across versions

This is the throughline of the entire HTTP evolution. Every major version change was ultimately about removing a layer of head-of-line blocking.

+----------+--------------------+--------------------+
| Version  | Application-Level  | Transport-Level    |
|          | HOL                | HOL                |
+----------+--------------------+--------------------+
| HTTP/1.0 | N/A (one request   | N/A                |
|          | per connection)    |                    |
+----------+--------------------+--------------------+
| HTTP/1.1 | Yes (responses     | Yes (TCP)          |
|          | must be ordered)   |                    |
+----------+--------------------+--------------------+
| HTTP/2   | No (multiplexed    | Yes (TCP blocks    |
|          | streams)           | all streams on     |
|          |                    | packet loss)       |
+----------+--------------------+--------------------+
| HTTP/3   | No                 | No (independent    |
|          |                    | QUIC streams)      |
+----------+--------------------+--------------------+

Imagine 10 resources loading. In HTTP/1.1, each connection handles one at a time. You open 6 connections and serialize resources across them. HTTP/2 sends all 10 over one connection as interleaved frames, but one lost packet at the TCP layer freezes all 10 streams until retransmission completes. HTTP/3 sends all 10 over independent QUIC streams — a lost packet only freezes the one affected resource. The other nine continue uninterrupted.

This is why HTTP/2 can actually perform worse than HTTP/1.1 on lossy networks. Six independent TCP connections mean a lost packet only blocks one-sixth of your resources. One multiplexed TCP connection means a lost packet blocks everything. HTTP/3 finally gives you the best of both: multiplexing without the shared-fate problem.

Header compression evolution

Headers are repetitive. A browser sends the same User-Agent, Accept, Cookie, and Authorization headers on every request. As web applications grew more header-heavy (especially cookies), this redundancy became a real bandwidth cost.

+----------+--------------------+--------------------+
| Version  | Compression        | Typical Overhead   |
+----------+--------------------+--------------------+
| HTTP/1.0 | None (plain text)  | ~500-800 bytes per |
|          |                    | request            |
+----------+--------------------+--------------------+
| HTTP/1.1 | None (same as 1.0) | ~500-800 bytes per |
|          |                    | request            |
+----------+--------------------+--------------------+
| HTTP/2   | HPACK (static +    | ~76% compression   |
|          | dynamic table,     |                    |
|          | Huffman encoding)  |                    |
+----------+--------------------+--------------------+
| HTTP/3   | QPACK (HPACK       | Similar to HPACK   |
|          | redesigned for     |                    |
|          | out-of-order       |                    |
|          | delivery)          |                    |
+----------+--------------------+--------------------+

HPACK maintains two tables: a static table of 61 common header field/value pairs (:method: GET, :status: 200, etc.) and a dynamic table that accumulates headers seen during the connection. Repeated headers are sent as integer indices instead of strings, and new values get Huffman-encoded.
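
You can watch the dynamic table do its work with the hpack package (pip install hpack; the header values below are made up). The first encoding of a header block pays for the string literals; a repeat of the same block collapses to a handful of table indices:

    from hpack import Decoder, Encoder

    headers = [
        (":method", "GET"),
        (":path", "/api/items"),
        ("user-agent", "example-client/1.0"),     # illustrative values
        ("cookie", "session=abc123; theme=dark"),
    ]

    encoder = Encoder()
    first = encoder.encode(headers)    # literals, Huffman-encoded, added to the dynamic table
    second = encoder.encode(headers)   # mostly one-byte references into that table
    print(len(first), "bytes on the first request")
    print(len(second), "bytes on a repeat request")

    decoder = Decoder()
    decoder.decode(first)              # keeps the decoder's table state in sync
    print(decoder.decode(second))      # round-trips back to the original headers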

QPACK does the same thing but solves a subtle problem: HPACK’s dynamic table requires in-order delivery (the encoder and decoder must agree on table state), which would reintroduce a head-of-line blocking dependency over QUIC. QPACK moves dynamic-table updates onto dedicated encoder and decoder streams, so header blocks on request streams can be processed out of order without the two sides losing agreement on table state.

What developers had to change

Every HTTP version shift invalidated some performance “best practices.” The patterns that made HTTP/1.1 fast actively hurt HTTP/2 performance.

+--------------------+--------------------+--------------------+
| Optimization       | HTTP/1.1 Era       | HTTP/2+ Era        |
+--------------------+--------------------+--------------------+
| Domain sharding    | Required (6 conn   | Harmful (breaks    |
|                    | limit)             | multiplexing)      |
+--------------------+--------------------+--------------------+
| CSS sprites        | Common             | Unnecessary        |
|                    | optimization       |                    |
+--------------------+--------------------+--------------------+
| JS/CSS bundling    | Essential          | Optional (granular |
|                    |                    | caching better)    |
+--------------------+--------------------+--------------------+
| Resource inlining  | Useful             | Counterproductive  |
|                    |                    | (prevents caching) |
+--------------------+--------------------+--------------------+
| Connection:        | Default (opt-in in | Header forbidden;  |
| keep-alive         | HTTP/1.0 only)     | always persistent  |
+--------------------+--------------------+--------------------+

Domain sharding is the most dramatic reversal. In the HTTP/1.1 era, you’d serve images from img1.example.com, img2.example.com, etc. to work around the 6-connection-per-domain limit. Under HTTP/2, each sharded domain requires its own TCP connection and its own TLS handshake, defeating the multiplexing that makes HTTP/2 fast. A single origin with one multiplexed connection outperforms four sharded domains with four separate connections.

The bundling story is more nuanced. Giant JS bundles still make sense for initial load if your code changes as a unit. But HTTP/2’s stream multiplexing makes it practical to serve many small files, each with its own cache key. Change one utility function? Invalidate 2KB instead of 400KB.

Adoption snapshot (April 2026)

+----------+--------------------+--------------+-----------------+
| Protocol | Websites (W3Techs) | All Requests | Browser Support |
+----------+--------------------+--------------+-----------------+
| HTTP/1.1 | ~26%               | ~15%         | 100%            |
+----------+--------------------+--------------+-----------------+
| HTTP/2   | ~35%               | ~51%         | 97%+            |
+----------+--------------------+--------------+-----------------+
| HTTP/3   | ~39%               | ~21%         | ~96%            |
+----------+--------------------+--------------+-----------------+

The adoption numbers tell an interesting story. HTTP/3 leads in website count because Cloudflare, which proxies a huge fraction of the web, enables it by default. But HTTP/2 leads in request volume because high-traffic APIs and services behind load balancers tend to be explicitly configured for HTTP/2 and haven’t yet upgraded their infrastructure to support QUIC.

The pattern

Each version followed the same arc:

  1. Real-world problem identified at scale (usually by Google).
  2. A vendor ships an experimental workaround or protocol (the pre-1.1 keep-alive hack, Google’s SPDY and gQUIC) and deploys it at scale across its own browser and servers.
  3. The IETF standardizes a cleaner version, incorporating feedback from other browser vendors and CDN operators.
  4. Browsers adopt the standard. CDNs follow within months. The long tail of servers catches up over years.
  5. A new layer of the stack reveals the next bottleneck.

This pattern holds remarkably well. HTTP/1.1’s persistent connections exposed application-level HOL blocking. HTTP/2’s multiplexing exposed transport-level HOL blocking. HTTP/3’s independent streams exposed… well, we’re still finding out. Connection migration and 0-RTT are the current frontier, and the next bottleneck is likely in the application layer again — how servers prioritize and schedule responses across streams.

Which version should you use?

The practical decision depends on what you’re running:

  • Static site behind a CDN — You’re already on HTTP/3. Nothing to do. The CDN negotiates the best protocol the client supports.
  • Self-hosted with Nginx — Enable HTTP/2 (listen 443 ssl http2). For HTTP/3, you need nginx 1.25+ compiled with a QUIC library (quictls or BoringSSL) and a listen 443 quic directive. It works, but the ecosystem is still maturing. (A quick way to check what clients actually negotiate is sketched after this list.)
  • Internal APIs — HTTP/1.1 is fine. Datacenter links have sub-millisecond latency and negligible packet loss. The RTT savings of HTTP/2 don’t matter when your RTT is already near zero. The complexity of debugging multiplexed streams isn’t worth it.
  • Mobile-first application — HTTP/3 provides the biggest win. High-latency cellular connections (50-200ms RTT), frequent network switches (Wi-Fi to LTE), and lossy radio links are exactly the conditions QUIC was designed for. Connection migration means a network switch doesn’t kill the user’s in-flight request.
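
For the self-hosted case above, a quick way to see what clients actually negotiate, sketched with httpx (an assumption; it speaks HTTP/1.1 and HTTP/2 but not HTTP/3, so checking QUIC means reaching for something like curl --http3, available when curl is built with HTTP/3 support):

    import httpx

    # Needs the h2 extra: pip install "httpx[http2]". Point it at your own origin.
    with httpx.Client(http2=True) as client:
        resp = client.get("https://example.com/")
        print(resp.http_version)                  # "HTTP/2" if negotiated via ALPN, else "HTTP/1.1"
        print(resp.headers.get("alt-svc"))        # servers advertise HTTP/3 here, e.g. h3=":443"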

For the implementation details behind each version, start with the HTTP/1.0 and HTTP/1.1 deep dive, then HTTP/2, then HTTP/3.

The web started with a protocol that could do exactly one thing: fetch a document. Thirty years of engineering didn’t change that fundamental purpose — it just removed every obstacle between the request and the response.