HTTP/2: Multiplexing and Binary Frames

How SPDY became a standard, why binary framing replaced text, and the TCP head-of-line blocking problem HTTP/2 could not escape.

DATE:
APR.28.2026
READ:
16 MIN

HTTP/1.1 served the web for 16 years. It worked, but it worked by opening lots of TCP connections, sending text headers nobody could compress, and blocking every request behind the one before it. By the late 2000s, the cost of that design was measurable on every page load. Google decided to fix it.

This is the second post in a three-part series. See HTTP/1.0 and HTTP/1.1 for the starting point, and HTTP/3 and QUIC for where the protocol goes next.

From SPDY to HTTP/2

Google announced the SPDY protocol (pronounced “speedy”) in November 2009 and shipped it in Chrome 6 in 2010. The goals were straightforward: multiplex requests over a single connection, compress headers, and allow server push. Google reported up to 55% faster page loads in early tests.

By 2012, Chrome, Firefox, and Opera all supported SPDY. Google, Twitter, and Facebook had deployed it on their servers. The protocol had proved that HTTP’s performance problems were solvable without breaking the web.

In 2012, the IETF HTTPbis working group announced a call for proposals for HTTP/2.0. SPDY was selected as the starting point. The working group, chaired by Mark Nottingham, spent three years refining the spec, removing SPDY-specific quirks, and formalizing the framing layer.

In May 2015, RFC 7540 was published as a Proposed Standard, alongside RFC 7541 for HPACK header compression. HTTP/2 was official.

In June 2022, RFC 9113 obsoleted RFC 7540 with clarifications and errata fixes. The most notable change: the PRIORITY frame and its dependency tree model were deprecated. Experience showed that few implementations used priority signaling correctly, and the complexity it added was not justified by real-world benefit.

Binary framing layer

HTTP/1.1 is a text protocol. Request lines, header fields, and chunked-transfer boundaries are all ASCII text parsed with string operations. This is human-readable. It is also ambiguous, slow to parse, and impossible to multiplex without framing hacks.

HTTP/2 replaces this with a binary framing layer. Every piece of data on the wire is encoded as a frame with a fixed-format header.

Frame structure

++
+-----------------------------------------------+
|                Length (24 bits)               |
+---------------+-------------------------------+
| Type (8 bits) |        Flags (8 bits)         |
+-+---------------------------------------------+
|R|         Stream Identifier (31 bits)         |
+-+---------------------------------------------+
|               Frame Payload ...               |
+-----------------------------------------------+
++
  • Length: size of the payload in bytes. Maximum 16,384 bytes by default, configurable up to 16,777,215 (2^24 - 1) via SETTINGS_MAX_FRAME_SIZE.
  • Type: one of 10 defined frame types.
  • Flags: type-specific bit flags (e.g., END_STREAM, END_HEADERS, PADDED).
  • R: reserved bit, must be 0.
  • Stream Identifier: which stream this frame belongs to. Stream 0 is the connection itself.

The 9-byte frame header is fixed. No ambiguity, no variable-length parsing of text delimiters. A parser reads exactly 9 bytes, knows the payload length, and reads exactly that many more bytes. This is why HTTP/2 parsers are simpler to write and less prone to parsing vulnerabilities than their HTTP/1.1 counterparts.
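
To make that parse loop concrete, here is a minimal sketch in Python. The read_exact callback is an assumption standing in for whatever reads exactly n bytes off the connection; everything else follows directly from the header layout above.

++
import struct

FRAME_HEADER_LEN = 9  # 3 bytes length + 1 type + 1 flags + 4 stream ID

def parse_frame_header(header: bytes):
    """Parse the fixed 9-byte HTTP/2 frame header."""
    # Length is a 24-bit integer; struct has no 3-byte format, so pad with a zero byte.
    length = struct.unpack(">I", b"\x00" + header[0:3])[0]
    frame_type = header[3]
    flags = header[4]
    # The top bit of the last 4 bytes is the reserved R bit; mask it off.
    stream_id = struct.unpack(">I", header[5:9])[0] & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

def read_frame(read_exact):
    """Read one frame: exactly 9 header bytes, then exactly `length` payload bytes."""
    length, frame_type, flags, stream_id = parse_frame_header(read_exact(FRAME_HEADER_LEN))
    payload = read_exact(length)  # no delimiter scanning, no ambiguity
    return frame_type, flags, stream_id, payload
++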

Frame types

+---------------+------+----------------------------------------------------+
| Type          | Code | Purpose                                            |
+---------------+------+----------------------------------------------------+
| DATA          | 0x00 | Carries request/response body                      |
+---------------+------+----------------------------------------------------+
| HEADERS       | 0x01 | Opens a stream, carries compressed headers         |
+---------------+------+----------------------------------------------------+
| PRIORITY      | 0x02 | Stream dependency and weight (deprecated in RFC    |
|               |      | 9113)                                              |
+---------------+------+----------------------------------------------------+
| RST_STREAM    | 0x03 | Immediately terminates a stream                    |
+---------------+------+----------------------------------------------------+
| SETTINGS      | 0x04 | Connection-level configuration parameters          |
+---------------+------+----------------------------------------------------+
| PUSH_PROMISE  | 0x05 | Signals server-initiated push (effectively dead)   |
+---------------+------+----------------------------------------------------+
| PING          | 0x06 | Measures RTT, keeps connection alive               |
+---------------+------+----------------------------------------------------+
| GOAWAY        | 0x07 | Initiates graceful connection shutdown             |
+---------------+------+----------------------------------------------------+
| WINDOW_UPDATE | 0x08 | Adjusts flow-control window size                   |
+---------------+------+----------------------------------------------------+
| CONTINUATION  | 0x09 | Continues a sequence of header block fragments     |
+---------------+------+----------------------------------------------------+
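
Because every frame shares that 9-byte header, building one is mechanical. As an illustration (a sketch, not a full implementation), here is the PING frame from the table: type 0x06, an 8-byte opaque payload, always on stream 0:

++
import os
import struct

def build_ping_frame(ack: bool = False) -> bytes:
    """Serialize a PING frame: type 0x06, 8 opaque bytes, stream 0."""
    payload = os.urandom(8)                       # opaque data the peer echoes back
    length = struct.pack(">I", len(payload))[1:]  # 24-bit length field (0x000008)
    frame_type = bytes([0x06])                    # PING
    flags = bytes([0x01 if ack else 0x00])        # the ACK flag marks the reply
    stream_id = struct.pack(">I", 0)              # stream 0 = the connection itself
    return length + frame_type + flags + stream_id + payload
++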

Multiplexing

This is the headline feature. HTTP/1.1 processes requests sequentially on each connection. If you want concurrency, you open more connections (browsers use 6 per origin). HTTP/2 instead multiplexes many concurrent logical streams over a single TCP connection.

Each request-response pair is assigned a stream with a unique integer ID. Clients use odd IDs (1, 3, 5, …), servers use even IDs (2, 4, 6, …). Frames from different streams can interleave freely on the wire:

++
Stream 1: HEADERS → DATA → DATA
Stream 3: HEADERS → DATA
Stream 5: HEADERS → DATA → DATA → DATA

On the wire (interleaved):
[H:1] [H:3] [D:1] [H:5] [D:3] [D:1] [D:5] [D:5] [D:5]
++

This eliminates HTTP-level head-of-line blocking. Stream 3 does not wait for stream 1 to finish. A slow response on one stream does not block fast responses on others.

The spec recommends that implementations allow at least 100 concurrent streams per connection (the limit is advertised via SETTINGS_MAX_CONCURRENT_STREAMS). In practice, servers commonly advertise 100-256.

The effect on browsers: instead of 6 TCP connections per origin, each with its own TLS handshake, slow-start ramp, and memory overhead, you get one connection doing the work of all six. Connection establishment cost drops from 6 * (TCP handshake + TLS handshake) to one.
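
To see multiplexing from application code, here is a hedged sketch using the third-party httpx package (installed with its http2 extra); the URLs are placeholders. All six requests share one TLS connection and run as concurrent streams:

++
import asyncio
import httpx  # third-party; install with: pip install "httpx[http2]"

async def main():
    # With http2=True, concurrent requests to the same origin are
    # multiplexed as streams over a single TCP + TLS connection.
    async with httpx.AsyncClient(http2=True) as client:
        urls = [f"https://example.com/asset{i}.js" for i in range(6)]  # placeholders
        responses = await asyncio.gather(*(client.get(u) for u in urls))
        for r in responses:
            print(r.http_version, r.status_code)  # "HTTP/2" when the server negotiates h2

asyncio.run(main())
++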

HPACK header compression (RFC 7541)

HTTP headers are shockingly redundant. Every request to the same origin repeats Host, User-Agent, Accept, Cookie, and a dozen other fields verbatim. Measurements on real traffic showed headers averaging 800 bytes per request, with Cookie alone often exceeding 1 KB. On a page loading 80 resources, that is 64 KB of repeated headers.

HPACK compresses headers using three mechanisms:

Static table

A predefined table of 61 common header name-value pairs, indexed 1 through 61. These never change and never consume connection state.

+-------+-------------+--------------+
| Index | Header Name | Header Value |
+-------+-------------+--------------+
| 1     | :authority  |              |
+-------+-------------+--------------+
| 2     | :method     | GET          |
+-------+-------------+--------------+
| 3     | :method     | POST         |
+-------+-------------+--------------+
| 4     | :path       | /            |
+-------+-------------+--------------+
| 5     | :path       | /index.html  |
+-------+-------------+--------------+
| 6     | :scheme     | http         |
+-------+-------------+--------------+
| 7     | :scheme     | https        |
+-------+-------------+--------------+
| 8     | :status     | 200          |
+-------+-------------+--------------+
| 13    | :status     | 404          |
+-------+-------------+--------------+
| 14    | :status     | 500          |
+-------+-------------+--------------+

When the encoder sees :method: GET, it emits the single byte 0x82 (indexed representation, index 2). No string transmitted at all.
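
That single-byte encoding is just a 1 bit followed by the table index in a 7-bit prefix integer (RFC 7541, Section 6.1). A minimal sketch, valid for indexes small enough to fit in one byte:

++
def indexed_field(index: int) -> bytes:
    """Encode an HPACK 'indexed header field': 0b1xxxxxxx with the index in 7 bits."""
    assert 1 <= index < 127  # larger indexes need continuation bytes (prefix integers)
    return bytes([0x80 | index])

print(indexed_field(2).hex())  # :method: GET -> "82"
print(indexed_field(8).hex())  # :status: 200 -> "88"
++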

Dynamic table

A table that grows during the connection lifetime. When a new header is encoded, it can be added to the dynamic table. Subsequent references to the same header use the index instead of repeating the full string. The table is FIFO — old entries are evicted when it reaches its maximum size (default 4,096 bytes, configurable via SETTINGS_HEADER_TABLE_SIZE).

Huffman encoding

String literals that are not in either table are Huffman-encoded using a fixed code table specified in RFC 7541, Appendix B. The code table was derived from a large sample of HTTP headers, so common characters like lowercase ASCII letters get short codes.

The combined effect: Cloudflare measured 76% average header compression across their network. On subsequent requests to the same origin, compression ratios above 90% are common because the dynamic table has already learned most headers.
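
The dynamic-table effect is easy to observe with the third-party hpack package, a Python implementation of HPACK. The header values below are made up; the point is the size of the second encoding once the table has learned them:

++
from hpack import Encoder  # third-party; install with: pip install hpack

encoder = Encoder()
headers = [
    (":method", "GET"),
    (":path", "/app.js"),                # example values, not from real traffic
    (":authority", "example.com"),
    ("user-agent", "demo-client/1.0"),
    ("cookie", "session=" + "x" * 200),
]

first = encoder.encode(headers)   # literals, inserted into the dynamic table
second = encoder.encode(headers)  # mostly one-byte dynamic-table references
print(len(first), len(second))    # the second block is a small fraction of the first
++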

Server Push (and its death)

HTTP/2 introduced server push via the PUSH_PROMISE frame. The idea: when a client requests index.html, the server already knows it will need style.css and app.js. So it sends a PUSH_PROMISE for each, then pushes the responses without waiting for the client to ask.

In theory, this eliminates a round trip. In practice, it was a disaster.

Why server push failed:

  • Cache invalidation. The server does not know what the client already has cached. Pushing a resource the client already has wastes bandwidth. A client could reject a push with RST_STREAM (error code CANCEL), but by then the bytes were often already in flight.
  • Bandwidth waste. Pushed resources compete with requested resources for bandwidth. A server that aggressively pushes can delay the resources the client actually needs right now.
  • Priority inversion. Pushed streams often arrived at the wrong priority relative to what the browser needed.
  • Complexity for marginal gain. Correctly implementing push required deep knowledge of each page’s resource graph, cache state, and priority. Almost nobody got it right.

Adoption never exceeded 1% of HTTP/2-enabled sites. The browsers killed it:

  • Chrome 106 (October 2022): server push disabled by default.
  • Firefox 132 (October 2024): server push support removed entirely.

No major browser supports HTTP/2 server push today. The replacement is 103 Early Hints (RFC 8297), which sends Link headers in a 1xx response to tell the browser what to preload. It is simpler, cache-aware (the browser decides whether to fetch), and actually works.

Flow control

HTTP/2 includes a credit-based flow control system to prevent a fast sender from overwhelming a slow receiver. It operates at two levels:

  • Per-stream: each stream has its own flow-control window.
  • Connection-level: the entire connection has an aggregate window.

The initial window size for both is 65,535 bytes (the maximum value of a 16-bit unsigned integer). Either endpoint can increase the window by sending WINDOW_UPDATE frames.

Key rules:

  • Only DATA frames are flow-controlled. HEADERS, SETTINGS, and other control frames are not.
  • Flow control cannot be disabled. The spec explicitly forbids it.
  • A sender must not send more DATA bytes than the receiver’s window allows. Violating this is a connection error (FLOW_CONTROL_ERROR).
  • WINDOW_UPDATE replenishes the budget. A receiver sends WINDOW_UPDATE after consuming data to grant more credit.
++
Receiver window: 65535 bytes
Sender sends DATA (16384 bytes) → window: 49151
Sender sends DATA (16384 bytes) → window: 32767
Sender sends DATA (16384 bytes) → window: 16383
Receiver sends WINDOW_UPDATE (49152) → window: 65535
++

This design lets a receiver back-pressure a specific stream without affecting others, or throttle the entire connection. It also means a buggy implementation that never sends WINDOW_UPDATE will eventually stall the connection.
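
A toy model of the sender-side bookkeeping, replaying the exchange above (a real sender tracks both the stream window and the connection window and respects whichever is smaller):

++
class FlowControlWindow:
    """Sender-side view of one flow-control window (per stream or per connection)."""

    def __init__(self, initial: int = 65_535):
        self.window = initial

    def consume(self, n: int) -> None:
        """Account for n DATA bytes sent; exceeding the window is a protocol error."""
        if n > self.window:
            raise RuntimeError("FLOW_CONTROL_ERROR")
        self.window -= n

    def replenish(self, increment: int) -> None:
        """Apply the credit granted by a received WINDOW_UPDATE frame."""
        self.window += increment

w = FlowControlWindow()
for _ in range(3):
    w.consume(16_384)   # three max-size DATA frames
print(w.window)         # 16383
w.replenish(49_152)     # WINDOW_UPDATE
print(w.window)         # 65535
++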

TCP head-of-line blocking — the problem HTTP/2 cannot escape

HTTP/2 multiplexes many streams over a single TCP connection. TCP guarantees in-order byte delivery. These two facts combine into a fundamental problem.

When a single TCP packet is lost, the kernel’s receive buffer holds all subsequent packets — including packets for completely unrelated HTTP/2 streams — until the lost packet is retransmitted and received. Every stream stalls, not just the one that lost data.

++
Stream 1: [packet A] [packet B lost] [packet C]
Stream 3: [packet D] [packet E]

TCP receive buffer:
  packet A → delivered
  packet B → lost, waiting for retransmit
  packet C → buffered, cannot deliver (out of order)
  packet D → buffered, cannot deliver (out of order)
  packet E → buffered, cannot deliver (out of order)

All streams blocked until packet B arrives.
++

Research by Google and Fastly showed that at 2% packet loss, HTTP/1.1 with 6 independent TCP connections can outperform HTTP/2 with its single connection. Each HTTP/1.1 connection has its own loss recovery; a dropped packet on connection 3 does not block connections 1, 2, 4, 5, and 6.

This is not a tunable parameter or an implementation bug. It is an inherent property of TCP’s ordered byte stream abstraction. HTTP/2 cannot fix it without changing the transport layer.

ALPN negotiation

How does a client tell the server it wants HTTP/2? Through Application-Layer Protocol Negotiation (ALPN), a TLS extension defined in RFC 7301 (July 2014).

During the TLS handshake, the client includes a list of supported protocols in the ClientHello message:

++
ClientHello
  ...
  Extension: ALPN
    Protocol: h2
    Protocol: http/1.1
++

The server picks the preferred protocol and includes it in the ServerHello:

++
ServerHello
  ...
  Extension: ALPN
    Protocol: h2
++

This adds zero extra round trips. Protocol negotiation happens inside the TLS handshake that was happening anyway.
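
You can watch the negotiation from Python's standard ssl module; the hostname is a placeholder for any HTTP/2-capable server:

++
import socket
import ssl

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])  # advertised in the ClientHello

with socket.create_connection(("example.com", 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
        # The server's pick comes back in the ServerHello; None means no ALPN.
        print(tls.selected_alpn_protocol())  # "h2" if the server speaks HTTP/2
++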

ALPN replaced NPN (Next Protocol Negotiation), Google’s earlier TLS extension designed for SPDY. Under NPN the client made the final protocol choice and sent it encrypted late in the handshake; ALPN moved the choice to the server and into the clear part of the handshake, where it fits the normal TLS extension model. RFC 7301 was published in July 2014, and NPN was subsequently deprecated.

The TLS requirement

RFC 7540 defines two protocol identifiers:

  • h2: HTTP/2 over TLS (the ALPN identifier).
  • h2c: HTTP/2 over cleartext TCP, using the Upgrade mechanism from HTTP/1.1.

In theory, HTTP/2 works without encryption. In practice, every major browser — Chrome, Firefox, Safari, Edge — only implements h2. No browser supports h2c.

This was a deliberate choice. The browser vendors used HTTP/2 as leverage to push HTTPS adoption. If you want the performance benefits of HTTP/2 for your website visitors, you need a TLS certificate.

The spec also mandates TLS 1.2 at minimum, with specific requirements (a configuration sketch follows the list):

  • TLS 1.2 or higher.
  • The server must support ALPN.
  • A restricted cipher suite set: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 must be supported, and Appendix A of RFC 7540 blacklists a long list of weak suites.
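
Configuring a server-side TLS context to meet those requirements looks roughly like this in Python's ssl module; the certificate paths are hypothetical, and cipher selection is left to OpenSSL's modern defaults, which typically prefer ECDHE with AES-GCM:

++
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # the RFC 7540 floor
ctx.set_alpn_protocols(["h2"])                 # advertise HTTP/2 to clients
ctx.load_cert_chain("cert.pem", "key.pem")     # hypothetical certificate paths
# OpenSSL's modern defaults already prefer ECDHE + AES-GCM suites, which
# covers the mandatory TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256.
++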

What HTTP/2 changed for developers

HTTP/1.1 performance required a set of hacks. HTTP/2 made most of them counterproductive.

Domain sharding is harmful

HTTP/1.1 developers split assets across cdn1.example.com, cdn2.example.com, etc. to bypass the 6-connections-per-origin limit. Under HTTP/2, each additional domain means a separate TCP connection that cannot participate in multiplexing. Consolidate origins.

Asset bundling is less necessary

Concatenating JavaScript files, CSS sprites, and icon fonts existed to reduce request count. With multiplexing, 50 small files cost almost the same as 1 large file in terms of connection overhead. The tradeoff shifts toward smaller, independently-cacheable modules.

Resource inlining reconsidered

Inlining CSS or small images as data URIs in HTML prevents the browser from caching them separately. With HTTP/2’s low request overhead, a separate cacheable resource is usually better.

Connection coalescing

HTTP/2 allows connection reuse across multiple origins if they resolve to the same IP address and the TLS certificate covers both names. If api.example.com and www.example.com point to the same server and use a wildcard or SAN certificate, the browser may reuse one HTTP/2 connection for both.

This is defined in Section 9.1.1 of RFC 7540. It means developers can split logical services across subdomains without the multiplexing penalty, as long as the infrastructure supports coalescing.

Adoption today

+------------------------------+--------+--------------------------+
| Metric                       | Value  | Source                   |
+------------------------------+--------+--------------------------+
| Websites using HTTP/2        | ~35.5% | W3Techs, April 2026      |
+------------------------------+--------+--------------------------+
| Requests served over HTTP/2+ | ~85%   | HTTP Archive, April 2026 |
+------------------------------+--------+--------------------------+
| CDN traffic over HTTP/2+     | ~96%   | Cloudflare Radar, 2026   |
+------------------------------+--------+--------------------------+
| Websites using HTTP/3        | ~31%   | W3Techs, April 2026      |
+------------------------------+--------+--------------------------+
| HTTP/1.1 only                | ~33%   | W3Techs, April 2026      |
+------------------------------+--------+--------------------------+

HTTP/2 adoption has plateaued in the mid-30% range for websites, but that number is misleading. The sites that have not adopted HTTP/2 tend to be low-traffic. By request volume, over 85% of all web requests use HTTP/2 or HTTP/3. CDNs like Cloudflare, Fastly, and AWS CloudFront serve virtually all traffic over HTTP/2+ regardless of the origin server’s protocol.

The trend is clear: HTTP/2 is no longer growing. HTTP/3 is. Cloudflare reported 30% of their traffic using HTTP/3 in 2024, and that share continues to climb.

HTTP/2 solved the right problems at the wrong layer. See HTTP/3 and QUIC for how moving to UDP fixed what TCP could not.