Understanding the BitTorrent Protocol: Part 2

Once a BitTorrent client obtains peer addresses from a tracker or DHT, the real work begins: establishing TCP connections and exchanging the actual file data. This is where the peer wire protocol comes in. Unlike the simple request-response model of tracker communication, the peer wire protocol is a sophisticated, stateful protocol that manages long-lived connections between peers.

Think of it this way: the tracker tells you who to talk to, but the peer wire protocol defines how to have that conversation. It’s the language that peers use to negotiate what they need, what they have, and how they’ll exchange data. And at the heart of this protocol is a clever algorithm called “choking” that solves one of the hardest problems in peer-to-peer networks: how to prevent freeloaders while ensuring everyone gets good download speeds.

The peer wire protocol operates over TCP, which means every connection is reliable, ordered, and bidirectional. Each peer maintains state with every connected peer, tracking things like “Does this peer have pieces I need?” and “Am I currently allowed to download from them?” This stateful nature is what makes BitTorrent so efficient compared to simpler file-sharing protocols.

Contents

  1. TCP Handshake
  2. Message Framing
  3. Message Types
  4. Connection State Machine
  5. Request Pipelining

TCP Handshake

Before any file data can be exchanged, peers must first verify they’re speaking the same language and working on the same torrent. This is where the handshake comes in. It’s the first message exchanged after the TCP connection is established, and it serves as both a protocol identifier and a sanity check that both peers belong to the same swarm.

The handshake is interesting because neither peer has to wait for the other before sending it. This is possible because of TCP’s full-duplex nature - data can flow in both directions at the same time. Each peer sends its handshake and then waits to receive and validate the other peer’s handshake.

What makes the handshake critical is the info hash field. Remember from the previous article that the info hash is a SHA-1 hash that uniquely identifies a torrent. When two peers exchange handshakes, they’re essentially saying “I’m here to download this specific torrent.” If the info hashes don’t match, the connection is immediately terminated. This prevents peers from different swarms from accidentally connecting to each other.

Handshake Structure

Fixed 68-byte format:

Offset  Size  Field           Description
0       1     pstrlen         Length of protocol string (always 19)
1       19    pstr            Protocol identifier: "BitTorrent protocol"
20      8     reserved        Extension bits (see below)
28      20    info_hash       Torrent identifier (raw SHA-1)
48      20    peer_id         Peer identifier

Example handshake bytes:

13 42 69 74 54 6f 72 72 65 6e 74 20 70 72 6f 74 6f 63 6f 6c
00 00 00 00 00 00 00 00
2c 6b 68 58 d6 1d a9 54 3d 42 31 a7 1d b4 b1 c9 26 4b 06 85
2d 54 52 33 30 30 30 2d 61 62 63 64 65 66 67 68 69 6a 6b 6c

Breakdown:
[0]:       0x13 (19 decimal)
[1-19]:    "BitTorrent protocol"
[20-27]:   0x00 (reserved bytes)
[28-47]:   info hash (20 bytes)
[48-67]:   peer_id "-TR3000-abcdefghijkl"

Field Details:

pstrlen (1 byte): Always 0x13 (19 decimal). Allows future protocol identification by checking first byte.

pstr (19 bytes): ASCII string "BitTorrent protocol". No null terminator.

reserved (8 bytes): Extension flags. Each bit enables optional features. See Reserved Bytes section.

info_hash (20 bytes): Raw SHA-1 hash of bencoded info dictionary. Must match between peers or connection is severed.

peer_id (20 bytes): Unique ID for this peer, chosen by the client. A common (Azureus-style) convention is -XX####- where XX is client code, #### is version, followed by random bytes. Examples:

  • -TR3000- = Transmission 3.00
  • -UT3500- = μTorrent 3.5.0
  • -AZ5700- = Azureus 5.7.0
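To make the layout concrete, here is a minimal sketch that packs the five fields into the 68-byte handshake. The function name build_handshake is made up for illustration; the field sizes and the sample info hash and peer_id come from the example bytes above.

```python
import struct

PSTR = b"BitTorrent protocol"


def build_handshake(info_hash: bytes, peer_id: bytes,
                    reserved: bytes = bytes(8)) -> bytes:
    """Pack the fixed 68-byte handshake (hypothetical helper)."""
    if len(info_hash) != 20:
        raise ValueError("info_hash must be 20 raw bytes")
    if len(peer_id) != 20:
        raise ValueError("peer_id must be 20 bytes")
    if len(reserved) != 8:
        raise ValueError("reserved must be 8 bytes")
    # "!" means network byte order with no padding:
    # 1 + 19 + 8 + 20 + 20 = 68 bytes.
    return struct.pack("!B19s8s20s20s",
                       len(PSTR), PSTR, reserved, info_hash, peer_id)


# Sample values copied from the example handshake bytes.
info_hash = bytes.fromhex("2c6b6858d61da9543d4231a71db4b1c9264b0685")
handshake = build_handshake(info_hash, b"-TR3000-abcdefghijkl")
print(len(handshake), handshake[:20])
```

Note that the info hash is passed as 20 raw bytes, not as its 40-character hex string.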

Reserved Bytes

Those 8 reserved bytes in the handshake aren’t just padding - they’re a clever extension mechanism. Each bit in these 8 bytes can signal support for an optional protocol feature. It’s like a capability negotiation system built right into the handshake.

When two peers connect, they look at each other’s reserved bytes to see what features both support. If both peers have the same bit set, they can use that feature. If not, they fall back to the base protocol. This allows the protocol to evolve without breaking compatibility with older clients.

Here are the most important bits that are currently defined:

reserved[0] (byte 20):
  0x80 (bit 7): Azureus Messaging Protocol

reserved[5] (byte 25):
  0x10 (bit 4): LTEP (Libtorrent Extension Protocol)

reserved[7] (byte 27):
  0x01 (bit 0): DHT support
  0x04 (bit 2): Fast Extension (suggest, have_all, have_none, reject_request, allowed_fast)
  0x08 (bit 3): NAT traversal

For example, if your client supports DHT and the Fast Extension, you’d set reserved[7] = 0x05 (which is 0x01 | 0x04). When the other peer receives your handshake, they can check these bits and know what features you support.
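The bit arithmetic can be sketched in a few lines. The constant and function names below are assumptions chosen for this example; the byte indexes and masks are the ones listed above.

```python
# (byte index within the 8 reserved bytes, bit mask), per the list above.
LTEP = (5, 0x10)
DHT = (7, 0x01)
FAST = (7, 0x04)


def make_reserved(*features) -> bytes:
    """Build the 8 reserved bytes with the given feature bits set."""
    reserved = bytearray(8)
    for byte_index, mask in features:
        reserved[byte_index] |= mask
    return bytes(reserved)


def supports(reserved: bytes, feature) -> bool:
    """Check whether one side advertised a feature."""
    byte_index, mask = feature
    return bool(reserved[byte_index] & mask)


def both_support(mine: bytes, theirs: bytes, feature) -> bool:
    """A feature is usable only when both handshakes set its bit."""
    return supports(mine, feature) and supports(theirs, feature)


mine = make_reserved(DHT, FAST)      # reserved[7] == 0x05
theirs = make_reserved(FAST, LTEP)
print(both_support(mine, theirs, FAST), both_support(mine, theirs, DHT))
```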

There’s a practical problem, though: bit collisions. Different client developers sometimes chose the same bits for different features. This is why modern clients prefer to use LTEP (Libtorrent Extension Protocol, BEP 10), which provides a more robust extension negotiation mechanism. LTEP itself needs only the one reserved bit shown above; once both peers set it, extensions are identified by name rather than bit position, eliminating collision issues.

Handshake Validation

After receiving peer’s handshake:

  1. Verify pstrlen: Must be 0x13. Reject if different.
  2. Verify pstr: Must be “BitTorrent protocol” (case-sensitive). Reject if different.
  3. Verify info_hash: Must match local torrent’s info hash. Reject if different.
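  4. Verify peer_id: If you initiated the connection and the tracker supplied a peer_id for this address, verify that it matches. Compact tracker responses omit peer IDs, so this check is often impossible. Accept any peer_id for incoming connections.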
  5. Parse reserved bytes: Enable features based on set bits.

Special case: Multi-torrent clients may wait for incoming peer to send handshake first, then respond with matching info hash if torrent is being downloaded.

Connection rejection: Close TCP connection immediately on validation failure. No error message sent.
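The validation steps above could look roughly like this. It is a sketch, not a reference implementation: parse_handshake, Handshake and HandshakeError are hypothetical names, and the caller is assumed to close the socket when the exception is raised.

```python
import struct
from typing import NamedTuple, Optional

PSTR = b"BitTorrent protocol"
HANDSHAKE_LEN = 68


class HandshakeError(Exception):
    """Raised when a received handshake fails validation."""


class Handshake(NamedTuple):
    reserved: bytes
    info_hash: bytes
    peer_id: bytes


def parse_handshake(data: bytes, local_info_hash: bytes,
                    expected_peer_id: Optional[bytes] = None) -> Handshake:
    if len(data) != HANDSHAKE_LEN:
        raise HandshakeError("handshake must be exactly 68 bytes")
    pstrlen, pstr, reserved, info_hash, peer_id = struct.unpack(
        "!B19s8s20s20s", data)
    if pstrlen != 19:                       # step 1
        raise HandshakeError("unexpected pstrlen")
    if pstr != PSTR:                        # step 2 (case-sensitive)
        raise HandshakeError("unexpected protocol string")
    if info_hash != local_info_hash:        # step 3
        raise HandshakeError("info hash mismatch")
    # Step 4: only checked when the caller knows what to expect.
    if expected_peer_id is not None and peer_id != expected_peer_id:
        raise HandshakeError("peer_id mismatch")
    # Step 5: the caller inspects `reserved` to enable features.
    return Handshake(reserved, info_hash, peer_id)
```

Passing expected_peer_id=None covers incoming connections, where any peer_id is accepted.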


Message Framing

After the handshake succeeds, the real conversation begins. But how do peers know where one message ends and another begins in a continuous TCP stream? This is where message framing comes in. BitTorrent uses a simple but effective approach: every message is prefixed with its length.

Think of it like receiving packages in the mail. Each package has a label that tells you how big it is before you open it. Similarly, every BitTorrent message starts with a 4-byte length field that tells the receiver “the next N bytes belong to this message.” This allows the receiver to read exactly the right amount of data before moving on to the next message.

The beauty of this approach is its simplicity. There’s no need for complex delimiters or escape sequences. The receiver reads 4 bytes to get the length, then reads exactly that many more bytes to get the complete message. This continues indefinitely in both directions as long as the connection remains open.

Length-Prefix Format

General message structure:

Offset  Size  Field       Description
0       4     length      Message length (excluding this field), big-endian
4       1     id          Message type (0-20+)
5       N     payload     Message-specific data

Example (Have message for piece 42):

00 00 00 05  length = 5 bytes (1 byte ID + 4 bytes payload)
04           message ID = 4 (Have)
00 00 00 2a  piece index = 42

Full message: 00 00 00 05 04 00 00 00 2a

All integers are big-endian (network byte order). Length field does not include itself (only ID + payload).
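Because TCP delivers a byte stream rather than discrete messages, a receiver usually buffers incoming data and peels off complete frames as they become available. The sketch below shows one way to do that; MessageDecoder, encode_message and the MAX_MESSAGE_LENGTH cap are assumptions for illustration, not part of the protocol.

```python
import struct

# Hypothetical safety cap so a bogus length cannot exhaust memory.
MAX_MESSAGE_LENGTH = 1 << 20


def encode_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Prefix ID + payload with a 4-byte big-endian length."""
    return struct.pack("!IB", 1 + len(payload), msg_id) + payload


class MessageDecoder:
    """Incrementally splits a TCP byte stream into messages."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return every message now complete.

        Each message is (id, payload); a zero-length frame is None.
        """
        self._buf += data
        messages = []
        while len(self._buf) >= 4:
            (length,) = struct.unpack_from("!I", self._buf, 0)
            if length > MAX_MESSAGE_LENGTH:
                raise ValueError("message too large")
            if len(self._buf) < 4 + length:
                break  # wait for the rest of this frame
            body = bytes(self._buf[4:4 + length])
            del self._buf[:4 + length]
            messages.append((body[0], body[1:]) if length else None)
        return messages


have_42 = encode_message(4, struct.pack("!I", 42))
decoder = MessageDecoder()
print(decoder.feed(have_42[:3]), decoder.feed(have_42[3:]))
```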

Keep-Alive Messages

Keep-alive format:

00 00 00 00  length = 0, no ID, no payload

Keep-alives maintain connection liveness. Sent every 2 minutes of inactivity. No response expected. Receiver ignores keep-alives.

Timeout behavior: Close connection if no data received for 2-3 minutes (implementation-specific). Keep-alives prevent premature timeout.
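One way to track both timers is a small helper that records the last send and receive times. The Liveness class is hypothetical, and the 120 and 180 second constants are simply the figures quoted above; real clients pick their own values.

```python
KEEPALIVE = b"\x00\x00\x00\x00"   # length = 0, nothing else
KEEPALIVE_INTERVAL = 120.0        # seconds of send-side silence
IDLE_TIMEOUT = 180.0              # seconds of receive-side silence


class Liveness:
    """Tracks when to send a keep-alive and when to give up on a peer."""

    def __init__(self, now: float) -> None:
        self.last_sent = now
        self.last_received = now

    def on_send(self, now: float) -> None:
        self.last_sent = now          # any message counts, not just keep-alives

    def on_receive(self, now: float) -> None:
        self.last_received = now      # keep-alives from the peer count too

    def should_send_keepalive(self, now: float) -> bool:
        return now - self.last_sent >= KEEPALIVE_INTERVAL

    def timed_out(self, now: float) -> bool:
        return now - self.last_received > IDLE_TIMEOUT
```

In a real client the `now` values would come from a monotonic clock such as time.monotonic(); passing them in explicitly keeps the sketch easy to test.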


Message Types

State Messages

These messages manage connection state (choked/unchoked, interested/not interested).

Choke (ID 0):

00 00 00 01 00
length=1, id=0, no payload

Sender refuses to send piece data. Receiver must not send requests (or expect rejection).

Unchoke (ID 1):

00 00 00 01 01
length=1, id=1, no payload

Sender willing to send piece data. Receiver may send requests.

Interested (ID 2):

00 00 00 01 02
length=1, id=2, no payload

Sender wants pieces from receiver. Receiver may unchoke to allow downloads.

Not Interested (ID 3):

00 00 00 01 03
length=1, id=3, no payload

Sender does not want any pieces from receiver. Receiver may deprioritize or choke.

Data Messages

Have (ID 4):

00 00 00 05 04 <piece index>
length=5, id=4, payload=4-byte piece index

Announces possession of piece. Sent after successfully downloading and verifying a piece. Receiver updates peer’s bitfield.

Bitfield (ID 5):

00 00 00 <1+X> 05 <bitfield>
length=1+X, id=5, payload=X bytes

Sent immediately after handshake (before any other messages). Bitfield is byte array where bit N indicates possession of piece N. High bit first: byte 0 bit 7 = piece 0, byte 0 bit 6 = piece 1, etc.

Example: 5 pieces, bitfield 0xD8 (11011000):

  • Piece 0: 1 (have)
  • Piece 1: 1 (have)
  • Piece 2: 0 (missing)
  • Piece 3: 1 (have)
  • Piece 4: 1 (have)

Spare bits (when piece count not divisible by 8) must be zero. Omit bitfield if peer has no pieces.
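The bit ordering is easy to get backwards, so here is a small sketch of decoding a bitfield. The function names are made up for this example; the high-bit-first rule and the spare-bit check follow the description above.

```python
from typing import List


def has_piece(bitfield: bytes, index: int) -> bool:
    """High bit first: piece 0 is bit 7 of byte 0."""
    return bool(bitfield[index // 8] & (0x80 >> (index % 8)))


def parse_bitfield(bitfield: bytes, num_pieces: int) -> List[bool]:
    """Decode a bitfield payload, rejecting wrong sizes and set spare bits."""
    expected_len = (num_pieces + 7) // 8
    if len(bitfield) != expected_len:
        raise ValueError("bitfield has the wrong length")
    spare = expected_len * 8 - num_pieces
    # The spare bits are the lowest `spare` bits of the last byte.
    if spare and bitfield[-1] & ((1 << spare) - 1):
        raise ValueError("spare bits must be zero")
    return [has_piece(bitfield, i) for i in range(num_pieces)]


# The 5-piece example above: 0xD8 = 11011000.
print(parse_bitfield(b"\xd8", 5))
```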

Request (ID 6):

00 00 00 0D 06 <index> <begin> <length>
length=13, id=6, payload=12 bytes (3x 4-byte integers)

Requests block of data. Fields:

  • index (4 bytes): Piece index
  • begin (4 bytes): Byte offset within piece
  • length (4 bytes): Block length in bytes

Standard block size: 16384 bytes (16 KiB). Last block of piece may be smaller. BEP 3 notes that all current implementations use 16 KiB and close connections that request more; some older documentation mentions 32768 bytes (32 KiB) as the limit.

Example: Request piece 42, offset 0, length 16384:

00 00 00 0D 06 00 00 00 2A 00 00 00 00 00 00 40 00
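As an illustration, the sketch below builds a Request message and splits a piece into 16 KiB blocks. build_request and blocks_for_piece are hypothetical helper names; the 40000-byte piece length is an arbitrary value picked to show a short final block.

```python
import struct

BLOCK_SIZE = 16384  # 16 KiB


def build_request(index: int, begin: int, length: int) -> bytes:
    """length prefix = 13, id = 6, then three 4-byte big-endian integers."""
    return struct.pack("!IBIII", 13, 6, index, begin, length)


def blocks_for_piece(piece_length: int) -> list:
    """Return (begin, length) pairs covering one piece."""
    return [(begin, min(BLOCK_SIZE, piece_length - begin))
            for begin in range(0, piece_length, BLOCK_SIZE)]


print(build_request(42, 0, 16384).hex(" "))
print(blocks_for_piece(40000))
```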

Piece (ID 7):

00 00 00 <9+X> 07 <index> <begin> <block>
length=9+X, id=7, payload=8 bytes + X bytes block data

Delivers requested block. Fields:

  • index (4 bytes): Piece index
  • begin (4 bytes): Byte offset within piece
  • block (X bytes): Raw block data

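Should match a prior request. In the base protocol an unexpected piece can still arrive when choke and unchoke messages cross in flight, so clients typically discard it. With the Fast Extension, a piece that was never requested is treated as a protocol violation and the connection is closed.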
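Matching an incoming Piece message against outstanding requests might look like the sketch below. The function names and the choice to key requests by (index, begin, length) are assumptions; what to do with an unmatched block (drop it or close the connection) is left to the caller.

```python
import struct
from typing import Optional, Set, Tuple

Block = Tuple[int, int, int]  # (piece index, begin, length)


def parse_piece(payload: bytes) -> Tuple[int, int, bytes]:
    """Split a Piece payload into index, begin and raw block data."""
    if len(payload) < 8:
        raise ValueError("piece payload too short")
    index, begin = struct.unpack_from("!II", payload, 0)
    return index, begin, payload[8:]


def accept_block(outstanding: Set[Block],
                 payload: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Return the block if it answers an outstanding request, else None."""
    index, begin, block = parse_piece(payload)
    key = (index, begin, len(block))
    if key not in outstanding:
        return None  # unsolicited: caller decides how to react
    outstanding.remove(key)
    return index, begin, block
```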

Cancel (ID 8):

00 00 00 0D 08 <index> <begin> <length>
length=13, id=8, payload=12 bytes (same as Request)

Cancels previous request. Used in endgame mode when the block has already arrived from another peer. The receiver should drop the canceled block if it has not been sent yet. With the Fast Extension, every request must still be answered, either with the piece or with a reject.

Fast Extension Messages

Optional (BEP 6). Enabled only if both peers set reserved[7] & 0x04 in their handshakes.

Suggest Piece (ID 13):

00 00 00 05 0D <piece index>
length=5, id=13, payload=4-byte piece index

Advisory hint: “You might want this piece.” Receiver may prioritize suggested pieces.

Have All (ID 14):

00 00 00 01 0E
length=1, id=14, no payload

Replaces bitfield for seeders. Indicates possession of all pieces. Sent immediately after handshake instead of bitfield.

Have None (ID 15):

00 00 00 01 0F
length=1, id=15, no payload

Replaces bitfield for new peers with zero pieces. More efficient than empty bitfield.

Reject Request (ID 16):

00 00 00 0D 10 <index> <begin> <length>
length=13, id=16, payload=12 bytes (same as Request)

Explicit rejection of request. Sent after choke to clear pending requests. Receiver must not expect piece data.

Allowed Fast (ID 17):

00 00 00 05 11 <piece index>
length=5, id=17, payload=4-byte piece index

Indicates piece will be served even when choked. Helps new peers bootstrap. Typically 10 pieces in allowed fast set.


Connection State Machine

Here’s where things get interesting. Every peer connection isn’t just a simple pipe for data - it’s a carefully managed relationship with its own state. Each peer needs to track two pieces of information about every connection, both from their perspective and from the remote peer’s perspective.

Imagine you’re having a conversation with someone. At any given moment, you’re either interested in what they have to say, or you’re not. And separately, you’re either willing to talk, or you’re not (maybe you’re busy, maybe they’ve been rude). The same concept applies to BitTorrent connections, but with file pieces instead of conversation.

This creates what we call a “four-state system”: four boolean flags per connection. For every connection, you need to track:

  1. Whether you’re interested in downloading from them (they have pieces you need)
  2. Whether you’re currently allowing them to download from you (you’ve unchoked them)

And you need to track the same two states from their perspective:

  1. Whether they’re interested in downloading from you
  2. Whether they’re currently allowing you to download from them

Understanding this state machine is crucial because it determines when data can actually flow. You can’t just request pieces whenever you want - you need to be both interested AND unchoked by the remote peer.

Four Connection States

Local perspective (what I send to peer):

State  Am Choking  Am Interested  Meaning
1      Yes         No             I’m choking them, don’t want their pieces
2      Yes         Yes            I’m choking them, but want their pieces
3      No          No             I’m not choking them, don’t want their pieces
4      No          Yes            I’m not choking them, want their pieces

Remote perspective (what peer sends to me):

State  Peer Choking Me  Peer Interested In Me  Meaning
1      Yes              No                     They’re choking me, don’t want my pieces
2      Yes              Yes                    They’re choking me, but want my pieces
3      No               No                     They’re not choking me, don’t want my pieces
4      No               Yes                    They’re not choking me, want my pieces

Data flows only when flags from both tables line up: I download when I’m interested and the peer is not choking me; I upload when the peer is interested and I’m not choking them.

Initial state: All connections start choked and not interested. Bitfield/Have messages trigger interest recalculation.
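In code, the per-connection state is just four booleans. The sketch below uses a hypothetical PeerState class; the message IDs are the ones from the State Messages section, and the defaults encode the initial state described above.

```python
from dataclasses import dataclass

CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED = 0, 1, 2, 3


@dataclass
class PeerState:
    # Every connection starts choked and not interested on both sides.
    am_choking: bool = True
    am_interested: bool = False
    peer_choking: bool = True
    peer_interested: bool = False

    def on_message(self, msg_id: int) -> None:
        """Update the remote-side flags from a received state message."""
        if msg_id == CHOKE:
            self.peer_choking = True
        elif msg_id == UNCHOKE:
            self.peer_choking = False
        elif msg_id == INTERESTED:
            self.peer_interested = True
        elif msg_id == NOT_INTERESTED:
            self.peer_interested = False

    def can_download(self) -> bool:
        """I want their pieces and they are letting me request."""
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        """They want my pieces and I am letting them request."""
        return self.peer_interested and not self.am_choking
```

The two local flags (am_choking, am_interested) change when this client sends the corresponding message, so they are set directly by the sending code rather than in on_message.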

State Transitions

Interest calculation: Peer is interesting if they have pieces I need. Recalculate on:

  • Receive bitfield (initial interest determination)
  • Receive have (peer acquired new piece)
  • Local piece download completes (may no longer need peer’s pieces)

Interest state updates: Must send interested/not interested immediately when calculation changes. Incorrect interest state causes inefficiency:

  • Staying interested when not needing pieces wastes unchoke slots
  • Not expressing interest when needing pieces prevents downloading
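A minimal sketch of the interest calculation, assuming each side's pieces are held as a list of booleans indexed by piece number. The function names are hypothetical; the return values 2 and 3 are the Interested and Not Interested message IDs.

```python
from typing import List, Optional


def is_interesting(my_pieces: List[bool], peer_pieces: List[bool]) -> bool:
    """True if the peer has at least one piece I am missing."""
    return any(theirs and not mine
               for mine, theirs in zip(my_pieces, peer_pieces))


def interest_update(currently_interested: bool, my_pieces: List[bool],
                    peer_pieces: List[bool]) -> Optional[int]:
    """Return the message ID to send, or None if nothing changed."""
    wanted = is_interesting(my_pieces, peer_pieces)
    if wanted == currently_interested:
        return None
    return 2 if wanted else 3  # 2 = Interested, 3 = Not Interested
```

Running interest_update after every bitfield, have, and locally completed piece keeps the advertised state in sync without sending redundant messages.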

Choking transitions: Determined by choking algorithm. Sender controls:

  • When to choke (deny upload)
  • When to unchoke (permit upload)

Data Transfer Rules

Download (receiving pieces):

  • Can only send requests when peer has unchoked me
  • Exception: can send requests while choked by peer if piece is in the allowed fast set (Fast Extension)
  • Must not send requests while choked otherwise

Upload (sending pieces):

  • Can only send pieces when I’ve unchoked peer
  • Must reject requests when I’m choking peer (send reject_request with the Fast Extension, otherwise drop silently)

Request lifetime: Request remains valid until:

  • Piece received
  • Cancel received
  • Connection choked (must reject all pending requests)
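The download and upload rules above reduce to two small predicates. This is a sketch with hypothetical names; allowed_fast is the set of piece indexes the peer granted me via Allowed Fast, and allowed_fast_sent is the set I granted the peer.

```python
from typing import AbstractSet


def may_send_request(peer_choking: bool, piece_index: int,
                     fast_extension: bool = False,
                     allowed_fast: AbstractSet[int] = frozenset()) -> bool:
    """Download side: may I request a block of this piece right now?"""
    if not peer_choking:
        return True
    # While choked, only allowed-fast pieces may be requested.
    return fast_extension and piece_index in allowed_fast


def response_to_request(am_choking: bool, piece_index: int,
                        fast_extension: bool = False,
                        allowed_fast_sent: AbstractSet[int] = frozenset()
                        ) -> str:
    """Upload side: how should I answer an incoming request?"""
    if not am_choking:
        return "piece"
    if fast_extension and piece_index in allowed_fast_sent:
        return "piece"
    return "reject_request" if fast_extension else "ignore"
```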

Request Pipelining

Now we come to a crucial performance optimization. Imagine if you had to wait for a response to every request before sending the next one. With network latency, this would be painfully slow. If it takes 100 milliseconds for a round trip, you’d only be able to request 10 blocks per second from that peer, capping the download at about 160 KiB/s with 16 KiB blocks.

This is where pipelining saves the day. Instead of waiting for each response, you send multiple requests at once, keeping several “in flight” simultaneously. It’s like having multiple questions queued up in a conversation rather than waiting for an answer to each question before asking the next one.

The key is finding the right balance. Too few requests in the pipeline and you waste time waiting. Too many requests and you’re wasting memory and potentially requesting blocks you no longer need. Most BitTorrent clients maintain 5-10 outstanding requests per peer, which provides a good balance for typical internet connections.

Pipeline Depth

Recommended pipeline depth: 5-10 outstanding requests per peer.

Calculation:

Pipeline depth = (RTT × Bandwidth) / Block size

Example: 100 ms RTT, 1 Mbps throughput, 16 KiB blocks:

Depth = (0.1 s × 1,000,000 bits/s) / (16384 bytes × 8 bits/byte)
      = 100,000 / 131,072
      ≈ 0.76

Minimum 1 request required. Most clients use 5-10 for safety margin and varying RTT.

Per-peer tracking: Each peer connection maintains independent request queue. High-latency or high-bandwidth peers need deeper pipelines. Low-latency peers need shallower pipelines.
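The formula is simple enough to compute per peer. In the sketch below, the function names are made up and rounding up with a floor of one request is an assumption about how a client might turn the raw figure into a target depth.

```python
import math


def bdp_depth(rtt_s: float, bandwidth_bps: float,
              block_size_bytes: int = 16384) -> float:
    """Raw pipeline depth: (RTT x bandwidth) / block size, in blocks."""
    return (rtt_s * bandwidth_bps) / (block_size_bytes * 8)


def target_depth(rtt_s: float, bandwidth_bps: float,
                 block_size_bytes: int = 16384, minimum: int = 1) -> int:
    """Round up and never go below the minimum number of requests."""
    return max(minimum,
               math.ceil(bdp_depth(rtt_s, bandwidth_bps, block_size_bytes)))


print(round(bdp_depth(0.1, 1_000_000), 2))   # the worked example above
print(target_depth(0.1, 50_000_000))         # same RTT on a 50 Mbps link
```

At 50 Mbps the same 100 ms round trip calls for far more than 10 requests in flight, which is why a fixed depth is only a starting point.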

Request Queue Management

Sending requests:

  1. Check peer state: unchoked and interesting
  2. Check pipeline depth: current outstanding requests < target depth
  3. Select block from piece picker
  4. Send request
  5. Add to outstanding request tracking

Receiving pieces:

  1. Match piece to outstanding request
  2. Remove from outstanding queue
  3. Write block to storage
  4. If piece complete, verify hash
  5. Send requests to refill pipeline

Handling choke:

  1. Receive choke message
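  2. Mark all outstanding requests as rejected (base protocol; with the Fast Extension, choke no longer implies rejection, so wait for the explicit reject_request for each one)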
  3. Clear request queue
  4. Do not send new requests
  5. Optionally cancel outstanding requests (not required by spec)

Handling unchoke:

  1. Receive unchoke message
  2. Immediately send requests up to pipeline depth
  3. Fast refill maximizes throughput
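The four procedures above fit naturally into one small class. This is a sketch of base-protocol behavior under stated assumptions: RequestPipeline is a hypothetical name, the piece picker is stood in for by a deque of (index, begin, length) tuples, and the methods return what should be sent rather than touching a socket.

```python
from collections import deque


class RequestPipeline:
    """Per-peer request bookkeeping (base protocol, no Fast Extension)."""

    def __init__(self, target_depth: int = 5) -> None:
        self.target_depth = target_depth
        self.outstanding = set()
        self.peer_choking = True
        self.am_interested = False

    def fill(self, pending: deque) -> list:
        """Return the requests to send now, up to the target depth."""
        to_send = []
        if self.peer_choking or not self.am_interested:
            return to_send
        while pending and len(self.outstanding) < self.target_depth:
            block = pending.popleft()      # stand-in for the piece picker
            self.outstanding.add(block)
            to_send.append(block)
        return to_send

    def on_block(self, block) -> bool:
        """Match a received block; the caller then stores it and refills."""
        if block not in self.outstanding:
            return False
        self.outstanding.remove(block)
        return True

    def on_choke(self) -> list:
        """Drop all outstanding requests and hand them back to the caller."""
        self.peer_choking = True
        dropped = sorted(self.outstanding)
        self.outstanding.clear()
        return dropped

    def on_unchoke(self, pending: deque) -> list:
        """Refill the pipeline immediately."""
        self.peer_choking = False
        return self.fill(pending)
```

Blocks returned by on_choke would normally go back to the piece picker so they can be requested from another peer.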

Block cancellation (endgame mode): When download nearly complete, broadcast the same requests for the remaining blocks to all peers. When a block arrives, send cancel to every other peer that was asked for it. Prevents wasting time on slow peer for last few blocks.
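The bookkeeping for endgame cancellation can be sketched as a map from each block to the peers it was requested from. EndgameTracker and its method names are hypothetical, and peers are represented by plain strings here.

```python
from collections import defaultdict


class EndgameTracker:
    """Remembers which peers were asked for each block during endgame."""

    def __init__(self) -> None:
        self._requested = defaultdict(set)

    def record_request(self, block, peer) -> None:
        self._requested[block].add(peer)

    def on_block(self, block, from_peer) -> set:
        """Return the peers that should now receive a Cancel for this block."""
        peers = self._requested.pop(block, set())
        return peers - {from_peer}


tracker = EndgameTracker()
last_block = (41, 0, 16384)
for peer in ("A", "B", "C"):
    tracker.record_request(last_block, peer)
print(sorted(tracker.on_block(last_block, "B")))
```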