Understanding the BitTorrent Protocol: Part 3
One of the most elegant aspects of BitTorrent is how it solves the “free rider problem” that plagued earlier peer-to-peer networks. In systems like Napster or Gnutella, users could download without uploading, consuming bandwidth without contributing back. This created an unsustainable ecosystem where a few generous users shouldered the burden for everyone else.
BitTorrent takes a different approach. Through an algorithm called “choking,” it creates a decentralized system of reciprocity. If you help others download, they help you. If you don’t contribute, you get slow speeds. There’s no central authority enforcing this. Every peer makes independent decisions based purely on local information, yet the emergent behavior is a healthy swarm where cooperation thrives and freeloading is punished.
But choking isn’t just about fairness. It also solves critical technical problems with TCP performance and bandwidth allocation. Understanding the choking algorithm is essential to understanding why BitTorrent works so well at scale.
Contents
- The Problems Choking Solves
- Core Concepts
- Leecher Choking Algorithm
- Seeder Choking Algorithm
- Optimistic Unchoking
- Anti-Snubbing
- Game Theory Analysis
The Problems Choking Solves
Before diving into how the algorithm works, let’s understand why it exists. The choking algorithm isn’t solving one problem - it’s solving three distinct but related problems simultaneously.
TCP Congestion Control
Here’s something that surprises many people: you can’t just upload to everyone at once. If you try to upload to 50 peers simultaneously with limited bandwidth, your performance will collapse. This isn’t a BitTorrent problem - it’s a TCP problem.
TCP has built-in congestion control. When a TCP connection detects packet loss (which happens when your upload bandwidth is saturated), it assumes the network is congested and slows down. It enters “congestion avoidance” mode, cutting its transmission rate. The more connections you’re trying to maintain with limited bandwidth, the more packet loss you get, and the more TCP connections back off. Eventually, you end up with 50 connections all running at a fraction of their potential speed.
There’s another issue: TCP’s slow start algorithm. When a new connection starts or recovers from congestion, it doesn’t immediately use full bandwidth. It starts slow and gradually increases speed. If you keep choking and unchoking peers rapidly, connections never have time to ramp up to full speed. You’re constantly stuck in slow start.
The solution is simple: limit the number of simultaneous uploads. If you have 1 Mbps upload capacity, you’re better off uploading to 4 peers at 250 Kbps each than to 50 peers at 20 Kbps each (which will actually perform worse due to TCP overhead). Those 4 connections will maintain healthy TCP windows, avoid packet loss, and sustain high throughput.
BitTorrent typically uses 3-4 regular upload slots. This isn’t arbitrary - it’s chosen to balance between serving multiple peers (good for swarm health) and maintaining good per-connection throughput (good for TCP performance).
The Free Rider Problem
Peer-to-peer networks have a fundamental economic problem. Uploading costs you bandwidth but doesn’t directly benefit you. Downloading benefits you but costs others their bandwidth. The rational selfish strategy is to download as much as possible while uploading as little as possible. If everyone follows this strategy, the network collapses.
This is the free rider problem, and it destroyed earlier P2P systems. Studies of Gnutella and Kazaa showed that roughly 70% of users contributed nothing while 1% of users provided 50% of all content. The generous minority subsidized the selfish majority, which wasn’t sustainable.
BitTorrent’s choking algorithm creates consequences for free riding. Here’s how it works: if you’re not uploading to me, I won’t upload to you. This is called “tit-for-tat” reciprocity. Every 10 seconds, each peer evaluates who’s giving them the best download rates and unchokes those peers. If you’re a free rider providing nothing, you won’t rank highly in anyone’s evaluation, so you’ll stay choked by everyone.
The beauty is that this happens automatically and locally. There’s no global reputation system, no central authority, no coordination between peers. Each peer simply makes selfish decisions based on who helps them most, and the emergent system-wide behavior punishes free riding.
But there’s a nuance here: the punishment isn’t absolute. Free riders don’t get zero service - they just get poor service. They can still download via optimistic unchokes (which we’ll discuss later), just much more slowly than cooperative peers. This creates the right incentive: if you want good speeds, contribute. But even free riders get enough to potentially change their behavior.
Resource Discovery
Here’s a subtle problem the choking algorithm solves: how do you discover which peers have the best upload capacity?
When you first connect to the swarm, you don’t know which peers are on fast connections and which are on slow ones. Some peers might be on gigabit fiber. Others might be on mobile connections. You need a way to test peers and allocate your download bandwidth to the fastest ones.
The regular unchoke round does this. Every 10 seconds, you re-evaluate based on measured download rates. If a peer was giving you 500 KB/s but now they’re only giving you 50 KB/s (maybe their network got congested), they’ll drop in your rankings and get choked. Meanwhile, a peer who was choked might have great capacity - you just didn’t know it yet. That’s what optimistic unchoking discovers.
This creates a continuous optimization process. The system constantly adapts to changing network conditions. If a peer’s connection degrades, they get replaced. If a new peer joins with excellent bandwidth, they get discovered and prioritized. All of this happens automatically through the choking algorithm.
Core Concepts
What is Choking?
Choking is a temporary refusal to upload. When peer A chokes peer B, A will not send any piece data to B. However - and this is crucial - B can still upload to A. Choking is unilateral and directional.
This is fundamentally different from blocking or disconnecting. The TCP connection remains open. They continue exchanging control messages (have, interested, not interested). B can still request pieces from A; A will just reject those requests (or ignore them in older implementations). And critically, the choke can be reversed. In 10 seconds, A might unchoke B if B starts contributing upload bandwidth.
Think of it like this: choking is saying “I’m not willing to help you right now,” not “I’m severing our relationship.” It’s a negotiating position, not a final decision.
Connection States
Every peer connection has four possible states, determined by two binary decisions on each side:
From your perspective (what you send to them):
- Am I choking them? (yes/no)
- Am I interested in them? (yes/no)
From their perspective (what they send to you):
- Are they choking me? (yes/no)
- Are they interested in me? (yes/no)
Data can only flow when both conditions are met:
- The sender has unchoked the receiver
- The receiver is interested in the sender
If I want to download from you, I must be both interested in you AND you must unchoke me. Similarly, if you want to download from me, you must be interested in me AND I must unchoke you.
This creates a four-state space per direction:
| My Choke State | My Interest | Meaning |
|---|---|---|
| Choking | Not Interested | We’re not exchanging anything |
| Choking | Interested | I want your pieces but won’t give you mine |
| Not Choking | Not Interested | I’ll give you pieces but don’t need yours |
| Not Choking | Interested | Active exchange possible |
The choking algorithm determines when to transition between these states. Interest is calculated based on piece availability (do they have pieces I need?). Choking is calculated based on reciprocity and upload slot allocation.
Upload Slots
An “upload slot” is the right to be unchoked. If a peer has 4 upload slots and 20 interested peers, only 4 of those peers will be unchoked at any time. The other 16 will be choked, waiting for their turn.
BitTorrent typically uses 4 upload slots total:
- 3 regular unchoke slots (allocated based on reciprocity)
- 1 optimistic unchoke slot (allocated randomly)
Some high-bandwidth peers might use more (5-6 slots), but going beyond that has diminishing returns. Remember the TCP congestion control problem: too many simultaneous uploads hurt everyone’s performance.
The algorithm’s job is to decide which peers get these precious upload slots.
Leecher Choking Algorithm
When you’re still downloading (you’re a “leecher”), your goal is simple: maximize your download speed. The choking algorithm helps you achieve this through reciprocity.
Regular Unchoke Round
Every 10 seconds, the leecher performs a regular unchoke round:
-
Filter to interested peers: Only consider peers who are interested in you (they want pieces you have). No point unchoking peers who don’t want anything from you.
-
Measure download rates: For each interested peer, calculate how fast you’ve been downloading from them over the last 20 seconds (a rolling window).
-
Sort by download rate: Order peers from fastest to slowest.
-
Unchoke top 3: Unchoke the three peers with the highest download rates. These get your regular upload slots.
-
Choke everyone else: All other previously unchoked peers (except the optimistic unchoke) get choked.
This is pure reciprocity in action. You’re rewarding the peers who help you most. If peer A gives you 500 KB/s and peer B gives you 50 KB/s, peer A gets an upload slot and peer B doesn’t.
Here’s a concrete example. You’re connected to 10 interested peers with these download rates:
Peer A: 800 KB/s
Peer B: 600 KB/s
Peer C: 500 KB/s
Peer D: 400 KB/s
Peer E: 300 KB/s
Peer F: 200 KB/s
Peer G: 100 KB/s
Peer H: 50 KB/s
Peer I: 25 KB/s
Peer J: 10 KB/s
After the unchoke round:
- Unchoked: A, B, C (top 3)
- Choked: D, E, F, G, H, I, J
The next unchoke round in 10 seconds might produce different results if the rates change. Maybe peer D’s network improves and they give you 700 KB/s. They’d jump to position 2, and peer C would get choked.
Seeder Choking Algorithm
Once you’ve completed downloading and become a seeder, the game changes. You can no longer use download rate to make decisions because you’re not downloading anything. You need a different strategy.
Different Goals, Different Strategy
As a leecher, your goal is selfish: maximize your own download speed. But as a seeder, your goal should be altruistic: maximize the swarm’s health. You want to distribute pieces efficiently so the swarm grows and strengthens.
This is where seeders become particularly important. If seeders just uploaded to random peers, the swarm would be inefficient. Instead, seeders use a clever strategy: upload to peers who are already downloading quickly (as evidenced by them being recently unchoked).
Why? Because those peers are the most likely to turn around and upload to others. If you upload to a peer on a slow connection, those pieces get stuck with them. If you upload to a peer on a fast connection who’s actively participating, those pieces get redistributed quickly. You’re creating a multiplier effect.
Round-Robin by Recency
Here’s the seeder algorithm, executed every 10 seconds:
-
Filter to interested peers: Only consider peers interested in you.
-
Sort by last unchoke time: Order peers by when they were most recently unchoked (by anyone, not just you), with the most recent first.
-
Unchoke top 3: Unchoke the three peers who were most recently unchoked.
-
Choke everyone else: Choke all others (except optimistic unchoke).
For tie-breaking (if two peers were unchoked at exactly the same time), prefer the peer with the higher upload rate to you (measured by how fast they’re downloading from you).
This creates a round-robin effect among the actively downloading peers. If a peer is being unchoked by others frequently, it means they’re contributing upload bandwidth and participating actively. By continuing to upload to them, you’re helping strong contributors become even more effective distributors.
Example: You’re seeding to 8 interested peers. Their last unchoke times:
Peer A: 2 seconds ago
Peer B: 5 seconds ago
Peer C: 8 seconds ago
Peer D: 12 seconds ago
Peer E: 20 seconds ago
Peer F: 30 seconds ago
Peer G: 45 seconds ago
Peer H: 60 seconds ago
You unchoke A, B, C (most recently unchoked). In the next round, maybe D gets unchoked by another peer, bumping them up. You’d then unchoke D, A, B, and choke C.
Initialization Phase
There’s a special case when you first become a seeder. For the first 30 seconds, the algorithm works differently to promote diversity:
First 20 seconds:
- Unchoke 3 peers by recency (as normal)
- Unchoke 1 random peer (promotes randomness)
Next 10 seconds:
- Unchoke top 4 peers by recency (no random selection)
Why this dual-phase approach? The random selection in the first phase prevents all seeders from uploading to the exact same peers. If every seeder used pure recency-based selection from the start, they might all converge on the same few peers, creating hotspots. The initial randomness distributes the load.
After the randomization phase, you switch to pure efficiency (top 4 by recency) to ensure you’re helping the most active participants.
Optimistic Unchoking
This is where the elegance of BitTorrent really shines. The regular unchoke algorithm has a catch-22: new peers can’t get unchoked because they’re not providing good download rates, but they can’t provide good download rates because they don’t have any pieces to trade (since everyone has them choked).
The Bootstrap Problem
Imagine you’re a brand new peer joining the swarm. You have zero pieces. Every other peer in the swarm is running the reciprocity algorithm. They’re measuring download rates and unchoking the top contributors. Since you have nothing to trade, your download rate to them is zero. By the reciprocity rules, you’ll never get unchoked.
Without some escape hatch, new peers would be stuck forever. The swarm would be closed to newcomers. This would be catastrophic for BitTorrent’s growth and resilience.
How Optimistic Unchoking Works
Every 30 seconds, each peer performs an optimistic unchoke:
-
Select one interested peer at random (excluding currently unchoked peers and the peer who is already the optimistic unchoke)
-
Unchoke them for 30 seconds regardless of reciprocity
-
After 30 seconds, select a different peer for the next optimistic unchoke round
This is called “optimistic” because you’re giving them a chance without proof of contribution. You’re being optimistic that they might turn out to be a good peer.
The optimistic unchoke serves multiple purposes:
Bootstrapping new peers: That brand new peer with zero pieces? They might get lucky and become your optimistic unchoke. You give them pieces, they can start trading with others, and they bootstrap into the swarm. Without this, the swarm would calcify.
Discovering better peers: Maybe one of your choked peers is actually on a gigabit connection, but you don’t know it because you haven’t given them a chance. The optimistic unchoke tests them. If they turn out to have excellent upload to you, they’ll graduate to being a regular unchoke in the next round.
Recovering from bad situations: Sometimes all your currently unchoked peers go bad (network issues, they stop uploading, whatever). The optimistic unchoke continuously explores alternatives, giving you a way to find better peers.
Maintaining altruism: Pure reciprocity would create a closed system where only established peers benefit. Optimistic unchoking injects altruism into the system, giving everyone a chance. This is necessary for long-term swarm health.
New Peer Priority
There’s a clever twist: newly connected peers get 3x higher probability of being selected for optimistic unchoke compared to peers who’ve been around a while.
Why? Because established peers have already had multiple chances to prove themselves through past optimistic unchokes. New peers need help bootstrapping quickly. By giving them higher selection probability, you ensure they get a chance within their first minute or two of joining, rather than waiting 10+ minutes.
The math works like this: if you have 10 choked interested peers, and 2 of them connected in the last 30 seconds (new peers), your selection weights might be:
New Peer 1: weight 3
New Peer 2: weight 3
Old Peer 1: weight 1
Old Peer 2: weight 1
Old Peer 3: weight 1
Old Peer 4: weight 1
Old Peer 5: weight 1
Old Peer 6: weight 1
Old Peer 7: weight 1
Old Peer 8: weight 1
Total weight: 14
New peers have 6/14 = 43% chance of selection
Old peers have 8/14 = 57% chance of selection
Even though new peers are only 20% of the pool (2 out of 10), they have 43% chance of being chosen. This dramatically speeds up their integration into the swarm.
Anti-Snubbing
Sometimes the regular algorithm and optimistic unchoking aren’t enough. You can get into a situation where you’re stuck with no good peers and need a more aggressive recovery mechanism.
When Everything Goes Wrong
Imagine this scenario: You’re downloading from 5 peers. Your upload capacity isn’t great (maybe you’re on asymmetric DSL with limited upload). Because of your limited upload, other peers in the swarm start preferring different peers over you. Gradually, all 5 of your peers choke you. Now you’re sitting there, interested in downloading from them, but they’re all refusing to serve you.
This is called being “snubbed.” You’re not disconnected - the TCP connections are still open. But you’re choked by everyone you want to download from. In this state, you’re dependent on optimistic unchokes from those peers, which only happen every 30 seconds from each peer. You might wait several minutes before getting any data.
This is particularly problematic if you’re on a slow upload connection. The reciprocity algorithm naturally deprioritizes you, and you can get stuck in a situation where you can’t download fast enough to acquire pieces, which means you can’t upload, which means you stay deprioritized. It’s a vicious cycle.
Detection and Response
The anti-snubbing algorithm watches for this situation. If you haven’t received any piece data from a peer for more than 60 seconds, even though you’re interested in them and they claim to have pieces you need, you mark them as snubbing you.
Once you detect you’re being snubbed by a peer, you take action: stop uploading to them (except as optimistic unchoke). They’re not helping you, so you don’t help them. This is reciprocity taken to its logical conclusion.
But here’s the critical part: you also increase your optimistic unchoke rate. Instead of doing one optimistic unchoke every 30 seconds (the normal behavior), you start doing multiple optimistic unchokes simultaneously.
This creates an exception to the “exactly one optimistic unchoke” rule. When you’re being snubbed, you need to explore more aggressively. You’re essentially saying “these peers aren’t working out, I need to cast a wider net to find better options.”
Recovery Mechanism
The additional optimistic unchokes usually help you recover quickly. Within a couple of rounds (1-2 minutes), you’ll typically find new peers who will serve you. Once you start receiving data again and building up reciprocity with these new peers, you drop back to normal single optimistic unchoke behavior.
The beauty of anti-snubbing is that it’s self-correcting. It only activates when you’re in a bad state, and it automatically deactivates when you’ve recovered. There’s no manual intervention needed, no configuration to tweak. The algorithm just adapts.
Example timeline:
0:00 - Downloading normally from peers A, B, C
0:10 - Peer A chokes you (they found a better peer)
0:20 - Peer B chokes you (network congestion on their end)
0:30 - Peer C chokes you (they became a seeder and switched algorithms)
0:40 - No data received for 60s total, enter anti-snubbing
0:40 - Perform 3 optimistic unchokes (peers D, E, F)
0:50 - Peer D starts uploading to you (200 KB/s)
1:00 - Peer E starts uploading to you (150 KB/s)
1:10 - Now receiving data again, exit anti-snubbing
1:10 - Return to 1 optimistic unchoke
Within 90 seconds of entering the bad state, you’ve recovered and found new peers. The swarm is resilient.
Game Theory Analysis
Let’s step back and look at the choking algorithm through the lens of game theory. Why does this system encourage cooperation when the rational selfish strategy seems to be freeloading?
Tit-for-Tat Strategy
The regular unchoke algorithm is an implementation of a game theory strategy called “tit-for-tat.” This strategy became famous through Robert Axelrod’s tournaments in the 1980s, where different strategies competed in repeated prisoner’s dilemma games.
Tit-for-tat works like this:
- Start by cooperating
- Then, do whatever the other player did in the previous round
- If they cooperated, you cooperate
- If they defected, you defect
In BitTorrent terms:
- New peers get optimistic unchokes (starting cooperation)
- Then, peers who upload to you get unchoked (reciprocating cooperation)
- Peers who don’t upload to you get choked (reciprocating defection)
Axelrod found that tit-for-tat was remarkably successful. It’s “nice” (never defects first), “provocable” (retaliates against defection), “forgiving” (returns to cooperation if the other player does), and “clear” (easy to understand, so others can adapt to it).
BitTorrent’s choking algorithm has all these properties:
- Nice: Optimistic unchoking gives everyone a chance
- Provocable: If you don’t upload to me, I choke you
- Forgiving: Start uploading again and you’ll get unchoked in the next round
- Clear: The rules are simple and deterministic
Nash Equilibrium
A Nash equilibrium is a state where no player can improve their outcome by changing strategy unilaterally. In BitTorrent, the equilibrium is “everyone cooperates.”
Here’s why: If you’re uploading to peers who upload to you, you’re maximizing your download speed. If you try to free ride (download without uploading), you get choked by everyone, and your download speed plummets. The only peers who will serve you are those doing optimistic unchokes, which is infrequent and slow.
Compare the outcomes:
Strategy A: Cooperate (upload to top contributors)
- You get unchoked by 3-4 peers with good upload capacity
- You download at 500+ KB/s
- You’re contributing upload bandwidth, but this doesn’t hurt you (you have the bandwidth anyway)
Strategy B: Freeload (don’t upload to anyone)
- You get choked by everyone running reciprocity
- You only get served by random optimistic unchokes (every 30s per peer)
- You download at 50 KB/s or less
- You saved upload bandwidth, but gained nothing
The cooperative strategy dominates. You can’t improve your outcome by switching to freeloading. This is a stable equilibrium.
Why Free Riding Fails
Free riders don’t get completely shut out (thanks to optimistic unchoking), but they get significantly degraded service. Studies have shown that free riders typically get 10-20% of the speed that cooperative peers get.
This creates exactly the right incentive structure:
- Cooperation is rewarded: Upload to others, and you’ll get fast downloads
- Free riding is punished but not prevented: You can still download, just slowly
- Partial cooperation is possible: You can be a “partial free rider” with limited upload, and you’ll get intermediate speeds
This gradient of outcomes is important. If free riders got no service at all, they’d have no incentive to start cooperating. But because they get some service (just slow), they see what they’re missing and have an incentive to contribute to improve their speeds.
The system also handles different network capacities fairly. If you’re on a slow connection with limited upload, you’ll naturally rank lower in reciprocity calculations. But you’ll still get some service (via optimistic unchokes and from other slow peers), and you’re contributing what you can. The algorithm doesn’t require everyone to have equal capacity - it just requires that you contribute proportionally to benefit proportionally.