<span id="index-0"></span><span id="id1"></span><h1>BitTorrent<a class="headerlink" href="#bittorrent" title="Permalink to this heading">#</a></h1>
<p>BitTorrent is unarguably the most successful p2p protocol to date, and we have much to learn from walking in its footsteps.</p>
<section id="summary">
<h2>Summary<a class="headerlink" href="#summary" title="Permalink to this heading">#</a></h2>
<p>There are a number of very complete explanations of BitTorrent as a protocol, so we don’t attempt one here beyond giving an unfamiliar reader a general sense of how it works.</p>
<section id="torrents">
<h3>Torrents<a class="headerlink" href="#torrents" title="Permalink to this heading">#</a></h3>
<p>Data is shared on BitTorrent in units described by <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> files. They are <a class="reference external" href="https://en.wikipedia.org/wiki/Bencode">bencoded</a> dictionaries that contain the following fields (in BitTorrent v1):</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">announce</span></code>: The URL of one or several trackers (described below)</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">info</span></code>: A dictionary of metadata that describes the included file(s) and their lengths. The files are concatenated and then split into fixed-size pieces, and the info dict contains the SHA-1 hash of each piece.</p></li>
</ul>
<p>For example, a directory of three random files has a (decoded) <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> file that looks like this:</p>
<div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="w">    </span><span class="nt">"pieces"</span><span class="p">:</span><span class="w"> </span><span class="s2">"&lt;long string of concatenated hashes&gt;"</span>
<span class="w">  </span><span class="p">}</span>
<span class="p">}</span>
</pre></div>
</div>
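<p>The piece layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than a real torrent builder: the <code class="docutils literal notranslate"><span class="pre">piece_hashes</span></code> helper, the in-memory stand-in files, and the tiny piece length are all assumptions made for the example (real piece lengths are far larger).</p>

```python
import hashlib


def piece_hashes(files, piece_length):
    """Concatenate file contents, split into fixed-size pieces, SHA-1 each piece.

    `files` is a list of bytes objects standing in for file contents
    (a hypothetical simplification; a real client streams from disk).
    """
    stream = b"".join(files)
    digests = []
    for start in range(0, len(stream), piece_length):
        piece = stream[start:start + piece_length]
        digests.append(hashlib.sha1(piece).digest())
    # The info dict stores the 20-byte digests back to back in one string.
    return b"".join(digests)


# Three made-up "files" totalling 25 bytes.
files = [b"a" * 10, b"b" * 10, b"c" * 5]
pieces = piece_hashes(files, piece_length=8)
print(len(pieces) // 20)  # 4 pieces: 8 + 8 + 8 + 1 bytes
```

<p>Note that pieces ignore file boundaries: the second piece here spans the end of the first file and the start of the second.</p>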
<p>The contents of a torrent file are then uniquely indexed by the <code class="docutils literal notranslate"><span class="pre">infohash</span></code>, which is the SHA-1 hash of the entire (bencoded) <code class="docutils literal notranslate"><span class="pre">info</span></code> dictionary. Magnet links are an abbreviated form of the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> file that contain only the infohash, which allows downloading peers to request and independently verify the rest of the info dictionary and start downloading without a complete <code class="docutils literal notranslate"><span class="pre">.torrent</span></code>.</p>
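<p>As a rough sketch of how an infohash is derived, the following encodes a toy info dictionary and hashes it. The <code class="docutils literal notranslate"><span class="pre">bencode</span></code> function is a simplified encoder written for this example (no decoding, no validation), and every value in the <code class="docutils literal notranslate"><span class="pre">info</span></code> dict is made up.</p>

```python
import hashlib


def bencode(value):
    """Minimal bencode encoder covering ints, strings/bytes, lists, and dicts."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order, so the same
        # dictionary always encodes (and therefore hashes) identically.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


# Toy single-file info dict (illustrative values only).
info = {
    "name": "example.txt",
    "length": 5,
    "piece length": 16384,
    "pieces": hashlib.sha1(b"hello").digest(),
}
info_hash = hashlib.sha1(bencode(info)).digest()
magnet = "magnet:?xt=urn:btih:" + info_hash.hex()
print(magnet)
```

<p>Because dictionary keys are sorted before encoding, two peers holding the same info dictionary always arrive at the same infohash, which is what lets a magnet link stand in for the full file.</p>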
<p>BitTorrent v2 extends traditional <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> files to include a <span class="target" id="index-1"></span>Merkle Tree, which generalizes the traditional piece structure and adds useful properties, such as being able to recognize identical files across multiple <code class="docutils literal notranslate"><span class="pre">.torrent</span></code>s.</p>
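<p>A Merkle root over a single file can be sketched as below. The SHA-256 leaves over 16 KiB blocks and the all-zero padding hashes reflect our reading of BEP 52 and should be treated as assumptions; the <code class="docutils literal notranslate"><span class="pre">merkle_root</span></code> helper itself is hypothetical and not taken from any client.</p>

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # assumed leaf block size


def merkle_root(data, block_size=BLOCK_SIZE):
    """Hash fixed-size blocks into leaves, then pair hashes up to a single root."""
    if not data:
        raise ValueError("this sketch only handles non-empty files")
    layer = [hashlib.sha256(data[i:i + block_size]).digest()
             for i in range(0, len(data), block_size)]
    # Pad the leaf layer to a power of two with all-zero hashes (assumption).
    width = 1
    while len(layer) > width:
        width *= 2
    layer += [bytes(32)] * (width - len(layer))
    # Repeatedly hash adjacent pairs until one hash remains.
    while len(layer) > 1:
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]


# Made-up file contents spanning three blocks.
print(merkle_root(b"x" * 40000).hex())
```

<p>Since the root depends only on a file's own bytes, the same file yields the same root no matter which torrent it appears in, which is what makes cross-torrent recognition possible.</p>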
</section>
<section id="trackers">
<h3>Trackers<a class="headerlink" href="#trackers" title="Permalink to this heading">#</a></h3>
<p>To connect peers that might have or be interested in the contents of a given <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> file, the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> (but not its contents) is uploaded to a <span class="target" id="index-2"></span>Tracker. Peers interested in downloading a <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> will connect to the trackers indicated in its <code class="docutils literal notranslate"><span class="pre">announce</span></code><a class="footnote-reference brackets" href="#announcelist" id="id2" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> metadata, and the trackers will return a list of peer IP:Port combinations that the peer can download the file from. The downloading (leeching) peer doesn’t need to trust that the data sent by the uploading (seeding) peers is what is specified by the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code>: the client checks the computed hash of each received piece against the hashes in the info dict, which is in turn checked against the infohash.</p>
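<p>The two-step check described above (info dict against infohash, piece against info dict) might look like the following. This is a sketch under simplifying assumptions: the function names are hypothetical, and the two-piece torrent is made up.</p>

```python
import hashlib

HASH_LEN = 20  # SHA-1 digests are 20 bytes


def verify_info(bencoded_info, info_hash):
    """Check a received (still bencoded) info dict against the trusted infohash."""
    return hashlib.sha1(bencoded_info).digest() == info_hash


def verify_piece(index, data, pieces):
    """Check one received piece against the concatenated hashes in the info dict."""
    expected = pieces[index * HASH_LEN:(index + 1) * HASH_LEN]
    return len(expected) == HASH_LEN and hashlib.sha1(data).digest() == expected


# Made-up two-piece torrent for illustration.
good = [b"first piece", b"second piece"]
pieces = b"".join(hashlib.sha1(p).digest() for p in good)

print(verify_piece(0, good[0], pieces))           # True: data matches its hash
print(verify_piece(1, b"tampered data", pieces))  # False: piece is rejected
```

<p>Only the infohash needs to come from a trusted source; everything else can be fetched from untrusted peers and checked locally.</p>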
<p>Trackers solve the problem of <span class="target" id="index-3"></span>Discovery by giving a clear point where peers can find other peers from only the information contained within the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> itself. Trackers introduce a degree of brittleness, however, as they can become a single point of failure. Additional means of discovering peers have been added to BitTorrent over time, including <a class="reference external" href="http://www.bittorrent.org/beps/bep_0005.html"><span class="target" id="index-4"></span>Distributed Hash Tables</a> and <a class="reference external" href="http://www.bittorrent.org/beps/bep_0011.html">Peer Exchange</a>.</p>
<p>Beyond their technical role, BitTorrent trackers also form a <strong>social space</strong> that is critical to understanding its success as a protocol. While prior protocols like <span class="target" id="index-5"></span>Gnutella (of <span class="target" id="index-6"></span>Limewire fame) and FastTrack (of <span class="target" id="index-7"></span>Kazaa fame) had integrated search and peer discovery into the client and protocol itself, separating trackers as a means of organizing the BitTorrent ecosystem has allowed them to flourish as a means of experimenting with the kinds of social organization that keep p2p swarms healthy. Tracker communities range from huge and disconnected, as in widely-known public trackers like ThePirateBay, to tiny and close-knit, like some niche private trackers.</p>
<p>The bifurcated tracker/peer structure makes the overall system remarkably <em>resilient</em>. The trackers don’t host any infringing content themselves, they just organize the metadata for finding it, so they are relatively long-lived and inexpensive to start compared to more resource- and risk-intensive piracy vectors. If they are shut down, the peers can continue to share amongst themselves through DHT, Peer Exchange, and any other trackers that are specified in the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> files. When a successor pops up, the members of the old tracker can then re-collect the <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> files from the prior site and repopulate the new site without needing a massive re-upload of data to a centralized server.</p>
<p>See more detailed discussion re: lessons from BitTorrent Trackers for social infrastructure in “<a class="reference external" href="https://jon-e.net/infrastructure/#archives-need-communities">Archives Need Communities</a>” in <span id="id3">[<a class="reference internal" href="../../references.html#id13" title="Jonny L. Saunders. Decentralized Infrastructure for (Neuro)science. 2022-08-31. URL: http://arxiv.org/abs/2209.07493 (visited on 2023-03-01), arXiv:2209.07493, doi:10.48550/arXiv.2209.07493.">Saunders, 2022</a>]</span>.</p>
</section>
<section id="protocol">
<h3>Protocol<a class="headerlink" href="#protocol" title="Permalink to this heading">#</a></h3>
<p>Peers that have been referred to one another by a tracker or other means start by attempting to make a connection with a ‘handshake’ that specifies that the peer is speaking BitTorrent and which protocol extensions it supports.</p>
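<p>To make the handshake concrete, here is a small sketch that builds and parses the 68-byte v1 handshake message. The layout (length-prefixed protocol string, 8 reserved bytes, infohash, peer ID) and the extension flag in reserved byte 5 follow BEP 3 and BEP 10 as we understand them; the function names and the placeholder infohash are made up for illustration.</p>

```python
import os

PSTR = b"BitTorrent protocol"  # protocol identifier string
EXTENSION_BIT = 0x10           # BEP 10 flag, stored in reserved byte 5


def build_handshake(info_hash, peer_id, extensions=True):
    """Assemble the handshake: pstrlen, pstr, reserved, info_hash, peer_id."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must each be 20 bytes")
    reserved = bytearray(8)
    if extensions:
        reserved[5] |= EXTENSION_BIT
    return bytes([len(PSTR)]) + PSTR + bytes(reserved) + info_hash + peer_id


def parse_handshake(raw):
    """Split a received handshake into the fields a client cares about."""
    if len(raw) != 68 or raw[0] != len(PSTR) or raw[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    return {
        # The reserved bytes start at offset 20, so byte 5 sits at offset 25.
        "supports_extensions": bool(raw[25] & EXTENSION_BIT),
        "info_hash": raw[28:48],
        "peer_id": raw[48:68],
    }


info_hash = bytes(20)      # placeholder, not a real torrent
peer_id = os.urandom(20)   # random ID for the example
message = build_handshake(info_hash, peer_id)
print(len(message), parse_handshake(message)["supports_extensions"])
```

<p>A peer that receives a handshake for an infohash it is not serving would simply drop the connection.</p>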
<p>There are a number of subtleties in the transfer protocol, but it can be broadly summarized as a series of steps where peers tell each other which pieces they have and which they are interested in, and then share them amongst themselves.</p>
<p>Though not explicitly in the protocol spec, two prominent design decisions are worth mentioning (see, e.g., <span id="id4">[<a class="reference internal" href="../../references.html#id8" title="Arnaud Legout, G. Urvoy-Keller, and P. Michiardi. Rarest first and choke algorithms are enough. In Proceedings of the 6th ACM SIGCOMM on Internet Measurement - IMC '06, 203. ACM Press, 2006. URL: http://portal.acm.org/citation.cfm?doid=1177080.1177106 (visited on 2018-11-09), doi:10.1145/1177080.1177106.">Legout <em>et al.</em>, 2006</a>]</span> for additional discussion).</p>
<ul class="simple">
<li><p><strong>Peer Selection:</strong> Which peers should I spend finite bandwidth uploading to? BitTorrent uses a variety of <strong>Choke</strong> algorithms that reward peers that reciprocate bandwidth. Choke algorithms are typically some variant of a ‘tit-for-tat’ strategy, although rarely the strict bitwise tit-for-tat favored by later blockchain systems and others that require a peer to upload an equivalent amount to what they have downloaded before they are given any additional pieces. Contrast this with <a class="reference internal" href="ipfs.html#bitswap"><span class="std std-ref"><span class="target" id="index-8"></span>BitSwap</span></a> from IPFS. It is by <em>not</em> perfectly optimizing peer selection that BitTorrent is better capable of using more of its available network resources.</p></li>
<li><p><strong>Piece Selection:</strong> Which pieces should be uploaded/requested first? BitTorrent uses a <strong>Rarest First</strong> strategy, where a peer keeps track of the number of copies of each piece present in the swarm, and preferentially requests (and can then seed) the rarest pieces. This keeps the swarm healthy, rewarding keeping and sharing complete copies of files. This is in contrast to, e.g., <a class="reference internal" href="#SWARM"><span class="xref myst">SWARM</span></a>, which explicitly rewards hosting and sharing the most in-demand pieces.</p></li>
</ul>
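<p>Both decisions can be illustrated with a toy model. Because the spec leaves these algorithms to implementations, everything below is an assumption for the sake of example: the function names, the dictionary-based swarm state, the four upload slots, and the single randomly chosen ‘optimistic’ slot.</p>

```python
import random
from collections import Counter


def rarest_first(have, peer_pieces, rng=random):
    """Pick a piece we lack that the fewest connected peers hold.

    `peer_pieces` maps a peer name to the set of piece indices it advertises.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)
    candidates = [p for p in counts if p not in have]
    if not candidates:
        return None
    rarest = min(counts[p] for p in candidates)
    # Break ties randomly so peers don't all request the same piece.
    return rng.choice(sorted(p for p in candidates if counts[p] == rarest))


def pick_unchoked(upload_rates, slots=4, rng=random):
    """Loose tit-for-tat: unchoke the best reciprocators plus one random peer.

    `upload_rates` maps a peer name to how fast it has been uploading to us.
    """
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    regular, rest = ranked[:slots - 1], ranked[slots - 1:]
    optimistic = [rng.choice(rest)] if rest else []
    return regular + optimistic


swarm = {"a": {0, 1, 2}, "b": {0, 1}, "c": {0, 1}}
print(rarest_first({0}, swarm))  # 2: only one peer holds piece 2

rates = {"a": 50, "b": 40, "c": 30, "d": 5, "e": 1}
print(pick_unchoked(rates))
```

<p>The random slot is the deliberately non-optimal part: it gives newcomers with nothing to offer yet a chance to get started, and lets a peer discover better partners than the ones it already knows.</p>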
</section>
</section>
<section id="lessons">
<h2>Lessons<a class="headerlink" href="#lessons" title="Permalink to this heading">#</a></h2>
<p>(This section is mostly a scratchpad at the moment)</p>
<section id="adopt">
<h3>Adopt<a class="headerlink" href="#adopt" title="Permalink to this heading">#</a></h3>
<ul class="simple">
<li><p>BitTorrent eventually had to add a generic ‘extension extension’ (<a class="reference external" href="http://www.bittorrent.org/beps/bep_0010.html">BEP 10</a>), where on initial connection a peer informs another peer what extra features of the protocol it supports without needing to make constant adjustments to the underlying BitTorrent protocol. This pattern is adopted by most p2p protocols that follow, including <a class="reference internal" href="../social/nostr.html#nostr"><span class="std std-ref">Nostr</span></a>, which is almost <em>entirely</em> extensions.</p>
<ul>
<li><p>These extensions are not self-describing, however, and they require some centralized registry of extensions. See also <a class="reference internal" href="ipfs.html#ipfs"><span class="std std-ref">IPFS</span></a> and its handling of codecs, which curiously builds a lot of infrastructure for self-describing extensions but at the very last stage falls back to a single git repository as the registry.</p></li>
</ul>
</li>
<li><p><code class="docutils literal notranslate"><span class="pre">.torrent</span></code> files make for a very <strong>low barrier to entry</strong> and are extremely <strong>portable.</strong> They also operate over the existing idioms of files and folders, rather than creating their own filesystem abstraction.</p></li>
<li><p>Explicit peer and piece selection algorithms are left out of the protocol specification, allowing individual implementations to experiment with what works. This makes it possible to exploit the protocol by refusing to seed ever, but this rarely occurs in practice, as people are not the complete assholes imagined in worst-case scenarios of scarcity. Indeed, even the most selfish peers have an intrinsic incentive to upload: by aggressively seeding the pieces that a leeching peer already has, the other peers in the swarm are less likely to “waste” the seeders’ bandwidth on those pieces, leaving more bandwidth for the pieces that the leecher doesn’t already have.</p></li>
</ul>
</section>
<section id="adapt">
<h3>Adapt<a class="headerlink" href="#adapt" title="Permalink to this heading">#</a></h3>
<ul class="simple">
<li><p><strong>Metadata</strong>. Currently all torrent metadata is contained within the tracker, so while it is possible to restore all the files that were indexed by a downed tracker, it is very difficult to restore all the metadata at a torrent level and above, e.g. the organization of specific torrents into hierarchical categories that allow one to search for an artist, all the albums they have produced, all the versions of that album in different file formats, and so on.</p></li>
<li><p>Give more in-protocol tools to social systems. This is tricky because we don’t necessarily need to go down the road of DAOs and make strictly enforceable contracts. Recall that it is precisely by relaxing conditions of “optimality” that BitTorrent makes use of all resources available.</p></li>
<li><p><strong>Cross-Swarm Indexing</strong> - BitTorrent organizes all peer connections within swarms that are particular to a given <code class="docutils literal notranslate"><span class="pre">.torrent</span></code> file. We instead want a set of socially connected peers to be able to share many files.</p></li>
<li><p><strong>Anonymity</strong> - This is also a tricky balance. We want to do three things that are potentially in conflict:</p>
<ol class="arabic simple">
<li><p>Make use of the social structure of our peer swarm to be able to allocate automatic rehosting/sharding of files uploaded by close friends, etc.</p></li>
<li><p>Maintain the possibility of loose anonymity, where peers can share files without needing a large and well-connected social system to share files with them.</p></li>
<li><p>Avoid the significant performance penalties that come with guarantees of strong network-level anonymity like Tor.</p></li>
</ol>
</li>
<li><p><strong>Trackers</strong> are a good idea, even if they could use some updating. It is good to have an explicit entrypoint specified with a distributed, social mechanism rather than prespecified as a hardcoded entry point. It is a good idea to make a clear space for social curation of information, rather than something that is intrinsically bound to a torrent at the time of uploading. We update the notion of trackers with <a class="reference internal" href="../../federation.html#peer-federations"><span class="std std-ref">Peer Federations</span></a>.</p></li>
</ul>
</section>
</section>
<section id="references">
<h2>References<a class="headerlink" href="#references" title="Permalink to this heading">#</a></h2>
<p id="announcelist">[1] Or, properly, in the <code class="docutils literal notranslate"><span class="pre">announce-list</span></code>, per <a class="reference external" href="http://www.bittorrent.org/beps/bep_0012.html">BEP 12</a>.</p>