Introduction

All of this is very much a work in progress :) Please do not rely on any of the descriptions or statements here, as they are all provisional.

This site describes the implementation of the p2p linked data protocol in {{#cite saundersDecentralizedInfrastructureNeuro2022 }}

Overview

p2p-ld

Background

  • Semweb/Linked Data
  • Limitations/differences of existing p2p

Use

  • How is this intended to be used? by whom? in what contexts?

Roadmap

  • Development roadmap and timeline!

Comparison

All of this is TODO. Comparison to existing protocols and projects (just to situate p2p-ld in context, not to talk shit, obvs)

"The big ones"

  • BitTorrent
  • IPFS

"The research ones"

  • Dat
  • Hypercore

Social

  • ActivityPub/Fediverse
  • Secure Scuttlebutt
  • Matrix

Semweb/LD

  • SOLID
  • Nanopubs

To be categorized

  • Agregore
  • Arweave
  • CAN
  • Chord
  • Earthstar
  • Freenet
  • Manyverse
  • P2panda
  • SAFE
  • Storj
  • Swarm

Points of comparison

  • not append-only
  • metadata

P2P Concepts

Overview of the various concepts that p2p systems have to handle or address, with links to the sections where we address them!

  • Definitions - Terms used within the protocol spec
  • Protocol - The protocol spec itself, which encompasses the following sections and describes how they relate to one another.
  • Identity - How each peer in the swarm is identified (or not)
  • Discovery - How peers are discovered and connected to in the swarm, or, how an identity is dereferenced into some network entity.
  • Data Structures - What data is represented within the protocol, and how
  • Querying - How data, or pieces of data, are requested from hosting peers
  • Evolvability - How the protocol is intended to accommodate changes, plugins, etc.

Additionally, p2p-ld considers these properties, which are not universal to p2p protocols:

  • Vocabulary - The linked data vocabulary that is used within the protocol
  • Encryption - How individual messages can be encrypted and decrypted by peers
  • Federation - How peers can form supra-peer clusters for swarm robustness, social organization, and governance
  • Backwards Compatibility - How the protocol integrates with existing protocols and technologies.

Out of Scope

What should explicitly be left out of the protocol?

Implementation

Things that are described in the spec, but whose details are left up to the implementation

  • codecs: the spec describes how to define a codec, but does not include any codecs.

Definitions

Protocol

Connection

  • When connecting to a peer, a peer MUST advertise its own connections to other peers whose discoverability permissions allow it
    • eg. a peer can desig

Requests

Sharding

Every link has an implicit backlink that can be accepted/denied by the owner of the referenced object.

If a link is proposed from a blocked identifier, the proposed link is automatically dropped.
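The accept/deny/drop behavior described above can be sketched roughly as follows. This is only an illustration: `LinkInbox`, its fields, and the method names are hypothetical and not part of the spec, and links are reduced to `(source, target)` string pairs.

```python
from dataclasses import dataclass, field


@dataclass
class LinkInbox:
    """Hypothetical per-owner state for proposed links (not defined by the spec)."""

    blocked: set[str] = field(default_factory=set)
    pending: list[tuple[str, str]] = field(default_factory=list)
    accepted: list[tuple[str, str]] = field(default_factory=list)

    def propose(self, source_id: str, target: str) -> bool:
        """Queue a proposed link; return False if it was dropped."""
        if source_id in self.blocked:
            # Links proposed from blocked identifiers are dropped automatically.
            return False
        self.pending.append((source_id, target))
        return True

    def accept(self, source_id: str, target: str) -> None:
        """The owner accepts the implicit backlink for a pending link."""
        self.pending.remove((source_id, target))
        self.accepted.append((source_id, target))

    def deny(self, source_id: str, target: str) -> None:
        """The owner denies the backlink; the proposal is discarded."""
        self.pending.remove((source_id, target))
```

In this sketch a proposal from a blocked identifier never reaches the pending queue, so the owner is never asked to accept or deny it.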

Identity

How is an individual peer identified?

  • Cryptographic identity
  • Web of trust/shared identity
  • External verification/discovery via DNS and other out of band means.

Instances

A given identity can have 0 or many instances, each a manifestation of the peer within a particular server and runtime.

Each instance indicates a collection of peers

When connecting to a peer, the peer MUST tell the connecting peer about the instances that are within its permission scope.

Aliases

A given identity can have 0 or many bidirectional links indicating that the identity is sameAs another

  • eg. a fediverse account can indicate a cryptographic identity and then be used equivalently.
  • Verification aliases MUST have a backlink from the original identity
  • Subscribers to a given identity MUST store and represent the known aliases and treat them as equivalent
  • Other accounts can give an alias to an identity that MAY be accepted (by issuing a backlink) or denied (by ignoring it).
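As a minimal sketch of the backlink rule above, an alias can be treated as verified only when the sameAs claim exists in both directions. The function name, the `(source, target)` pair representation, and the example identifiers are assumptions made for illustration.

```python
def verified_aliases(links: set[tuple[str, str]], identity: str) -> set[str]:
    """Return the aliases of `identity` that are confirmed in both directions.

    `links` is a set of (source, target) sameAs claims. A claim only counts
    as an alias if the reverse claim (the backlink) is also present.
    """
    return {
        target
        for source, target in links
        if source == identity and (target, identity) in links
    }


links = {
    ("@alice", "fedi:alice@example.social"),
    ("fedi:alice@example.social", "@alice"),  # backlink: alias is verified
    ("@eve", "@alice"),  # no backlink from @alice: ignored, i.e. denied
}
aliases = verified_aliases(links, "@alice")
```

Ignoring a claim is enough to deny it here, which matches the "denied (by ignoring it)" behavior described in the list.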

Succession

An identity has a specific field indicating whether it is "active" or "retired," and can issue a special top-level link, with a given permission scope, indicating the identity that succeeds it. For example, in the case of harassment, one can hop identities and only tell close friends.
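One way to picture succession is a status field plus a scoped successor link, as sketched below. The class, field names, and the representation of a permission scope as a set of identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identity:
    """Hypothetical identity record; field names are illustrative only."""

    id: str
    status: str = "active"  # "active" or "retired"
    successor: Optional[str] = None
    # Assumption: a permission scope is modeled as a set of peer identifiers.
    successor_scope: frozenset[str] = frozenset()

    def retire(self, successor: str, scope: set[str]) -> None:
        """Mark this identity retired and issue a scoped successor link."""
        self.status = "retired"
        self.successor = successor
        self.successor_scope = frozenset(scope)

    def successor_for(self, viewer: str) -> Optional[str]:
        """Resolve the successor only for viewers inside the permission scope."""
        if self.status == "retired" and viewer in self.successor_scope:
            return self.successor
        return None
```

A viewer outside the scope sees that the identity is retired but cannot follow it to the new one.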

Beacons

Any peer can operate as a "Pub" (in the parlance of SSB) or a bootstrapping node, where a dereferenceable network location (eg. DNS) can be resolved to a

A given identity can have 0 or many static inbound references that can resolve a network

Discovery

How do we find people and know how to connect to them?

  • Bootstrapping initial connections
  • Gossiping
  • Hole punching

Data Structures

Triplet graphs similar to linked data fragments, with envelopes. Decoupling content addressing from versioning.

  • Merkle DAGs
  • Envelopes
  • Versioning
  • Typed objects with formatting

Containers

  • Packets of LD-triplets that contain
    • Hash of triplets
    • Encryption Info (if applicable)
    • Permissions scope
    • Signature
  • Anything that can be directly referenced without local qualifier is a container.
    • Triplets within a container can be referenced with the query syntax
  • Containers also behave like "feeds"
    • Eg. one might put their blog posts in @user:blog or
  • The account identifier is the top-level container.
  • Ordering:
    • Every triple within a scope is ordered by default by the time it is declared
    • A container can declare its ordering (see vocabulary)
  • Naming:
    • Each container intended to be directly referenced SHOULD contain a name so it can be referenced w.r.t its parent: @<ACCOUNT>:<name>
    • Each container can also be indicated numerically
  • Identity: Each container is uniquely identified by the hash of its contents and the hash of the account identifier.
  • Format: A container can specify one or several ways it can be displayed
  • Capabilities: A container can specify different capabilities that another account can take (eg. "Like", "Upvote", "Reply")
    • Capabilities should also contain a permissions scope; if none is present, the global scope is assumed.
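The container fields and the identity rule listed above might look something like the following sketch. The SHA-256 hash, the JSON serialization, the way the two hashes are combined, and all names are assumptions; the spec does not yet define any of them, and signing is left as a placeholder.

```python
import hashlib
import json
from dataclasses import dataclass

Triple = tuple[str, str, str]


def hash_triples(triples: list[Triple]) -> str:
    # Assumption: compact JSON as the canonical serialization.
    payload = json.dumps(triples, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class Container:
    account: str
    name: str
    triples: tuple[Triple, ...]
    scope: str = "global"   # permissions scope
    signature: bytes = b""  # placeholder; signing is not shown here

    @property
    def content_hash(self) -> str:
        """Hash of the triplets carried by this container."""
        return hash_triples(list(self.triples))

    @property
    def identity(self) -> str:
        """Combine the content hash with the hash of the account identifier."""
        account_hash = hashlib.sha256(self.account.encode("utf-8")).hexdigest()
        combined = (self.content_hash + account_hash).encode("utf-8")
        return hashlib.sha256(combined).hexdigest()

    @property
    def reference(self) -> str:
        """Name-based reference relative to the account: @<ACCOUNT>:<name>."""
        return f"@{self.account}:{self.name}"
```

Under these assumptions, two accounts that publish identical triples share a content hash but get distinct container identities.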

Triplets

  • Triplet format
    • Objects require a shortname that can be hierarchically indexed from
  • Types/Schema
  • Including intrinsic notion of nesting
    • every object can have blank/positionally indexed children
    • every triple can have blank/positionally indexed "qualifiers" like RDF-star or wikidata's qualifiers.
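A rough sketch of a triple carrying positionally indexed qualifiers, loosely in the spirit of Wikidata's qualifiers. The class name, the `(predicate, object)` pair shape for qualifiers, and the example values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QualifiedTriple:
    """Hypothetical triple with blank, positionally indexed qualifiers."""

    subject: str
    predicate: str
    object: str
    # Each qualifier is a (predicate, object) pair about this statement.
    # Qualifiers have no name of their own; they are addressed by position.
    qualifiers: list[tuple[str, str]] = field(default_factory=list)

    def qualify(self, predicate: str, obj: str) -> int:
        """Attach a qualifier and return its positional index."""
        self.qualifiers.append((predicate, obj))
        return len(self.qualifiers) - 1


statement = QualifiedTriple("@alice", "memberOf", "@lab")
index = statement.qualify("since", "2020")
```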

Schema

Codecs

See IPLD Codecs and Linked Data Platform spec

Means of interacting with binary data.

Describes

  • Format

Versioning

  • A given container has an identity hash from its first packing
  • A given triple can be contained by

Vocabulary

Imports

  • owl:sameAs - for declaring that a given triplet is equivalent to another.

Container

  • ordering - how the children are to be ordered
    • declaration - makes numerical references stronger, but less predictable.
    • alphabetic - makes numerical references weaker, but more predictable
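The tradeoff between the two orderings can be seen in a small sketch. The function and the child names are hypothetical; only the two ordering names come from the list above.

```python
def order_children(declared: list[str], ordering: str = "declaration") -> list[str]:
    """Return child names in the order that numerical references index them."""
    if ordering == "declaration":
        # Children keep the order in which they were declared.
        return list(declared)
    if ordering == "alphabetic":
        return sorted(declared)
    raise ValueError(f"unknown ordering: {ordering!r}")


before = ["zebra", "mango"]
after = before + ["apple"]  # a new child is declared later

# Declaration order: index 0 points at the same child before and after.
stable = order_children(before)[0] == order_children(after)[0]

# Alphabetic order: index 0 moves from "mango" to "apple".
moved = (
    order_children(before, "alphabetic")[0],
    order_children(after, "alphabetic")[0],
)
```

With declaration ordering an existing numerical reference keeps pointing at the same child as new children are added, whereas with alphabetic ordering the position of a child can be worked out from its name alone.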

Social

  • Containers of other accounts
  • proxy identities: a given identity can specify a collection of alts that can only be resolved with the correct permission scope. For example, a stable public account can be linked to by an abusive user, but they won't be able to resolve a more private alt.
  • Peer Relationship Types
    • Other peers can be given special roles that allow them to operate on behalf of the peer in mutually independent ways:
    • Keybearer - also share a given private key,
  • Visibility
    • A peer can indicate that it is visible to a given scope as defined by a collection of peers and associated rules.
    • eg. a "close friends" collection could be given the visibility rule to make a peer visible to n-deep friends of friends.
    • A
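The "n-deep friends of friends" visibility rule mentioned above can be sketched as a breadth-first search over a friendship graph. The graph representation, function name, and peer names are assumptions for illustration only.

```python
from collections import deque


def visible_to(friends: dict[str, set[str]], peer: str, depth: int) -> set[str]:
    """Return the peers within `depth` friendship hops of `peer`."""
    seen = {peer}
    frontier = deque([(peer, 0)])
    while frontier:
        current, dist = frontier.popleft()
        if dist == depth:
            # Do not expand past the configured depth.
            continue
        for friend in friends.get(current, set()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, dist + 1))
    seen.discard(peer)  # a peer is trivially visible to itself
    return seen


friends = {"alice": {"bob"}, "bob": {"carol"}, "carol": {"dave"}}
one_deep = visible_to(friends, "alice", 1)
two_deep = visible_to(friends, "alice", 2)
```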

Querying

How do we find peers that have subgraphs that are responsive to what we want?

Syntax

Location

How to refer to a given container, eg.

@user:containerName:childName

or numerically

@user:containerName:{0}
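A minimal parser for the two reference forms shown above might look like this. The grammar is inferred only from those two examples, and `Location` and `parse_location` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Union

# A numerical segment is written as digits inside braces, eg. {0}.
_INDEX = re.compile(r"^\{(\d+)\}$")


@dataclass(frozen=True)
class Location:
    account: str
    path: tuple[Union[str, int], ...]


def _parse_segment(segment: str) -> Union[str, int]:
    """Turn "{3}" into the integer 3; leave named segments as strings."""
    match = _INDEX.match(segment)
    return int(match.group(1)) if match else segment


def parse_location(ref: str) -> Location:
    """Parse "@account:name:child" or "@account:name:{0}" into a Location."""
    if not ref.startswith("@"):
        raise ValueError(f"reference must start with '@': {ref!r}")
    account, *segments = ref[1:].split(":")
    if not account or any(not segment for segment in segments):
        raise ValueError(f"empty segment in reference: {ref!r}")
    return Location(account, tuple(_parse_segment(s) for s in segments))


named = parse_location("@user:containerName:childName")
numbered = parse_location("@user:containerName:{0}")
```

Named and numerical segments end up as `str` and `int` path entries respectively, so a resolver could dispatch on the type.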

Children

Version

How to refer to a specific version of a container

References without version qualification indicate the most recent version at the time of containerizing the links.

Query Fragments

Using blank subgraphs to specify queries

Encryption

How can we make it possible to have a protocol that is "open" when it is intended to be, but also protects privacy and consent when we need it to?

Federation

Making supra-peer clusters with explicit governance and policies for rehosting and sharing!

  • Creating federations of peers

Sharding

Splitting data across multiple peers within a federation

Moderation

Federations MUST maintain a list of

Backwards Compatibility

  • HTTP
  • BitTorrent
  • IPFS
  • ActivityPub

HTTP Servers

  • Using existing HTTP servers as web-seed like things.
  • Use codecs to indicate the format and metadata of existing files
  • Use HTTP servers as backup mirrors that behave like peers, and let peers indicate them as mirrors for a given container

BitTorrent

See BEP 52 - BitTorrent v2

IPFS

ActivityPub

  • Mappings:
    • Container <-> feed

Evolvability

Sketchpad


System Diagram

Just a stub to check if mermaid works

erDiagram
	IDENTITY {
		string hash
	}
	INSTANCE {
		string ip
		string client
	}
	BEACON {
		string uri
	}
	IDENTITY ||--o{ INSTANCE : runs
	BEACON }o--|{ INSTANCE : links
	BEACON }o--|| IDENTITY : represents

Graph Data Model

  • Triplets
  • Containers
  • Codecs