How DefraDB Uses Peer-to-Peer Networking to Make Application Data — and Communication — More Secure, Scalable, & User-Centric

// November 20, 2023

Hello, and welcome back to another edition of Source Network Concept Explainers. This time, we'll go deep into another critical element of the Source Network and DefraDB stack: peer-to-peer (P2P) networking architectures, which enable DefraDB to be globally available and replicable on any device.

While our use of the word networking doesn't pertain to human beings building connections on LinkedIn or navigating slightly awkward conference meetups, connection is still at the heart of the technology at hand. In our case, "connection" means easily linking (and managing) data across nodes, servers, and utility applications.

DefraDB, our NoSQL, decentralized database, relies on peer-to-peer (P2P) networking architectures to make the database globally available and replicable on any device.

This is an essential element of its design — and reflective of our prioritization of user-centric, decentralized frameworks.

But first, let's explain what exactly peer-to-peer networking is. Read to the end for access to the open bar (just kidding).

Peer-to-Peer (P2P) networking architectures vs. Client-Server relationships

Client-Server Architecture

A graphical representation of a typical client-server network architecture

Peer-to-peer (P2P) networking architectures are decentralized communication networks where each participant (i.e., a data-querying user or developer) or node has equal status and can act both as a client and a server. This differs from traditional, centralized communication networks, where a central server or hierarchy controls how information is shared or accessed between clients and servers.

Just think of accessing any traditional website online: you type in a URL, and a request is sent to a Domain Name System (DNS) server, which looks up the IP address that the URL's domain (like the domain of this very blog) maps to. In the background, your computer receives that IP address, which tells your browser which server to request the website from.

Any centralized social media application where people connect online and access information is another example of a centralized communication network. In these networks, the central server stores all platform and user data (profiles, posts, relationships, user authentication, images, etc.), while clients are the devices used to access the platform. The server stores and manages the platform's data in a centralized database, retaining complete control over the server-to-client relationship.

While client-server architecture can simplify network coordination, the implementation and oversight of security measures, and policy enforcement, this often comes at the cost of resource efficiency, network security, scalability, and the opportunity for users and applications to truly own their data.

Peer-to-Peer Network Architecture

A graphical representation of a typical peer-to-peer network.

In a peer-to-peer network, there is no central server or hierarchy “calling the shots” (or controlling the data), and nodes, servers, and computers are functionally the same across the network. This removes the classic server/client relationship outlined above. In P2P networks, every node collaboratively participates to achieve the cumulative end goal of the network — the nodes can communicate directly with each other to share all kinds of things — from files to assets and processing power.

P2P architectures, including blockchains like Bitcoin and Ethereum, are widely used for applications such as processing financial transactions, enabling decentralized communication, and distributing computation across a network of devices.

P2P networks offer numerous advantages and are often more resilient than their traditional counterparts. Decentralized networks avoid the single points of failure found in centralized, client-server networks because each node has equal status. If one node is compromised, the remaining nodes can keep the system functioning and providing the required resources (bandwidth, storage, processing power, etc.).
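
As a rough illustration of the resilience point above, the following Python sketch models a toy network in which every peer keeps a replica of the data and can act as both client and server. The `Peer` class and `fetch` function are hypothetical names made up for this example; they are not part of DefraDB or LibP2P, and a real network would replace the in-memory calls with actual connections.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A toy peer that can serve data (server role) and request it (client role)."""
    name: str
    store: dict = field(default_factory=dict)
    online: bool = True

    def serve(self, key):
        # Server role: answer a request from another peer.
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.store.get(key)


def fetch(key, peers):
    """Client role: ask each known peer in turn until one answers."""
    for peer in peers:
        try:
            value = peer.serve(key)
        except ConnectionError:
            continue  # this peer is down, so try the next one
        if value is not None:
            return value
    raise LookupError(f"no reachable peer holds {key!r}")


# Three peers each hold a replica of the same record.
peers = [Peer(name, {"doc-1": "hello"}) for name in ("a", "b", "c")]
peers[0].online = False  # simulate one node failing
result = fetch("doc-1", peers)  # still answered by one of the remaining peers
```

The request only fails once every peer holding the record is unreachable, which is the property a single central server cannot offer.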

Additionally, peer-to-peer networks are even potentially more cost-effective as there is no need for a dedicated central server, meaning lower infrastructure costs. Still, P2P networks also have their disadvantages: complex network management, the potential for malicious nodes, and issues around scalability.

DefraDB’s Use of Peer-to-Peer Networking, Built on LibP2P

To enable our peer-to-peer capabilities, we’ve built our entire networking stack on top of LibP2P, a library developed as part of IPFS, which is a project and a set of composable peer-to-peer protocols that address, route, and transfer content-addressable data in decentralized file systems. In LibP2P, each node is identified by its multi-address (instead of a centralized, accessible IP address), and each node has its own keypair, which is used to derive the peer identity embedded in this multi-address. Other nodes and external clients use multi-addresses to communicate, exchange records, or facilitate verification requests.

LibP2P & Peer IDs

As its name might suggest, LibP2P treats "peers" as a critical component of its network. A Peer Identity (often written as PeerID) is what the overall peer-to-peer network uses as a unique reference point and identifier for each peer, and as a verifiable link between a peer and its public cryptographic key. In LibP2P, each peer controls its own private key, which it keeps secret from all other peers in the network. Every private key has a corresponding public key, which is what's shared with other peers.

Together, the public and private keys form a "key pair," which allows users to establish secure communication channels with each other. A Peer ID is conceptually the cryptographic hash of a peer's public key. It can be used to verify that the public key used to secure the channel is identical to the key used to identify the peer itself.
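
To make the hash relationship above concrete, here is a minimal Python sketch. It is a simplification under stated assumptions: random bytes stand in for a real public key, and the Peer ID is a plain hex SHA-256 digest, whereas LibP2P uses its own encoding for Peer IDs. The function names are invented for this example.

```python
import hashlib
import os


def peer_id_from_public_key(public_key: bytes) -> str:
    """Derive a simplified peer ID: the hex SHA-256 digest of the public key bytes."""
    return hashlib.sha256(public_key).hexdigest()


def key_matches_peer_id(public_key: bytes, claimed_peer_id: str) -> bool:
    """Check that a key presented while securing a channel hashes to the claimed peer ID."""
    return peer_id_from_public_key(public_key) == claimed_peer_id


# Stand-in for real public key bytes; a real peer would generate an actual keypair.
alice_public_key = os.urandom(32)
alice_id = peer_id_from_public_key(alice_public_key)

# An impostor presenting a different key cannot match Alice's peer ID.
impostor_public_key = os.urandom(32)

alice_ok = key_matches_peer_id(alice_public_key, alice_id)
impostor_ok = key_matches_peer_id(impostor_public_key, alice_id)
```

Because the identifier is derived from the key itself, a peer that knows only the Peer ID can still check that the key offered on a connection belongs to the peer it meant to reach.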

A graphical representation of the parts that make up a multiaddress

Peer IDs can also be used in multi-addresses (or multiaddrs in LibP2P): a Peer ID can be encoded into a multiaddr as a /p2p component with the Peer ID as its parameter. These peer-to-peer addresses can be combined with a transport address, or embedded in other multiaddrs, to compose entirely new multiaddrs that describe both who a peer is and how to reach it. That’s a lot of addresses, indeed!
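
The composition described above can be sketched with a small string-based parser in Python. This is an illustration only: it handles a deliberately tiny subset of protocols, works on the text form rather than any binary encoding, and the helper names and the `QmExamplePeer` identifier are placeholders invented for this example rather than a real library API or a real Peer ID.

```python
# Protocols in this sketch that are followed by a value, and ones that are not.
# Real multiaddrs support many more protocols; this is a small subset.
PROTOCOLS_WITH_VALUE = {"ip4", "ip6", "dns4", "tcp", "udp", "p2p"}
PROTOCOLS_WITHOUT_VALUE = {"quic", "ws"}


def parse_multiaddr(addr: str) -> list:
    """Split a multiaddr string into (protocol, value) pairs."""
    if not addr.startswith("/"):
        raise ValueError("a multiaddr must start with '/'")
    parts = addr.split("/")[1:]
    components = []
    i = 0
    while i < len(parts):
        proto = parts[i]
        if proto in PROTOCOLS_WITH_VALUE:
            if i + 1 >= len(parts):
                raise ValueError(f"protocol {proto!r} is missing its value")
            components.append((proto, parts[i + 1]))
            i += 2
        elif proto in PROTOCOLS_WITHOUT_VALUE:
            components.append((proto, None))
            i += 1
        else:
            raise ValueError(f"unknown protocol {proto!r}")
    return components


def encapsulate(transport_addr: str, peer_id: str) -> str:
    """Append a /p2p/<peer id> component to a transport address."""
    return f"{transport_addr.rstrip('/')}/p2p/{peer_id}"


def peer_id_of(addr: str):
    """Return the peer ID carried by a multiaddr, or None if it has none."""
    for proto, value in parse_multiaddr(addr):
        if proto == "p2p":
            return value
    return None


# Combine "how to reach the peer" with "who the peer is".
full_addr = encapsulate("/ip4/127.0.0.1/tcp/4001", "QmExamplePeer")
components = parse_multiaddr(full_addr)
```

Here `full_addr` reads left to right as a path: an IPv4 address, a TCP port on it, and finally the identity of the peer expected to answer there.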

What do peers look like in action in web3 networks and actual user-focused applications?

An interesting environment to look at is decentralized social media, which leverages peer-to-peer networks to give users dynamic, use-case-specific communication without the need for a centralized, client-server architecture. Using LibP2P’s Publish/Subscribe system, peers (i.e., users) can gather around the topics they’re most interested in by subscribing to them. Peers can also send messages to topics themselves, and those messages are delivered to every peer subscribed to that topic. This is a good fit for chat-focused applications where people share messages, and attach files, under the topics they care about most.

In this kind of peer-to-peer publish/subscribe system, all peers who are part of the network participate in delivering messages to each other. This gives these systems reliability, speed, efficiency (the network isn’t overwhelmed with excess, useless messages), scale (topics can handle a large throughput of messages), simplicity, and resilience (peers can join and leave the network without disrupting it, avoiding single, centralized points of failure).
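
The topic-based delivery described above can be sketched in a few lines of Python. Note the simplifying assumption: a single in-process `TopicBus` object stands in for the whole network, whereas in a real peer-to-peer system the peers themselves relay messages to one another. `TopicBus` and the topic names are invented for this example and are not the LibP2P API.

```python
from collections import defaultdict


class TopicBus:
    """In-memory stand-in for a publish/subscribe layer (not the LibP2P API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for every message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber of the topic; return the count."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


bus = TopicBus()
alice_inbox, bob_inbox, carol_inbox = [], [], []

# Alice and Bob follow one topic, Carol follows another.
bus.subscribe("gardening", alice_inbox.append)
bus.subscribe("gardening", bob_inbox.append)
bus.subscribe("cycling", carol_inbox.append)

delivered = bus.publish("gardening", "Tomatoes are ripe!")
```

Only the two "gardening" subscribers receive the message, and Carol's inbox stays empty, which is the filtering behavior that keeps the network from being flooded with messages nobody asked for.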

Peering can also make topics and conversations more efficient across the network. LibP2P enables this through a protocol called gossipsub, which lets peers connect on a topic through either full-message peerings, which carry complete messages, or metadata-only peerings, which carry only information about which messages are available.
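
The difference between the two kinds of connection can be shown with a toy Python model. It is loosely inspired by gossipsub rather than a faithful implementation: real gossip is exchanged on a timer and involves more bookkeeping, and the `GossipPeer` class and its method names are invented for this sketch.

```python
class GossipPeer:
    """Toy peer with full-message links and metadata-only links."""

    def __init__(self, name):
        self.name = name
        self.messages = {}        # message id -> payload
        self.full_peers = []      # these peers get complete messages right away
        self.metadata_peers = []  # these peers only hear which message ids exist
        self.full_transfers = 0   # how many payloads this peer has sent out

    def publish(self, msg_id, payload):
        self.receive_full(msg_id, payload)

    def receive_full(self, msg_id, payload):
        if msg_id in self.messages:
            return  # already seen, so do not store or forward it again
        self.messages[msg_id] = payload
        for peer in self.full_peers:
            self.full_transfers += 1
            peer.receive_full(msg_id, payload)
        for peer in self.metadata_peers:
            peer.receive_announcement(self, [msg_id])

    def receive_announcement(self, sender, msg_ids):
        # Metadata only: request the payload just for messages we are missing.
        for msg_id in msg_ids:
            if msg_id not in self.messages:
                self.receive_full(msg_id, sender.request(msg_id))

    def request(self, msg_id):
        self.full_transfers += 1
        return self.messages[msg_id]


a, b, c, d = (GossipPeer(n) for n in "abcd")
a.full_peers = [b]
a.metadata_peers = [c, d]
d.messages["m1"] = "hi"  # d already got this message by some other route

a.publish("m1", "hi")
```

After the publish, every peer holds the message, but `a` has sent the payload only twice: once to `b` over the full-message link and once to `c`, which asked for it. Peer `d` already had it, so the announcement alone was enough.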

Overall, this enables more dynamic management of how people engage with messages in peer-to-peer networks, how the flow of traffic is made more efficient, and how the content users see and share is what they’re most interested in.

How DefraDB Uses LibP2P

Due to DefraDB’s use of Merkle CRDTs, the addressing process diverges slightly from the original intended goals of a content-addressable data store.

Most notably, instead of each record having a CID (Content Identifier) that changes every time the record is updated, Merkle CRDTs create a single Merkle DAG that evolves over time, and DefraDB assigns a static Document Key ID to the entire Merkle DAG. This helps developers because, much like with traditional document stores, they can request data with a single ID.
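
The contrast between a changing head CID and a static document key can be sketched in Python. This is an illustration under several assumptions, not DefraDB's implementation: the "CID" is a plain SHA-256 digest of canonical JSON, the history is a simple linear chain with last-write-wins merging (real Merkle CRDTs also handle concurrent branches), and the `Document` class and the `doc-key-123` key are made up for the example.

```python
import hashlib
import json


def cid_of(node: dict) -> str:
    """Simplified content identifier: SHA-256 of the node's canonical JSON."""
    encoded = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Document:
    """A document with a fixed key whose history is a hash-linked chain of updates."""

    def __init__(self, doc_key: str):
        self.doc_key = doc_key  # static: never changes
        self.blocks = {}        # cid -> node
        self.head = None        # cid of the latest node; changes on every update

    def update(self, delta: dict) -> str:
        """Append a node that links back to the current head, then move the head."""
        parents = [self.head] if self.head else []
        node = {"delta": delta, "parents": parents}
        cid = cid_of(node)
        self.blocks[cid] = node
        self.head = cid
        return cid

    def state(self) -> dict:
        """Replay deltas from the oldest node to the head (last write wins)."""
        chain, cid = [], self.head
        while cid is not None:
            node = self.blocks[cid]
            chain.append(node["delta"])
            cid = node["parents"][0] if node["parents"] else None
        merged = {}
        for delta in reversed(chain):
            merged.update(delta)
        return merged


doc = Document("doc-key-123")
first_head = doc.update({"name": "Ada"})
second_head = doc.update({"city": "London"})
current = doc.state()
```

The head changes with each update while `doc.doc_key` stays the same, so an application can keep asking for the same key and still be pointed at the latest state.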

However, to meet the requirements of verification, we need to know the current CID of the head node in the Merkle-CRDT DAG. DefraDB uses the LibP2P Pub/Sub messaging system to let peers in the network collaboratively share and synchronize their updates and what the current Head of the CRDT Merkle-DAG is.

Any node that is interested in changes to a particular Document Key can join that document’s publish/subscribe channel and track the document’s head updates. Once a peer receives a newly proposed Head, it resolves it with the usual content-addressable functions of the network (since the new Head is just the CID of the latest node in the Merkle DAG). Any peer that already has the entire Merkle CRDT up to the given Head can fulfill a request from any other peer for the requested Head, and the requesting peer can then verify that state.
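
The catch-up step described above might look roughly like the following Python sketch. It assumes a simplified model: plain dictionaries stand in for each peer's block store, a "CID" is a SHA-256 digest of canonical JSON, and the function names are invented for this example. It is not DefraDB's actual synchronization code.

```python
import hashlib
import json


def cid_of(node: dict) -> str:
    """Simplified content identifier: SHA-256 of the node's canonical JSON."""
    encoded = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append(store: dict, head, delta: dict) -> str:
    """Add a node on top of `head` and return the new head CID."""
    node = {"delta": delta, "parents": [head] if head else []}
    cid = cid_of(node)
    store[cid] = node
    return cid


def sync_to_head(local: dict, remote: dict, new_head: str) -> int:
    """Fetch every node reachable from new_head that the local store lacks.

    Each fetched node is re-hashed and rejected if it does not match the CID
    it was requested under. Returns the number of nodes fetched.
    """
    fetched, pending = 0, [new_head]
    while pending:
        cid = pending.pop()
        if cid in local:
            continue  # already held, and its history is assumed to be held too
        node = remote[cid]
        if cid_of(node) != cid:
            raise ValueError(f"node {cid} failed verification")
        local[cid] = node
        fetched += 1
        pending.extend(node["parents"])
    return fetched


# Peer A holds three updates; peer B has only seen the first one.
store_a, store_b = {}, {}
h1 = append(store_a, None, {"title": "draft"})
store_b[h1] = store_a[h1]
h2 = append(store_a, h1, {"title": "review"})
h3 = append(store_a, h2, {"title": "final"})

# B learns of the new head (for example from a pub/sub message) and catches up.
fetched = sync_to_head(store_b, store_a, h3)
```

Because every node is addressed by the hash of its content, peer B does not need to trust peer A: a node that has been tampered with no longer matches the CID it was requested under and is rejected.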

Clients that do not operate DefraDB peers can still interact with the network using a client library, which communicates over LibP2P or gRPC. Both methods allow clients to query the data much like any other database. One notable distinction is that clients may query the database directly from their local DefraDB instance, instead of going through an intermediary API. This is possible because DefraDB is user-centric, and all access control is based on user ownership of data, which is facilitated through keypairs.

This is starkly different from traditional databases that rely on client tokens or basic username/password authentication for access, an approach that wouldn't be secure if access were given directly to users' client devices.

Networking for the Next-Generation of Applications

DefraDB's strategic use of Peer-to-Peer (P2P) networking architectures, built on the LibP2P library, is another example of how our database and tech stack empowers a user-centric, decentralized path forward for applications and their data. By leveraging P2P networks, we've made DefraDB accessible to anyone who wants the benefits of decentralization — from improved security free of single points of failure to collaborative communication and resource sharing, cost-effectiveness, and enhanced reliability, speed, and efficiency.

Our integration of LibP2P's components, like its Publish/Subscribe system, exemplifies the dynamic and user-centric vision we have for customizable online interactions, particularly in future-facing industries like edge compute, smart cities, and smart industries (including AI). We believe in the potential of sophisticated network architecture that gives people more control, and even joy, over how their data is used, stored, and managed within the environments they engage with every day.

If you’re ready to connect DefraDB to your application to manage your application’s data — and scale your product — contact us on our website.

Explore our GitHub and developer portal for additional documentation on our technology.

Thanks for reading! — Source Network
