
Are You Being Served? The Issue with Indexers

Source Team

Thought Leadership

// October 03, 2024

To achieve a truly resilient and free open internet that restores privacy and sovereignty, you need to decentralize every aspect of its function. A chain is only as strong as its weakest link. We all know by now the threat of centralized control and how blockchain acts as a countervailing force against systems ruled by unilateral authority.

We also know that true decentralization is more than just blockchain - it involves data management through databases, distributed server architecture, computing power, relayers, content delivery networks, identity, governance mechanisms and more. These components often use blockchain as a trust mechanism (e.g. Source Network’s SourceHub), but together they also cover every other function of the internet. The data stored on the blockchain or blockchains is crucial to how the larger system works, like blood circulating around the body.

Efficient Data Access Difficulties

Accessing that data, however, is hard. Blockchains by their nature append data sequentially, so finding the one record you need means searching through everything that came before it. Even casually scouring a block explorer for a single transaction made by a single wallet takes some effort. Expand that to an app which needs real-time data based on ever-growing blockchain events, or which has to dig deep into the transaction log history for multiple disparate data points, all of which are needed fast to execute a given function - and you begin to see the problem. Data retrieval from blockchains is slow, expensive, and complex. Having developers manually sift through every block to find the data they need is completely unviable: applications cannot be built, let alone run, at the speed they need to thrive.

You need, of course, indexing. Think of Google’s empire: it was built on the fact that it indexed the internet’s information better than its competitors - and that was essentially the only competitive advantage it needed to become what it is today. It is still better at it than everyone else, 25 years later - and look at the funnel through which most people access the internet, and the power that grants it. Indexing is a skill - and an important one. Indexing just means organization: taking raw data and making it easy to query, search, find, and ultimately use what you need.
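To make the difference concrete, here is a minimal Python sketch comparing a full scan of sequentially appended blocks with a lookup in a prebuilt index. The block layout, the field names (`number`, `txs`, `sender`) and both helper functions are hypothetical and chosen only for illustration; they do not reflect any real chain's format or any particular indexer's API.

```python
from collections import defaultdict

# Hypothetical, heavily simplified chain: blocks appended in order,
# each holding a list of transactions.
blocks = [
    {"number": 1, "txs": [{"sender": "alice", "to": "bob", "value": 5}]},
    {"number": 2, "txs": [{"sender": "carol", "to": "alice", "value": 2},
                          {"sender": "alice", "to": "dave", "value": 1}]},
    {"number": 3, "txs": [{"sender": "bob", "to": "carol", "value": 4}]},
]


def scan_for_sender(chain, sender):
    """Without an index: visit every transaction in every block, on every query."""
    return [(block["number"], tx)
            for block in chain
            for tx in block["txs"]
            if tx["sender"] == sender]


def build_sender_index(chain):
    """With an index: one pass over the chain, grouping transactions by sender."""
    index = defaultdict(list)
    for block in chain:
        for tx in block["txs"]:
            index[tx["sender"]].append((block["number"], tx))
    return index


sender_index = build_sender_index(blocks)

# Both approaches return the same records; the index answers later
# queries with a dictionary lookup instead of a walk over the whole chain.
alice_via_scan = scan_for_sender(blocks, "alice")
alice_via_index = sender_index.get("alice", [])
```

The scan repeats its full pass for every query, while the index pays that cost once up front. The trade-off is that whoever builds and serves the index now sits between the application and the raw data.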

We take indexing for granted every time we open a document and look for a relevant section - the words you see on the page are searchable only due to the work of librarian bots organizing behind the scenes. But indexers aren’t just for humans trying to find information; they also serve applications, websites, web services and everything in between whenever they query datasets. Often a company indexes its own data and makes it accessible via an API. Yet for things like blockchain (just like the early internet before it), you need third party indexers.

Why Centralized Indexers are a Problem

See where the problem emerges? Just as Google dominates search and thus has incredible degrees of censorship, control, and kingmaking ability, so the same is true of blockchain indexers. Let’s be clear, these services are vital - we are not casting aspersions. We are merely highlighting that if you rely on the priest to translate liturgical Latin, you can’t be sure if the messages conveyed are those of your God or those of the priest himself. The priest may be accurate with their translation and conscientious with their message. Or they may not, and use it to foster a culture of indulgences. Sometimes, you really do need to shoot the messenger.

Blockchain indexers like Infura, QuickNode, and Covalent do outstanding work, but as long as they remain centralized and a source of trust for the data they deliver, their power accumulates and they become more open to corruption. We may trust these services, but they are for-profit enterprises and the very definition of trusted third parties - and that runs directly against the mission of building a decentralized, trustless internet. They are a single point of failure that has snuck in. Just as Google may push a certain webpage to the top, so one of these indexers could conveniently ignore or promote blockchain data that serves its goals.

Practical Problems with Third Party Indexers

Quite apart from ideological concerns, there are serious practical ones. An indexer, like any other centralized entity, can be hacked or otherwise compromised, potentially ruining or at least inflicting downtime on the thousands of apps and services it serves. What’s more, the mass profusion of L2 and L3 blockchains all needing these indexers’ APIs results in a cost spike that harms the viability of new applications, as the drag on their bottom line becomes too punitive. Niche chains or app-chains will have to rely on their own solutions, which can be impractical at best and simply impossible at worst. This creates a market with barriers to entry so high that they stifle true innovation and lead to the opposite of the open internet we are seeking to build.

“Ah-ha!” you say, “but what about decentralized indexers?” The most prominent, The Graph, is indeed a great potential solution to many of these problems - or at least is marketed as such. Yet The Graph also suffers from the black box problem. Yes, the storage and the serving of the data is decentralized, but the core part of its function - the indexing engine itself, the thing that actually puts the data into those nodes - is completely private. The data it puts into the nodes is not verifiable whatsoever. We don’t have any way of guaranteeing the veracity of the information that The Graph puts into its nodes - we are only able to verify the delivery of it.

Why Data Needs Decentralization at the Source

You see, to decentralize the internet, it needs to be decentralized at the source. Its ground-up construction needs to operate in an always-verifiable fashion. The data itself must be authenticated by its very provenance. That’s what Source Network is building with its suite of tools: a way for data to be organized, managed, retrieved and stored so that anyone can ensure its veracity through their own means and by their own actions, and any app can use it for its own function - without needing the information to be indexed by a third party and served through expensive API requests. The self-sovereign data indexing processes possible with DefraDB ensure dApps and their developers can inspect the indexing process themselves and validate the integrity of the data. The data is not being served through opaque systems, but transparent ones. There are no black boxes in a Source Network constructed stack, only open windows. Yes, to build a fully decentralized open internet, every part must be decentralized - otherwise what’s the point?
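As a rough illustration of what an inspectable indexing process can look like in principle, the Python sketch below re-derives an index from raw events with an open, deterministic function and compares its hash against the index that was served. This is not DefraDB’s or SourceHub’s API: the event fields (`wallet`, `tx_id`) and the functions `build_index`, `digest` and `verify_served_index` are hypothetical names invented for this example.

```python
import hashlib
import json


def build_index(events):
    """Open, deterministic indexing rule: group transaction ids by wallet."""
    index = {}
    for event in events:
        index.setdefault(event["wallet"], []).append(event["tx_id"])
    return index


def digest(index):
    """Hash a canonical JSON encoding so equal indexes yield equal digests."""
    canonical = json.dumps(index, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_served_index(raw_events, served_index):
    """Recompute the index locally and compare it with what was served."""
    return digest(build_index(raw_events)) == digest(served_index)


# Hypothetical raw events, as an application might read them from source data.
raw_events = [
    {"wallet": "alice", "tx_id": "tx1"},
    {"wallet": "bob", "tx_id": "tx2"},
    {"wallet": "alice", "tx_id": "tx3"},
]

honest_index = {"alice": ["tx1", "tx3"], "bob": ["tx2"]}
tampered_index = {"alice": ["tx1"], "bob": ["tx2"]}  # silently drops tx3

honest_ok = verify_served_index(raw_events, honest_index)
tampered_ok = verify_served_index(raw_events, tampered_index)
```

The check is only possible because the indexing rule is public and deterministic. With a private indexing engine, a consumer has nothing to recompute against and can verify only that the data arrived, not that it is right.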

That may sound like hard work, but Source Network’s tools make it much, much easier - and the gains in scalability, interoperability, sovereignty, and economics make it worth it.
