Begun, the data wars have. The last decade of Big Tech fighting to harvest your data was just a skirmish. The real battle begins now. Large tech corporations like Google, Meta, and Microsoft have made sweeping changes to their terms of service (ToS) to legitimize ever-grander larceny of your private information and to wall off their own data output from other players as far as possible.
AI Needs Your Data
AI is driving this. ‘Scale is all you need’ is the current AI credo: the larger the volume of data in the training set, the better the performance of the model. Big Tech is now doing everything it can to pillage larger and larger swathes of data, both from users and from each other, to give its AI models a competitive edge. Of course, there are other ways to refine LLMs and increase performance, but for now we haven’t even begun to find the ceiling of how far pure data volume alone can take AI, and the current route to progress is set.
Which makes the digital landscape even more treacherous for the rest of us. Tech companies already have the ability to capture more data than they currently target: your thumbprint patterns on your smartphone, your eye movements on a webpage through your webcam, the things you type but then delete before you search.
Corporate Overreach
Legalese holds them back, but Big Tech has been the subject of so many data-compliance lawsuits that you have to wonder what they are doing that the authorities don’t yet know about. In the frenzied rush for competitive AI advantage, more than a few ToS may be temporarily junked or altered. A few rules may be bent, or flat-out broken. OpenAI has already been sued for copyright infringement.
They don’t care; they’re financed. Ask for forgiveness, not permission, etc. Words on a page mean nothing without money and the state (and ultimately violence) to back them up. You think the state is going to side with you over Microsoft? With current genpop excitement over AI burning strong, we may all find ourselves handmaidened into a data dystopia even darker than the one we currently reside in, one where every digital action is recorded to feed the Promethean furnace of AI forging.
AI is cool, yes. AI is important, yes. Advancing AI models advances human progress, agreed. Yet we should not erode what is left of the freedom and sanctity of our data to achieve it. Using blockchain as a keystore for verifiable data behavior, in a way that prevents ToS abuse, is possible. Creating data-management technology with decentralized access controls, data permissioning, and authentication to enforce those agreements in smart contracts is key. With its transparent accountancy and immutable history, blockchain is a perfect antidote to this wanton guzzling of personal data by Big Tech, while also offering a theoretical route to crowdsourcing global data at higher volume than ever before to train AI models. But by itself a blockchain is insufficient for total data transparency; we need further full-stack decentralized solutions for all software handling data.
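To make that idea concrete, here is a minimal, hypothetical sketch in Go (not Source Network code) of what “transparent accountancy and immutable history” means in practice: an append-only, hash-chained log of data-access events, where rewriting any past entry breaks verification. The types and field names are illustrative assumptions, not a real API.

```go
// A minimal sketch of an append-only, hash-chained log of data-access events.
// Each entry commits to the previous one, so tampering with history is detectable.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// AccessEvent records who touched which dataset, and why. (Hypothetical fields.)
type AccessEvent struct {
	Actor    string // e.g. a public key or DID
	Dataset  string
	Purpose  string
	Time     time.Time
	PrevHash string // hash of the previous entry, forming the chain
}

// Hash commits to the entry's contents and its predecessor.
func (e AccessEvent) Hash() string {
	h := sha256.Sum256([]byte(e.Actor + e.Dataset + e.Purpose + e.Time.UTC().Format(time.RFC3339) + e.PrevHash))
	return hex.EncodeToString(h[:])
}

// Log is the append-only history. On a real chain this would be replicated and
// agreed on by consensus rather than held in a single process.
type Log struct{ entries []AccessEvent }

func (l *Log) Append(actor, dataset, purpose string) {
	prev := ""
	if n := len(l.entries); n > 0 {
		prev = l.entries[n-1].Hash()
	}
	l.entries = append(l.entries, AccessEvent{actor, dataset, purpose, time.Now(), prev})
}

// Verify walks the chain and confirms no entry has been rewritten.
func (l *Log) Verify() bool {
	prev := ""
	for _, e := range l.entries {
		if e.PrevHash != prev {
			return false
		}
		prev = e.Hash()
	}
	return true
}

func main() {
	var history Log
	history.Append("did:example:model-trainer", "wearables/heart-rate", "train-model-v2")
	fmt.Println("history intact:", history.Verify())
}
```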
Source Network Makes Data Sacred
Source Network’s tools allow for radical data ownership, both personal and professional. AI developers and blockchain protocols can use DefraDB, LensVM, the Orbis Secrets Engine, and the SourceHub blockchain to decentralize their tech stack. Access Control Policies (ACPs) deployed on SourceHub are the decentralized trust mechanism for interacting with the modular DefraDB database. This stack lets projects manage data storage, access, and use without ever needing to touch AWS, GCP, Azure, or any other potential vector for data harvesting by overreaching Big Tech companies to which you’ve surrendered custody of your data, willingly or otherwise.
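As a rough illustration of how such an access-control policy gates access, consider the sketch below. It is plain Go, not DefraDB’s actual policy syntax or client API, and the role and permission names are assumptions; the point is that access decisions resolve against relations registered for each document, not against whoever happens to operate the infrastructure.

```go
// An illustrative approximation of relation-based access control: a document
// names its owner and readers as relations, and a permission check resolves
// against those relations rather than against the database host.
package main

import "fmt"

// Relation ties an actor (e.g. a key or DID) to a role on one document.
type Relation struct {
	DocID string
	Role  string // "owner" or "reader" in this toy example
	Actor string
}

// policy maps permissions to the roles that satisfy them.
// Here: owners may read and write; readers may only read.
var policy = map[string][]string{
	"read":  {"owner", "reader"},
	"write": {"owner"},
}

// allowed answers: may this actor perform this permission on this document?
func allowed(rels []Relation, actor, docID, perm string) bool {
	for _, role := range policy[perm] {
		for _, r := range rels {
			if r.DocID == docID && r.Actor == actor && r.Role == role {
				return true
			}
		}
	}
	return false
}

func main() {
	rels := []Relation{
		{DocID: "doc-1", Role: "owner", Actor: "did:example:alice"},
		{DocID: "doc-1", Role: "reader", Actor: "did:example:bob"},
	}
	fmt.Println(allowed(rels, "did:example:bob", "doc-1", "read"))          // true
	fmt.Println(allowed(rels, "did:example:bob", "doc-1", "write"))         // false
	fmt.Println(allowed(rels, "did:example:host-operator", "doc-1", "read")) // false: hosting is not access
}
```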
DefraDB is infra-agnostic, however. If you are still only on the path to decentralization, you can deploy it comfortably in a cloud environment. Of course, if it runs on a centralized server, the owner of that server can restrict access to the instance or terminate it, but they will not be able to read the encrypted database it stores.
The rise of edge computing for AI, and Big Tech’s use of IoT devices to capture yet more unwarranted information to train its models, poses another threat. Source Network alleviates it: DefraDB’s modular, deployable database, backed by SourceHub, means data on edge devices can be maintained locally instead of being sent to a central server. Participating edge devices can willingly contribute data to a given model, with the data provided used only for that model, rather than every digital fingerprint a device makes being sold in batches on the data market to third-party AI companies seeking to exploit it.
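The shape of that opt-in flow, as a hypothetical Go sketch (the types and field names are assumptions for illustration, not DefraDB’s edge API): readings stay in a local store on the device, and only the fields the owner has explicitly consented to share ever leave, and only for the named model.

```go
// A sketch of the opt-in edge pattern: data is recorded locally, and only
// explicitly consented fields are exported, only to the consented model.
package main

import "fmt"

// Reading is one locally captured data point on the edge device.
type Reading struct {
	Fields map[string]float64
}

// Consent records which fields the device owner shares, and with which model.
type Consent struct {
	Model  string
	Fields []string
}

// LocalStore keeps everything on the device; nothing leaves by default.
type LocalStore struct {
	readings []Reading
	consent  []Consent
}

func (s *LocalStore) Record(r Reading) { s.readings = append(s.readings, r) }

// Export returns only the consented fields for the requesting model;
// any other model, or any non-consented field, gets nothing.
func (s *LocalStore) Export(model string) []map[string]float64 {
	var grant *Consent
	for i := range s.consent {
		if s.consent[i].Model == model {
			grant = &s.consent[i]
			break
		}
	}
	if grant == nil {
		return nil
	}
	var out []map[string]float64
	for _, r := range s.readings {
		row := map[string]float64{}
		for _, f := range grant.Fields {
			if v, ok := r.Fields[f]; ok {
				row[f] = v
			}
		}
		out = append(out, row)
	}
	return out
}

func main() {
	store := LocalStore{consent: []Consent{{Model: "air-quality-v1", Fields: []string{"pm25"}}}}
	store.Record(Reading{Fields: map[string]float64{"pm25": 12.4, "gps_lat": 51.5}})
	fmt.Println(store.Export("air-quality-v1")) // only pm25 leaves; location stays on the device
	fmt.Println(store.Export("ad-profiler-v9")) // no consent, no data
}
```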
A Better Data Market
Source Network creates hope of a more utopian, transparent market for data, one where companies pay users for access, and that access is explicitly and narrowly defined, with permissions maintained on a decentralized ledger ordained by a global network. This won’t slow AI’s advance; it will speed it up.
Source Network is not in the business of actively creating those marketplaces, though. Rather, we have created the tools that give application builders the freedom to determine their own market for their data, and their own path to its distribution. We don’t want to compete with The Graph; we want to create the tools so that anyone can compete with it, or with other data-focused projects, by changing the fundamental conditions of the marketplace from data harvesting to data contribution.
Rather than the current horror of business overreach, civil suits, reckless intrusion into private data, corporate espionage, and ever more intrusive centralized software built to capture and exploit as much data as possible, we could have a future where an individual’s or company’s data can be used to train models, but they are compensated for it. Yes, this application does sell your data, and every time you use it you receive value in return. No, that application does not use your data at all, and you, the end user, can verify that on-chain.
In extreme cases? Sure, you can track my eye movements, read every word I type, even the ones I delete, and feed my browsing history and page-click patterns into real-time AI models, but in return you give me a Universal Basic Income of, say, 1 $mBTC a month, and I can at least comfort myself that I am advancing the greater AI good, and that Roko’s Basilisk might spare me in the end times.
Digital Public Goods
These kinds of digital public goods, built on digital public licensing and managed by crypto with its access-control and value-storage properties, are the utopian future driving the huge upward asset trajectory of AI companies and crypto tokens right now. Yet that divine vision turns hellish very fast if appropriate human safeguards, governed by the immutable code of the blockchain, are not put in place.
We can do it; we have the tools. Source Network has built them. By ensuring every layer of every modern tech application is decentralized, we can birth AI models as a digital public good and shut down the grim reaping of Big Tech’s data-harvest strategy. What’s more, we give everyone both a stake in and a chance to be rewarded by the growth of LLMs and the use of our data in building them. We are the source of the resistance in the data wars. We are equipping all builders with the tools to own their data forever: a global consensus that demands peace and prosperity in exchange for the world’s newest, most vital resource.