Why The Semantic Web Needs a Blockchain Stack

Data recently surpassed oil in value, and enterprises are scrambling for their share of the wealth. But the data is stuck. Mid-sized companies each use an average of 200 SaaS apps. Online surveillance leaks additional data from both business and consumer apps. Without direct, intelligent, and secure information sharing across applications and organizational boundaries, the “new oil” stays out of reach for all but a few giants.

This logjam is also a sign that the client-server model of Web 2.0 has reached the apex of its evolution. Web 3.0, or the Semantic web, would mitigate the chaos by creating a universal ecosystem in which all data is vetted, shared and reused. Instead of pinging between clients and server farms in scattershot silos, applications would operate in the transparent, trustworthy way necessary for machine-to-machine communications to thrive.

So why isn’t the Semantic web here? Despite progress toward machine-readable standards such as XML and knowledge graphs, there is still no trusted way to secure the integrity of data across organizational boundaries. Blockchain, combined with cryptography, standardization and cross-boundary query protocols, offers a way to bridge the trust gap and finally take the internet where it needs to go.

A Glimpse at the Semantic Web

The Semantic web is the necessary next step in the evolution of the internet. The first iteration of the web gave us widespread access to information. Web 2.0, the Platform web, evolved when we realized that we could build complete infrastructures atop this information: SaaS, IaaS, the app economy, etc.

The Semantic web will be the first internet based around machine-to-machine (M2M) rather than human interaction. Next-gen technologies such as autonomous vehicles, AI assistants, machine learning and cryptocurrencies rely on the efficacy of M2M communications. It will also serve the routine needs of data-driven organizations by unifying data assets, allowing secure sharing and collaboration across enterprise, supplier and governance walls, and enabling relationships within bloated application stacks.

Whereas today’s Web 2.0 databases are one-to-many (one application serving many parties), the Semantic web would enable seamless many-to-many relationships between multiple data consumers, data sources, and integration procedures or platforms. In other words, the Semantic web would realize Web 2.0’s broken promise of data-first architecture and a data-driven, interconnected approach.

It’s Not Here Yet

Web 3.0 has never fully materialized since the internet giants took control and centralized data in the 2000s. It did, however, spark a fervent following in standardization, with the goal of making all data on the web ‘machine-readable.’ For example, web ontologies help describe categories (collections) universally; RDF is an atomic format that can underlie and describe any data type; and SPARQL is a query language for RDF that was built to natively query multiple data sources.
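To make the idea of an atomic, queryable format more concrete, here is a minimal sketch in plain Python. It is not RDF or SPARQL itself: the triples, the `ex:` identifiers, and the `match` and `supplier_locations` helpers are all invented for illustration. It only shows the general shape of the approach, in which every fact is a (subject, predicate, object) statement and one query can span several sources.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two hypothetical data sources that mention the same supplier.
ERP_SOURCE = [
    ("ex:order-17", "ex:placedWith", "ex:acme"),
    ("ex:order-17", "ex:totalUSD", "1200"),
]
REGISTRY_SOURCE = [
    ("ex:acme", "rdf:type", "ex:Supplier"),
    ("ex:acme", "ex:locatedIn", "ex:Ohio"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def supplier_locations(*sources: Iterable[Triple]) -> Dict[str, str]:
    """Join across sources: order -> supplier -> supplier location."""
    merged = [t for src in sources for t in src]
    result: Dict[str, str] = {}
    for order, _, supplier in match(merged, p="ex:placedWith"):
        for _, _, place in match(merged, s=supplier, p="ex:locatedIn"):
            result[order] = place
    return result


print(supplier_locations(ERP_SOURCE, REGISTRY_SOURCE))
# {'ex:order-17': 'ex:Ohio'}
```

Neither source alone can answer where an order's supplier is located; the answer only appears once the two sets of statements are combined, which is the kind of cross-source query these standards were designed to support.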

This catalyzed deep analysis initiatives with knowledge graph technology surrounding data in health care, industry and information science. Many mistake the Semantic web for merely the ‘standardization project’ of the 2000s, but these standards were a means to a greater end. The true Semantic web, a universally data-driven ecosystem in which all data can be shared and reused across application, enterprise and community boundaries, hasn’t taken complete form. And with emerging applications in ML and AI, a Semantic web of information readable by machines is still very much needed.

Stack, Meet Semantic

The guts of the Semantic web would look different from the current model. Developers will build thin, lightweight application layers to consume and manipulate data rather than building around the client-server stack. Apps will automatically share with each other for better collective decision-making. They will contribute to and leverage trusted sets of shared data without relying on the messy status quo of replication, data lakes, API bloat and redundant harmonization procedures.

This data-first approach to building applications and workflows will live in a semantic ecosystem of multiple data sources. Users will have clear visibility, ownership and stake in the data and identity they produce and use, not some opaque version provided by a centralized authority. Applications with permission to access data across business units, companies, industries and the world can run ad hoc queries at will and combine data across their connected repositories.

Essentially, the Semantic web will connect all “things,” data sources and computers. It will do so using the languages, structures and general frameworks that are vital to successfully connecting the world’s data.

The Hurdle to Mass Adoption

Trust is a massive hurdle to truly secure data collaboration across entities. Opening up repositories of data must be met with a plan to secure information at the source (and not defer security to APIs or rebuild security at various data lakes). How are companies and individuals expected to openly expose their data for the readable web if it could be easily manipulated?

In early 2009, Bitcoin introduced a powerful concept: “In Code We Trust.” Through the combination of ordered cryptography and computational decentralization, Bitcoin showed the world that it is possible to inject trust into exposed information in an open transactional environment.
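The core of that idea can be sketched in a few lines. The example below is a deliberately simplified illustration, not Bitcoin's actual data structure: it leaves out proof-of-work, digital signatures and the peer-to-peer network, and the function names (`block_hash`, `append`, `verify`) and payload fields are assumptions made up for this sketch. It shows only how chaining each record to the hash of the previous one makes later tampering detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, payload: dict) -> str:
    """Hash a block's payload together with the previous block's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add a block that commits to everything that came before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload, "hash": block_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Recompute every hash; any edited payload or broken link fails."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(prev, block["payload"]):
            return False
        prev = block["hash"]
    return True


chain: list = []
append(chain, {"from": "alice", "to": "bob", "amount": 5})
append(chain, {"from": "bob", "to": "carol", "amount": 2})
print(verify(chain))  # True

chain[0]["payload"]["amount"] = 500  # tamper with history
print(verify(chain))  # False
```

In a real network, many independent parties hold copies of the chain and run this kind of check themselves, which is where the decentralization half of the trust model comes in.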

In many ways, this concept had the power to close the trust gap that stood between the Semantic web and mass adoption. A trusted digital mechanism for securing the integrity of data through irrefutable cryptography, combined with standardization and cross-boundary query protocols, would surely enable a more intelligent web framework.

Blockchain’s Data Problem

The machine-readable web requires more than cryptocurrency’s specialty in moving assets. In order to query and leverage data as a readable and malleable asset to power applications, blockchain must manage all data in a usable format for applications. 

For example, public blockchains are optimized and designed for very simple asset transfers, satisfying only minimal data management needs. But what happens when we try to retrofit the same blockchain format to enterprise apps, such as a supply chain ERP application that manages purchase orders, stakeholders, RFID info and more? Building out a blockchain solution that natively handles data and metadata becomes more challenging in three respects: development, managing integration between proprietary technologies, and aligning information security across the entire stack.

Project managers have discovered that since most blockchains connect through the “business logic” tier of the application stack, they still need to push data and metadata related to blockchain transactions to a centralized, static database system. Because the data is not stored completely on-chain, teams end up maintaining an integration layer that should not be necessary (or, as some put it, “trying to force blockchain into a stack that simply rejects it”).

Blockchain Data Integration: The Missing Link

How can we meld data with blockchain to create a workable infrastructure for the Semantic web? The answer lies in creating a single, queryable blockchain data store upon which developers can build a unified and immutable data set. This would open up the ability to build access controls as coexistent rules directly alongside data, give consumers greater control over data and identity ownership, and enable tamper-resistant technologies. Humans and machines will be able to access a broader set of information to make well-informed decisions. Organizations will form data networks in consortium efforts toward greater industry interoperability. 
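As a rough illustration of access controls living directly alongside the data, the sketch below keeps records and permission rules in the same append-only log and checks the rules at query time. Everything here is hypothetical: the `Ledger` class, its methods, the role names and the purchase-order fields are assumptions for this example, not the API of any particular blockchain database, and a real system would also cryptographically link the entries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Ledger:
    """Append-only log that holds both data records and access rules."""

    entries: List[Dict[str, Any]] = field(default_factory=list)

    def add_record(self, collection: str, record: Dict[str, Any]) -> None:
        # Store a copy so later changes to the caller's dict do not alter the log.
        self.entries.append({"kind": "data", "collection": collection, "record": dict(record)})

    def grant(self, role: str, collection: str) -> None:
        # The rule is just another entry, stored next to the data it governs.
        self.entries.append({"kind": "rule", "role": role, "collection": collection})

    def _allowed(self, role: str) -> Set[str]:
        return {
            e["collection"]
            for e in self.entries
            if e["kind"] == "rule" and e["role"] == role
        }

    def query(self, role: str, collection: str, **filters: Any) -> List[Dict[str, Any]]:
        """Return matching records, but only if the role has been granted access."""
        if collection not in self._allowed(role):
            raise PermissionError(f"{role} may not read {collection}")
        return [
            dict(e["record"])
            for e in self.entries
            if e["kind"] == "data"
            and e["collection"] == collection
            and all(e["record"].get(k) == v for k, v in filters.items())
        ]


ledger = Ledger()
ledger.add_record("purchase_orders", {"id": 17, "supplier": "acme", "status": "open"})
ledger.add_record("purchase_orders", {"id": 18, "supplier": "globex", "status": "closed"})
ledger.grant("auditor", "purchase_orders")

print(ledger.query("auditor", "purchase_orders", status="open"))
# [{'id': 17, 'supplier': 'acme', 'status': 'open'}]
```

Because the rule travels with the data rather than sitting in a separate API layer, any application that reads the ledger is subject to the same permissions; a role without a grant gets a `PermissionError` instead of results.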

With data now valued above oil, the need to leverage, secure, trust and share it is critical. Blockchain technology is the missing step toward interconnecting, democratizing and trusting data. As organizations need to collaborate around shared sets of information at an accelerated rate, Web 3.0 technologies will be pivotal in facilitating intelligent data exchange at scale.
