The use of graph databases and the popularity of graph analysis have been rapidly growing in recent years. Technology companies such as Google, Facebook and Twitter have built and now use their own proprietary graph databases. Google created a graph database for Knowledge Graph, a project designed to build "a massive graph of real-world things and their connections." Facebook uses graph technology to power its Graph Search engine and its social graph. Twitter uses graph technology to power the Twitter interest graph, a graph database used for user-interest modeling and analysis.
Recent years have seen an explosion of technologies for managing, processing and analyzing graphs. This has been popularized by leading social web properties like Facebook and LinkedIn, together with Google and Twitter. Therefore as you might expect, popular awareness of graph databases has largely focused around the social graph, and various social uses. In parallel to this buzz though, more than 30 of the Global 2000 have in the past 18 months quietly been deploying graph databases across quite a broad range of business critical use cases. While some are social, the majority actually aren’t.
Graph databases are based on graph theory and store data as structures of nodes, edges and properties. They are designed to find connections between entities, and for queries that traverse those connections they are generally faster than traditional relational databases. Graph databases are considered to be one of the best methods of modeling and querying connected data. Graph analysis is used to understand and visualize these connections and can provide insights into the strength or weakness of relationships between entities. Real-world use cases for graph databases and graph analysis include network impact analysis, social graphs, graph search, recommendations, sentiment analysis, personalization, fraud detection, risk management, geographic routing and logistics.
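To make "finding connections between entities" concrete, here is a minimal Python sketch that is not tied to any particular graph database. It stores a small, made-up social graph as an adjacency list and uses breadth-first search to find the shortest chain of relationships between two people. The sample names and the `shortest_connection` helper are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical undirected "knows" graph: each person maps to direct connections.
GRAPH = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
    "frank": [],  # an isolated node with no connections
}


def shortest_connection(graph, start, goal):
    """Return the shortest path from start to goal as a list of names, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])  # each queue entry is a full path from start
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of relationships links the two entities


path = shortest_connection(GRAPH, "alice", "erin")
```

A real graph database would run this kind of traversal inside its query engine, but the idea is the same: follow edges outward from one entity until the other is reached.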
Picture Puzzle and CIE Chromaticity Diagram were built by Yu-Sung Chang. From the Wolfram Demonstrations Project. Image credit: Wolfram
Graph databases and graph analysis can be used by nearly all industries, including social media, telecommunications, healthcare, financial services, transportation, retail and education. In an April interview, Neo Technology CEO Emil Eifrem said:
Graph databases are becoming a defining theme for data management, and we are witnessing their rapid mainstream adoption by industries across the board—not just by telcos. As the graph model offers impressive expressiveness and unparalleled flexibility and insight, today’s rapid uptake of graph databases is powering all kinds of solutions. From social and recommendations to risk and portfolio management and governance, from data center management to supply chain and logistics and beyond, graphs are everywhere and are taking over the world.
Last year Glassdoor, an online job search engine and career community, incorporated a graph database to provide members with real-time job recommendations. In March, Cray CEO Pete Ungaro reportedly said that a Major League Baseball team had bought a Cray supercomputer, which can quickly process vast amounts of data. Just last month, ProgrammableWeb reported that JustGiving, a leading social and charity giving platform, is building the new GiveGraph engine, which should be completed by the end of 2014. According to JustGiving, GiveGraph is the "world's largest graph of giving behavior and contains 44 million people, [tens of thousands] of causes and 111 million connections."
There is even an annual conference that focuses solely on graph databases and applications. GraphConnect 2014, presented by Neo Technology (developer of Neo4j), features several Neo4j training courses, and speakers include Neo Technology's Eifrem and chief scientist Jim Webber, eBay's Volker Pacher and Elementum's Shashank Tiwari. The conference will take place in San Francisco at the SFJazz Center on Oct. 22.
Below are a few examples of companies using graph technology. These companies were chosen to show a sampling of the market and also because they provide APIs.
Lumiata, a leading predictive health analytics provider, is using graph technology to create what it describes as the "world’s first medical graph." The company gives health care organizations the ability to optimize patient care using "medical science-based graph analytics," which includes metrics such as time, location, behavior and pathophysiology.
Early last year, Lumiata announced that it had raised $4 million in Series A financing from Khosla Ventures. The funding will be used to develop the company's medical graph-based predictive analytics engine. The Lumiata platform can be integrated with existing health care systems and legacy systems using the Lumiata Graph API. The platform also provides ready-to-use reference apps for health systems, payers/insurance and digital health.
MusicGraph is a graph database consisting of more than 7 billion music facts and connections that can be accessed programmatically using the MusicGraph API. Senzari, a leading music technology company, launched the MusicGraph platform and API in December, allowing developers to incorporate MusicGraph functionality such as graph search, playlisting, social URLs and metrics, and musical features into third-party applications.
- Graph Search: Graph Search is a set of APIs (Artist, Album, Track and Look-ahead Querying) that allow applications to provide users with precise music search queries that return results based on MusicGraph's "rich music ontology." This is similar to Facebook’s Graph Search except it shows rich connections in the realm of music.
- Playlisting: The Playlisting API allows applications to incorporate smart playlists and more precise music recommendations.
- Social URLs and Metrics: The MusicGraph Social Data APIs return artist metrics and artist social URLs. There is also a Track Metrics API, which is in early beta and provides play counts and view counts for music tracks from popular music services such as LastFM and Vevo.
- Musical Features: The Musical Features API allows developers to integrate basic lyrics analysis functionality into applications.
Neo4j is a popular open source graph database written in Java that scales up to several billion nodes and relationships and features true ACID transactions, a powerful traversal framework for high-speed graph queries and much more. Data is stored in a graph that contains nodes and relationships, both with properties. This type of data storage is known as a property graph and is ideal for storing richly connected data. Neo4j allows individual developers, startups, SMBs and enterprises to create their own graph databases and graph database-driven applications.
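The property graph model described above can be sketched in a few lines of Python. This is an illustrative in-memory model only, not Neo4j's storage format or API. The `Node`, `Relationship` and `PropertyGraph` names, the labels and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int      # id of the node the relationship leaves
    end: int        # id of the node the relationship points to
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory property graph: nodes and relationships both carry properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)
        return self.nodes[node_id]

    def relate(self, start, end, rel_type, **properties):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id, rel_type=None):
        """Return relationships leaving node_id, optionally filtered by type."""
        return [
            r for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node(1, ["Person"], name="Alice")
graph.add_node(2, ["Company"], name="Acme")
graph.relate(1, 2, "WORKS_AT", since=2012)
```

The point of the model is that the `WORKS_AT` relationship holds its own data (`since=2012`) instead of that data living in a separate join table.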
The platform can be accessed programmatically using the Neo4j REST API, and an object-oriented Java API is also available. The REST API can transmit all responses as JSON streams, provides access to Neo4j's built-in graph algorithms, supports querying with Cypher and can perform many other functions. The Neo4j API is described in the documentation as "a truly RESTful interface relying on hypermedia controls (links) to advertise permissible actions to users."
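As a rough sketch of what querying with Cypher over REST can look like, the Python below builds (but does not send) an HTTP request carrying a Cypher statement as JSON. The `/db/data/transaction/commit` path, the `statements` payload shape and the `{name}` parameter syntax are assumptions based on Neo4j 2.x-era documentation and may differ in other versions. The server URL, the `build_cypher_request` helper and the query itself are hypothetical.

```python
import json
from urllib.request import Request


def build_cypher_request(base_url, statement, parameters=None):
    """Build a POST request for an assumed Neo4j 2.x transactional Cypher endpoint."""
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    return Request(
        url=base_url.rstrip("/") + "/db/data/transaction/commit",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json; charset=UTF-8",
        },
        method="POST",
    )


# Hypothetical query: find the names of people a given person knows.
request = build_cypher_request(
    "http://localhost:7474",
    "MATCH (p:Person)-[:KNOWS]->(friend) WHERE p.name = {name} RETURN friend.name",
    {"name": "Alice"},
)
```

Passing the request to `urllib.request.urlopen` would send it to a running server. Keeping the statement and its parameters separate, rather than formatting values into the query string, avoids injection problems.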
Tykli is a retrieval and knowledge management platform that uses a proprietary graph analysis algorithm to generate a property graph representing new, relevant relationships between nodes in the source graph. This process allows companies to organize and analyze data, creating a "new data order" that allows for an improved search experience. Real-world use cases for the Tykli platform include e-commerce websites, corporate portals, media archives and enterprise DMS.
The Tykli API provides programmatic access to the platform. It can be used to process and update data and to consume data from the Tykli platform for use in third-party websites and applications. The API is RESTful and returns responses in JSON format. API endpoints include retrieving the number of connected documents, retrieving the details of a term by ID, retrieving the list of categories and retrieving related terms.
Wolfram Mathematica is advanced computational software that uses the Wolfram programming language and is described on the company's website as providing "a single integrated, continually expanding system that covers the breadth and depth of technical computing." Mathematica can perform functions across all areas of technical computing, including networks, geometry, visualization and machine learning. The platform can be used for document processing and is capable of extensive data, tree and graph analysis.
Wolfram released Mathematica 10 earlier this month, announcing the addition of 700-plus functions that are a combination of "completely new areas and directions" such as geometric computation, machine learning and geographic computation. This latest release also features a new API framework for exchanging data with external services. Mathematica also includes built-in programmatic access to the Wolfram Alpha API.
The Internet of Things is quickly leading to the Internet of Everything, with millions upon millions of devices, platforms and other physical objects becoming interconnected. Traditional databases are not designed to handle the vast amounts of interconnected data being generated today.
Using the power of graph databases and graph analysis, companies can leverage IoT-generated interconnected data to obtain valuable insights about their business, their customers and their future.
See also ProgrammableWeb's roundup of machine learning and predictive analytics platforms.