ProgrammableWeb's New API Directory Data Model Explained

ProgrammableWeb has made some significant changes to the data model behind its directories. We also have more directories as a result of this upgrade. Whereas we once had four directories, we now have six: APIs, SDKs, Sample Source Code, Libraries, Frameworks, and Mashups (Web and mobile apps).

Libraries and Frameworks (previously grouped into the same directory as SDKs) have each become first-class citizens with their own directories, leaving SDKs with a directory of their own as well. But where our new model really shines is in the variety of new metadata that we’ll be keeping in each of the directories, in how our directories and our editorial content will be connected under the hood, and in the sort of power searching, research, and ProgrammableWeb API that they’ll facilitate further down the road.

Editor’s Note: This article is “under construction” (last modified on July 8, 2016) and does not yet cover the entirety of our new data model. We will continue to update it until it's finished. Additionally, we are constantly applying small nips and tucks to our data model that aren't reflected in this text yet (but will be!).

The Background

It’s been about two years since we replatformed ProgrammableWeb.com, an undertaking that proved to be significantly more daunting than we originally imagined because of all the integrity that had to be maintained across the site’s various data sets and logs. When we first began that exercise, we contemplated some fixes to our directory’s underlying data model; changes that would better reflect current and future trends in the API economy.

For example, more Web and mobile apps and their developers are consuming APIs through platform-specific software development kits (SDKs) rather than interacting directly with APIs in their code. SDKs -- whether offered by an API provider or by third-party developers -- are therefore now standard fare for developer portals. At least the good developer portals. So, to better serve both the developers and the API providers hoping to service them, it was clear to us that we had to do a better job of recording those SDKs and making them easily searchable on ProgrammableWeb.

Another example has to do with the proliferation of JavaScript APIs that are just as likely to be mashed into a Web application as Web APIs. It seems like only a few years ago that AJAX was the primary agent of interactivity within a browser-based application. Today, however, developers can choose from a plethora of standard browser-based API specifications from the World Wide Web Consortium (like the Media Capture and Streams API), proprietary APIs that are specific to browsers like Chrome, Firefox, and Edge, and APIs that you can add through frameworks like Dojo. ProgrammableWeb had to find a way to account for these different types of APIs that weren’t necessarily your grandmother’s plain old Web API.

A third type of API we’re seeing more of is the kind that comes built into a product that you typically install; for example, SugarCRM. Whereas one company’s implementation of the API that comes with its SugarCRM installation would constitute a Web API (since it has its own unique endpoint), the standard generic SugarCRM API that all such endpoints comply with is what we think of as a “Product API.” In fact, our new data model now accounts for five different API types (discussed later).

The more we considered the types of changes to our data model that made the most sense, the more we realized that we were talking about far more of an undertaking than a few nips and tucks. The sort of data model we had in mind was so vastly different from what was in place at the time that designing and migrating to a new one needed to be an epic project unto itself.

As the new model evolved and took on several important nuances, it became quite evident that we’d have to author a detailed explanation, especially for those who are looking to add an asset to any of our six directories through one of our forms for APIs, SDKs, Sample Source Code, Libraries, Frameworks, and Mashups (Web and mobile apps). As it turns out, the data model behind ProgrammableWeb can also be very useful in terms of visualizing how the API economy itself is organized.

How Content Is Connected Under ProgrammableWeb’s Hood

ProgrammableWeb has two primary sides to it: Editorial and Directory. Prior to our replatforming exercise in 2014, the two sides operated quite independently of one another. For example, editorial content was classified according to a different set of categorical tags than directory content was. Today, there is a single list of categories with which all content -- editorial or directory -- is classified. These categories serve as one of the spines of ProgrammableWeb. For example, from our Mapping Category page, you can jump to any editorial content (articles) or directory assets that are also tagged for that category.

Example of the types of information available through a ProgrammableWeb category page

This idea of organizational spines is critical to ProgrammableWeb because our categories are not the only spine to the site. Another spine to which a lot of our editorial and directory assets can be connected is the Platforms/Languages spine. Whereas the Category spine contains content categories like Mapping, Social, and Weather, our Platforms/Languages spine contains classifications for all the major platforms and languages to which editorial content (e.g., a JavaScript tutorial) and assets like SDKs, Libraries, and Sample Source Code can be attributed. This will not only enable ProgrammableWeb to build out centers of gravity for languages and platforms like JavaScript, Ruby, PHP, etc., it will also enable some pretty fancy power searches and API interactions down the road.

In the end, the idea is for you, the ProgrammableWeb community members, to be able to pivot your view of our content according to any perspective that’s important to you, regardless of whether your point of entry is our Web UX, power search, future mobile applications, research charts (yes, think spreadsheet data pivots!), or our yet-to-be-published API. In addition to Categories and Platforms/Languages, some of our other spines on which you’ll be able to pivot our content include Companies, Products, and most importantly (and true to our roots), APIs.
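To make the idea of pivoting on a spine more concrete, here is a minimal Python sketch. It is only an illustration: the item titles, the field names (`categories`, `platforms`), and the `pivot` helper are all hypothetical and do not describe ProgrammableWeb's actual schema or its unpublished API.

```python
from collections import defaultdict

# Hypothetical content items; titles and field names are illustrative only.
ITEMS = [
    {"title": "Example Maps REST API", "kind": "api",
     "categories": ["Mapping"], "platforms": []},
    {"title": "Example Maps SDK for Ruby", "kind": "sdk",
     "categories": ["Mapping"], "platforms": ["Ruby"]},
    {"title": "A JavaScript weather tutorial", "kind": "article",
     "categories": ["Weather"], "platforms": ["JavaScript"]},
]


def pivot(items, spine):
    """Group item titles by each value they carry on the given spine."""
    index = defaultdict(list)
    for item in items:
        for value in item.get(spine, []):
            index[value].append(item["title"])
    return dict(index)


# The same pool of editorial and directory items, viewed along two spines.
by_category = pivot(ITEMS, "categories")
by_platform = pivot(ITEMS, "platforms")
```

Because editorial and directory items share the same tags, a single index per spine is enough to jump from, say, Mapping to every article, API, and SDK tagged with it.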

APIs: The Epicenter of ProgrammableWeb

For ProgrammableWeb, pretty much all roads lead to APIs. When our editorial content mentions specific APIs, you should be able to quickly find your way to those API directory entries. When you find the SDKs you like, you’ll see (and can find your way to) the APIs they abstract. Our Sample Source Code directory will invariably involve code samples that consume one or more of the APIs that can be found in our directory (either directly or through SDKs). The entries in our Library directory are platform-specific solutions for providing one or more of the APIs in our API directory. And the frameworks in our Framework directory are often a prerequisite to the libraries, sample source code, and, in a few cases, the SDKs featured in our other directories.

So, APIs truly live at the epicenter of ProgrammableWeb. Web-based APIs are what turned the Web into a programmable platform and are also what inspired the birth of ProgrammableWeb in 2005. But, over the years, the Web as a programmable platform has grown to include other important technologies and types of APIs. As we’ve watched the API economy mature, we’ve recognized several opportunities to improve on the data that we’re tracking for each of the assets in our various directories.

With APIs being central to everything we do, the API directory has taken on some serious improvements with more in the works. Whether you are about to input a new API into our directory through our Add API Form, or you’re just trying to get a better understanding of what APIs are and how to think about them, this article will serve as some great background. Additionally, all of our forms have extensive help text that echoes much of the content found in this document. Just mouse over the question icon to the right of each field to activate the help text as shown below:

Help text showing the different types of APIs on ProgrammableWeb

Here’s a summary of the fields we track in our API directory along with some of the thinking that went into those fields. We will no doubt run into edge cases that stretch our thinking about this structure. Please do not hesitate to contact us at editor@programmableweb.com with comments, feedback, and suggestions regarding the choices we’ve made.

Uniquely Identifying APIs

One area where we are changing our approach has to do with what counts as a distinctly separate API. As we look to the future, it’s imperative that we catalog the various versions and architectural styles of APIs correctly. In the past, we kept one record for every API regardless of the number of versions or supported architectural styles. For example, if Acme API Provider offered the same data or functionality through both a RESTfully-styled API and an XML-RPC Web services-styled API, ProgrammableWeb typically recorded that as a single API that supported multiple protocols.

Moving forward, those will be treated as separate APIs because, in reality, they are separate APIs. By consolidating them into one record, we experienced all kinds of downstream inelegance that made no sense. For example, in all cases like the previous example, if the API provider supported two or more architectural styles like REST and RPC, the provider also offered a different API endpoint (the Internet address where the API could be reached) for each of those styles. Yet, our schema only allowed for one endpoint, forcing us to pick one.

We could have solved the problem with a typical one-to-many table structure in our database. But, ultimately, they are different APIs, often involving different metadata. So, they deserve to be recorded as such.

Additionally, the various versions of APIs were starting to throw us for a loop. When the API economy first got started, there wasn’t a lot of thought being put into API, SDK, or Library versioning (an art unto itself). There were just a lot of version 1.0s of a lot of APIs, SDKs, and Libraries. In many cases, when API providers decided to upgrade to a new API or SDK, they’d just break the old API much to the disappointment of many developers. Over a decade later, hard lessons with respect to proper API and SDK versioning have been learned. For example, lessons about how not to alienate developers with sudden API changes that unexpectedly break their apps.

As a result, many API providers now stagger their API versions, each of which typically lives at a different endpoint. Before shutting down the old version of an API or SDK, API providers will often run the new version alongside it for a year or more in order to give developers plenty of time to migrate. In fact, this is a strongly suggested best practice. In other words, the API provider is supporting two or more versions of an API or SDK at once. In some cases, an even older version of an API could still be operational, but unsupported by the provider.

ProgrammableWeb’s directory had no way of dealing with this more modern scenario of API lifecycle management. So, in the same way that ProgrammableWeb treats the same API under two different architectural styles as two physically different APIs, we will do the same for multiple versions. Once an API provider officially puts an API out to pasture, we will retire its record from all default views. But for those APIs that are still being phased out, we will keep their records active while marking them as deprecated, and eventually their records will point to the newest recommended version.
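The following Python sketch illustrates one way such records could be modeled: one record per architectural style and version, each with its own endpoint, a lifecycle status, and a pointer to the recommended version. The class names, field names, status values, and `example` URLs are assumptions made for illustration, not ProgrammableWeb's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    ACTIVE = "active"
    DEPRECATED = "deprecated"  # being phased out, still listed
    RETIRED = "retired"        # shut down by the provider, hidden by default


@dataclass
class ApiRecord:
    name: str
    style: str
    version: str
    endpoint: str
    status: Status = Status.ACTIVE
    recommended: Optional["ApiRecord"] = None  # newest recommended version


def default_view(records: List[ApiRecord]) -> List[ApiRecord]:
    """Return the records shown by default: everything except retired ones."""
    return [r for r in records if r.status is not Status.RETIRED]


# Hypothetical provider with two styles and two REST versions: three records.
rest_v2 = ApiRecord("Acme", "REST", "2.0", "https://api.acme.example/v2")
rest_v1 = ApiRecord("Acme", "REST", "1.0", "https://api.acme.example/v1",
                    Status.DEPRECATED, recommended=rest_v2)
rpc_v1 = ApiRecord("Acme", "XML-RPC", "1.0", "https://rpc.acme.example/v1",
                   Status.RETIRED, recommended=rest_v2)

visible = default_view([rest_v2, rest_v1, rpc_v1])
```

Keeping one endpoint per record avoids the old problem of having to pick a single endpoint for an API that is offered in several styles or versions.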

API Name

Moving forward, as we complete the migration of our data model from old to new, the official title of each directory entry will be algorithmically derived from a combination of the API’s name, its architectural style, the word “API” (to distinguish it from some other type of asset like an SDK), its version, and in some cases, some special text that notes it is an unofficial API. For example:

  • Concatenation Logic: {API Name} {Architectural Style} API v{version number}
  • New Relic REST API v2.0
  • Twitter Streaming API v1.1
  • Unofficial Spotify REST API v1.0

In the first two cases, the actual API Names are “New Relic” and “Twitter” respectively. In the third, the API Name is “Spotify” and “Unofficial” is the special text noting that it is an unofficial API. API Name is a required field on our API entry forms and, per the help text for that field, the name of the API in most cases should include the name of the company at the beginning. “API” should not be included in the name you provide and, very importantly, your API’s different resources should not be treated as separate API directory entries (yes, some API providers have tried this!).
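The concatenation logic above can be sketched in a few lines of Python. The function name `build_title` and its parameters are hypothetical; treating the version as optional and stripping a leading “v” are assumptions based on the description of the Version field, not a statement of how ProgrammableWeb actually implements it.

```python
def build_title(name, style, version=None, unofficial=False):
    """Derive a directory title: [Unofficial] {name} {style} API [v{version}]."""
    parts = []
    if unofficial:
        parts.append("Unofficial")  # special text for unofficial APIs
    parts.extend([name, style, "API"])
    if version:
        # A leading "v" or "V" is not needed in the field, so strip it if given.
        parts.append("v" + version.lstrip("vV"))
    return " ".join(parts)


titles = [
    build_title("New Relic", "REST", "2.0"),
    build_title("Twitter", "Streaming", "v1.1"),
    build_title("Spotify", "REST", "1.0", unofficial=True),
]
```

With these inputs the sketch reproduces the three example titles listed above.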

API Provider’s Home Page

Today, we are looking for this field to be manually filled out with the URL to the home page of the company or organization that is providing the API. This is typically different from any of the API-specific pages (home page to the developer portal, documentation, etc.). Over time, as we evolve our company/organization vocabulary (one of several vocabularies that are part of ProgrammableWeb’s taxonomy) and we look to associate APIs with the companies and organizations that are providing them, this field will be automatically populated on the basis of the associated company/organization and the home page data found in that vocabulary.

API Portal / Home Page

This is the URL to the main landing page for the API. Basically, this is the API's "home page" that points to all other relevant pages (documentation, registration, API console etc.) for this specific API. This is a required field.

API Endpoint

Mainly for RESTful APIs (but also for some others), this is the base URL used in API calls. This may not be applicable to all APIs and is therefore not a required field. For example, if the API's type is “Browser” (see below regarding API types), then the API very likely does not have a URL endpoint. Another example is product APIs, where each installation of a product that comes with APIs (SugarCRM, for example) will have a completely different endpoint.

Version

Though this is not a required field, we strongly recommend filling it in since most APIs eventually get versioned. A “V” or “v” for “Version” is not necessary. As said earlier, since API providers often operate more than one version of an API at the same time, each version of an API should be recorded separately in our directory. The contents of this field are also used to algorithmically derive ProgrammableWeb’s title for each directory entry.

API Type

The API’s type is one of the most important new fields that we’re using to help better classify APIs, thereby facilitating more interesting queries (through power search and ProgrammableWeb’s own API, two features on our road map). API Type is a required field that’s powered by a drop-down with five choices: Browser, Product, Standard, System/Embedded, and Web/Internet.
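As a rough illustration of the field rules described so far (API Name, API Portal / Home Page, and API Type are required; API Endpoint and Version are optional), here is a minimal Python sketch of a form check. The dictionary keys, the `validate_entry` helper, and the `example` URLs are hypothetical; only the five type names come from the drop-down described above.

```python
from enum import Enum


class ApiType(Enum):
    BROWSER = "Browser"
    PRODUCT = "Product"
    STANDARD = "Standard"
    SYSTEM_EMBEDDED = "System/Embedded"
    WEB_INTERNET = "Web/Internet"


# Fields the article describes as required; endpoint and version are optional.
REQUIRED_FIELDS = ("name", "portal_url", "api_type")


def validate_entry(entry):
    """Return a list of problems found in a submitted entry (empty if none)."""
    problems = [f"missing required field: {field}"
                for field in REQUIRED_FIELDS if not entry.get(field)]
    valid_types = {t.value for t in ApiType}
    if entry.get("api_type") and entry["api_type"] not in valid_types:
        problems.append(f"unknown API type: {entry['api_type']}")
    return problems


# A Browser API with no endpoint is acceptable; an unknown type is not.
ok_entry = {"name": "Acme", "portal_url": "https://developer.acme.example",
            "api_type": "Browser"}
bad_entry = {"name": "Acme", "api_type": "Desktop",
             "endpoint": "https://api.acme.example"}

ok_problems = validate_entry(ok_entry)
bad_problems = validate_entry(bad_entry)
```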

After many years of running our directories and watching where the Web as a programmable platform is heading, we believe that in order to best serve ProgrammableWeb’s audience, it’s important to support all of the different types of APIs that can be mashed together into something new and innovative. Web and mobile app developers, as it turns out, don’t always rely on Web APIs alone to build their applications. Now, thanks to modern Web and mobile development technologies and platforms, more APIs of different types can join the party. With these five types, we believe we can capture more than 99 percent of the APIs that developers might be looking for in the course of building their apps.  

It is important to note that we had to wrestle with some difficult decisions when it came to distinguishing between these different API types. The resulting classification is based on our many years of recording information about APIs in our directories and is mainly designed to facilitate the long-term utility of our database to the API economy. Here’s a deeper look at each of the five API types:

A Web/Internet API is pretty straightforward. It is a networkable API that is addressable via a unique endpoint from across the Internet. It is agnostic to architectural style (see “Architectural Style” below), supported payload types (see “Supported Request and Response Formats” below), or even the underlying protocols. The vast majority of the APIs in ProgrammableWeb’s directory are of this type.

In recent years, we have noticed how many products are shipping with APIs bundled into them. As such, we have set aside an API type of Product. If the API is networkable, but it's built into a product in such a way that those who install the product inherit all aspects of the API for their own implementation of that product, then it is a Product API. Examples are the networkable APIs that come bundled into products like SugarCRM. In our API directory, there should be one entry for the SugarCRM API as a Product API. It would not have an endpoint since the API’s endpoint doesn’t technically exist until the software is installed. Then any organizations that wish to make their implementation of that SugarCRM API discoverable in our directory would add that implementation as a Web/Internet API. Whereas the original SugarCRM API of type Product would likely be titled “SugarCRM REST API v1,” ACME Industrials’ implementation of that API (as a result of installing SugarCRM and activating the API) would be titled “ACME Industrials SugarCRM REST API v1.” Alternatively, ACME Industrials might elect to call its implementation of the SugarCRM API something completely different. For example, ACME Industrials Customer List API.

Thanks to the 1-2-3 punch of HTML5, CSS3, and JavaScript, browser-based applications are nipping at the heels of their native counterparts in terms of performance, functionality, and usability. A major contributor to that technical prowess is a relatively new class of standard and proprietary JavaScript APIs that respectively come from the World Wide Web Consortium (W3C) and browser makers like Google, Mozilla, Microsoft, Apple, and Opera. These are Browser APIs.

We believe the distinction between W3C browser APIs and proprietary (browser-specific) APIs is very important. For example, if a developer wants their Web app to be able to check the battery status of the local device (regardless of whether it’s a desktop, tablet, smartphone, or some other device with a browser), there’s a W3C API specification for that: the Battery Status API. The same JavaScript code that calls the Battery Status API on one browser works on every other browser and platform (Windows, OS X, Android, iOS, etc.) that supports the W3C specification. However, not all JavaScript APIs have been standardized by the W3C. Some are still only being discussed among browser makers as the basis for a possible new standard (in other words, they haven’t even entered the W3C stream of consciousness yet), while others are simply proprietary to one or two browsers. When relying on those APIs, developers have to take special precautions in their code to avoid calling them when an end user is on an unsupported browser.

Our long term goal at ProgrammableWeb is to make it easy to research all of these browser-based APIs and to easily discover which ones have W3C support and which ones are proprietary to specific browsers.

Though rare, you should select the Standard API type when the API in question is a standard (or proposed standard) non-browser API specification that other APIs might comply with. Examples of this are Google's Mobile Data Plan API Specification or the various OpenStack API Specifications for which there are multiple implementations. Another example is the CWR file standard which was created by CISAC for publishers and societies to share discographic data. Eventually, an API specification for working with this standard turned up and it is this API spec that would be an API of the type "Standard" as well. Any Web/Internet or Product APIs that comply with such a standard specification should be entered into the API directory separately.

For example, whereas the open specification (just the specification) for an OpenStack API would be classified as a Standard API, Rackspace’s implementation of that API in the cloud should be viewed as a Web/Internet API and Red Hat’s implementation of that API that comes with the locally installable [Product name here] should be viewed as a Product API. Why? Whereas Rackspace’s implementation involves a single shared Web-based endpoint for all developers, Red Hat’s implementation might involve separate endpoints for every installation of its products. Furthermore, a body of source code that developers can use to provision an OpenStack-compliant API as part of any software they’re developing should be entered into ProgrammableWeb's Library Directory and then linked to the entry for the corresponding OpenStack Standard API in ProgrammableWeb’s API Directory.
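The relationships in this OpenStack example can be sketched as linked directory entries. This Python sketch is purely illustrative: the `Entry` class, the `complies_with` field, and the entry titles are hypothetical placeholders and are not drawn from ProgrammableWeb's actual data model or directory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    title: str
    directory: str                       # "API" or "Library"
    api_type: Optional[str]              # None for non-API assets
    complies_with: Optional[str] = None  # title of a Standard API entry


SPEC = "OpenStack API Specification"  # placeholder title for the standard

ENTRIES = [
    Entry(SPEC, "API", "Standard"),
    Entry("Rackspace hosted implementation", "API", "Web/Internet", SPEC),
    Entry("Red Hat product implementation", "API", "Product", SPEC),
    Entry("OpenStack-compliant server library", "Library", None, SPEC),
]


def linked_to(standard_title: str, entries: List[Entry]) -> List[Entry]:
    """Return every entry that declares compliance with the given standard."""
    return [e for e in entries if e.complies_with == standard_title]


linked = linked_to(SPEC, ENTRIES)
```

The specification itself is entered once as a Standard API, while each implementation and the library are separate entries that point back to it.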

When an API is mashable into a Web app but is specific to a local operating system or device, ProgrammableWeb views that as a System/Embedded API. An example is an API that lets a Web app access the fingerprint sensor on a smartphone. As with cameras and audio, these hardware-related APIs may eventually get covered by the W3C's JavaScript standardization efforts. But until they do, developers must work with proprietary APIs that vary from operating system to operating system and device to device.

Stay tuned as we have many more details to add to this document over the coming weeks and months.
 
David Berlind is the editor-in-chief of ProgrammableWeb.com. You can reach him at david.berlind@programmableweb.com. Connect to David on Twitter at @dberlind or on LinkedIn, put him in a Google+ circle, or friend him on Facebook.
 
