Why Running an API Directory is Harder than it Looks (And Why API Providers Should Care)

Neatly concealed in this tale about researching a financial API for inclusion in ProgrammableWeb's API directory is some prescriptive advice for API providers who want to optimize their developer experiences for discoverability, explorability, and comprehension. Unlike a lot of the other prescriptive content we offer on ProgrammableWeb, this article doesn't spell out exactly what to do. But I think you'll agree that, after following our journey, you'll have some idea of how to minimize the friction in your own developer messaging (as well as a bit more insight into how we do what we do here at ProgrammableWeb!).

Out of ProgrammableWeb's three major functional areas -- our article content, our research, and our directories (like the API and SDK directories) -- we are best known for our directories. They drive over half of our overall traffic. I often describe our directories as the place where the two sides of the API community coin -- the API providers and the API-consuming developers -- congregate to find each other. Thousands of developers from around the world come to our directories every day to look for APIs to include in their next Web, desktop, mobile, or server app. Not coincidentally, in a fish-where-the-fish-are sort of way, ProgrammableWeb's directories are widely acknowledged and promoted as one of the best places for API providers to list their wares and improve their discoverability to developers. It's a symbiotic relationship.

But running directories like ours is more complicated than it looks, at least if you're constantly optimizing to best serve the information needs of that community. Ideally, as suggested by projects like apis.json and the schema.org WebAPI schema, this could all be done by machine, requiring very little human intervention. If the entire API economy could agree to a common machine-readable schema and all API providers could publish something for each API that conforms to that schema, problem solved. Right?

Well, that's a couple of "ifs" right there, and there are many more. This isn't to say that standards can't or won't be useful. In fact, ProgrammableWeb will only get better as such standards take root, because a good half of our battle lies in just finding the assets that should be listed in our directories.
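To make the idea of a common machine-readable schema a bit more concrete, here is a minimal sketch of what checking a provider's description might look like. It uses schema.org's WebAPI type, but the description itself, its URL, and the list of fields a directory would want are all assumptions made up for illustration; neither schema.org nor apis.json is being quoted here as requiring them.

```python
import json

# Fields a hypothetical directory might want from every listing. This list is an
# assumption for illustration, not a requirement of schema.org or apis.json.
WANTED_FIELDS = ("name", "description", "documentation", "provider")

# A made-up description using schema.org's WebAPI type; all values are placeholders.
description = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "WebAPI",
  "name": "Example Payments API",
  "description": "Placeholder description of a fictional API.",
  "documentation": "https://api.example.com/docs"
}
""")


def missing_fields(doc, wanted=WANTED_FIELDS):
    """Return the wanted fields that are absent or empty in a description."""
    return [field for field in wanted if not doc.get(field)]


print(missing_fields(description))
```

Even in this toy case the description omits a field the directory wants, which is the kind of gap a human ends up filling in when a provider publishes something incomplete.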

But the minute you dig into the real information needs of an end user looking to take advantage of such directories, you begin to discover all kinds of nuances and edge cases that are not only difficult to capture in such standards, but that can foil the sort of searches and queries the standards were meant to enable in the first place. Offering the best possible outcomes to developers and API providers means that humans still have to keep close watch over the user interface and solve for great experiences.

Do you remember the days when there was a relatively simple approach to informing a search engine like Google or Yahoo about your website? You just loaded your webpages with HTML meta tags. Search engines would crawl them and, automagically, there was nothing left to do, right? Search engines worked perfectly. OK. Maybe not. It wasn't long before that approach proved insufficient for a variety of reasons, not the least of which was how meta tags were gamed in hopes of achieving better search engine rankings. The fact that Google, Bing, Yahoo and other search engines still exist is essentially proof that the Web search experience is too easily corrupted if left to some basic algorithms and schemas with no human intervention in the final outcome: the search result.

If we could reduce our directory to a flat table of the most meaningful fields, we'd do it. But the nature of the API economy and the tastes of its constituents are in constant flux. Capturing the meaningful data in a way that results in meaningful outcomes ultimately falls back to the maintenance of a highly relational database (RDBMS) that catalogs a healthy amount of metadata per asset and that's underpinned by a series of human-controlled vocabularies. The relational nature is extremely important when you start to think about how the more mature API providers are running multiple architectural styles (e.g., REST and RPC) for each of their APIs, and then, for each of those styles, have multiple versions operating in various stages of "production": everything from a beta version through multiple "current releases" to their retired (but not necessarily deactivated) versions. Or how about the simple one-to-many relationship of APIs to SDKs? If, for example, you're looking for all of the SDKs for the Twitter API, regardless of whether those SDKs are from Twitter itself or some third party, we've got you covered. Just go to our profile for the Twitter API and click on the SDKs tab.
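The relationships described above can be sketched with a few tables. This is only an illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical and are not ProgrammableWeb's actual schema or data.

```python
import sqlite3

# Hypothetical schema for illustration only; not ProgrammableWeb's real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE apis (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One API can be offered in several architectural styles,
-- and each style can have several versions in different stages.
CREATE TABLE api_versions (
    id      INTEGER PRIMARY KEY,
    api_id  INTEGER NOT NULL REFERENCES apis(id),
    style   TEXT NOT NULL,   -- e.g. 'REST' or 'RPC'
    version TEXT NOT NULL,
    status  TEXT NOT NULL    -- e.g. 'beta', 'current', 'retired'
);
-- One API can have many SDKs, from the provider or from third parties.
CREATE TABLE sdks (
    id       INTEGER PRIMARY KEY,
    api_id   INTEGER NOT NULL REFERENCES apis(id),
    name     TEXT NOT NULL,
    provider TEXT NOT NULL
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO apis (id, name) VALUES (1, 'Example API')")
conn.executemany(
    "INSERT INTO api_versions (api_id, style, version, status) VALUES (?, ?, ?, ?)",
    [(1, "REST", "v1", "retired"), (1, "REST", "v2", "current"), (1, "RPC", "v1", "beta")],
)
conn.executemany(
    "INSERT INTO sdks (api_id, name, provider) VALUES (?, ?, ?)",
    [(1, "example-python", "Example Inc."), (1, "example-go", "Third Party Dev")],
)


def sdks_for_api(connection, api_name):
    """Return (sdk name, provider) pairs for an API, whoever published the SDK."""
    rows = connection.execute(
        "SELECT s.name, s.provider FROM sdks s "
        "JOIN apis a ON a.id = s.api_id "
        "WHERE a.name = ? ORDER BY s.name",
        (api_name,),
    )
    return rows.fetchall()


print(sdks_for_api(conn, "Example API"))
```

A flat table would have to repeat the API's details on every version and SDK row; keeping them in separate tables is what lets one profile page gather every style, version, and SDK for an API.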

Where do human-controlled vocabularies come in handy? Never mind all the unique categories, architectural styles, and supported request and response formats that we have to keep track of (and keep unique); you can imagine how easily we might get thrown for a loop when the terms "SDK", "library", "client", and "binding" are all used to refer to the same exact type of asset. To help our readers actually find what they're looking for -- like some language-specific tooling to more easily consume an API in the language of their choice -- our directory has to be very opinionated when it comes to the unique classification of that tooling (regardless of what the provider of the tooling calls it). Provider X may call that asset a library. But we call it an SDK and classify it accordingly.
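As a rough sketch of how such a controlled vocabulary might be applied, the snippet below maps whatever label a provider uses onto a single canonical term. The mapping and the function name are hypothetical; they are meant to illustrate the idea, not to describe how our directory is actually implemented.

```python
# Hypothetical controlled vocabulary: each canonical term lists the provider-side
# labels that should collapse into it. The entries are illustrative only.
CONTROLLED_VOCABULARY = {
    "SDK": {"sdk", "library", "client", "binding"},
}


def classify_asset(provider_label):
    """Map a provider's label to the directory's canonical term, or None if unknown."""
    normalized = provider_label.strip().lower()
    for canonical, synonyms in CONTROLLED_VOCABULARY.items():
        if normalized in synonyms:
            return canonical
    return None  # unknown labels are left for a human editor to review


for label in ["SDK", "Library", "client", "binding", "widget"]:
    print(repr(label), "->", classify_asset(label))
```

Returning None for an unrecognized label, rather than guessing, reflects the point that a human still has to decide where new terminology belongs.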
