Reaching a Million APIs and What to do When We Get There

Guest Author
Jun. 05 2012, 08:27AM EDT

This guest post is by Steven Willmott. Steven is the CEO of 3scale networks, a company that provides infrastructure services for over 100 APIs. The content draws on a presentation made at Gluecon in May 2012.

The number of APIs available across the public Internet has grown phenomenally in the past few years – with ProgrammableWeb’s directory now reaching 6,000 listed APIs, 1,000 of which were added in just the last three months.

While this number is large, it remains tiny compared to the many millions of sites that make up the World Wide Web. However, as code frameworks in major languages and platforms of various types make it increasingly easy to launch and operate APIs, it is likely that “web scale” thinking will be needed to manage the resulting API Web.

In a recent talk at Gluecon we discussed how the growth in the number of APIs is likely to accelerate significantly in the coming years, and tried to drill down into a number of areas that will likely see big changes and challenges in a world of millions of APIs. These areas were: 1) Discovery, 2) Selection, 3) Reuse, 4) Engagement, and 5) Monitoring.

While some of the topics that follow have obvious parallels with today’s Web, others are different in structure and potentially require some new thinking on the evolution of the Web.

Discovery

The early Web transitioned quickly from a “directory” paradigm to a “search” paradigm when it came to finding websites of interest. As ProgrammableWeb’s directory grows, it seems likely a similar shift would make sense for APIs. However, there are significant problems in making this happen. In particular, with the well-documented shift from SOAP/XML to REST/JSON APIs, very few APIs on the web now have any form of formal description at all. Formats such as Swagger, Google’s API Discovery Service format and WADL solve part of the problem, but they are not widely adopted, nor do they cover all the elements such descriptions may need (in particular, they tend to provide functional / structural information only). Even worse, unlike the early Web, there is no natural reason for APIs to link to each other – even those organizations that do provide formal descriptions often do not link to them, and there is no common convention on where they may be located on a given domain.

The challenges for discovery are therefore threefold: 1) evolving current description formats – in particular for the REST world, 2) encouraging wider adoption of these formats, and 3) finding a way for descriptions to be more easily discoverable without the need for manual connections. Some of this work is already ongoing – with providers such as ourselves adopting Swagger specs for all APIs (for 3scale Swagger support see here) and Apigee using WADL to power its consoles. While none of this means these description languages are perfect, it should hopefully increase adoption and debate on how the formats should evolve!
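To make the third challenge concrete, here is a minimal sketch of what a naive description crawler might do. The “well-known” paths it probes are purely hypothetical – today there is no agreed convention, which is exactly the gap:

```python
import requests

# Hypothetical "well-known" locations for machine-readable API descriptions.
# No such convention exists today - that absence is exactly the problem -
# so these paths are assumptions for illustration only.
CANDIDATE_PATHS = ["/api-docs", "/swagger.json", "/application.wadl"]

def discover_description(domain):
    """Probe a domain for a formal API description - a sketch of what an
    API crawler might do if description locations were standardized."""
    for path in CANDIDATE_PATHS:
        url = "https://" + domain + path
        try:
            response = requests.get(url, timeout=5)
        except requests.RequestException:
            continue
        if response.ok:
            return url  # a candidate description to fetch and parse
    return None

print(discover_description("api.example.com"))  # hypothetical domain
```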

Selection

The second big challenge area is that, once services have been discovered, it is hard to determine which are fit for purpose and trustworthy. Trust indicators on the Web have evolved significantly over time with domain names, branding visuals, SSL certificates, Google/Alexa ranking and third party references all playing a role in enabling a Web user to evaluate whether or not a website might be trustworthy. Even then, phishing attacks are a regular occurrence.

For APIs the issue is arguably even more important. The use of a rogue API could compromise extremely large amounts of data and transactions, and regular failures of an API can critically endanger the applications that depend on it. However, the normal Web trust indicators are missing or much reduced in an API world. For example, there is no requirement that APIs use current browser root certificates – leading to a potential loss of part of the web of trust. Furthermore, current API infrastructure naturally focuses on identifying the consumer of an API and the rights they have, rather than the identity of the provider of the API – something that will likely need to change as the number of APIs explodes.

The challenge in this area is to find effective means to validate the identity of API Providers as well as API Consumers and associate sufficient levels of trust in automated ways without requiring manual / human intervention at every step.
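As a small illustration of the raw material available today, the sketch below checks an API provider’s TLS certificate against the system’s root CA store. The hostname is hypothetical, and this is only one weak signal of provider identity – but it is the kind of check that would need to be automated and enriched at scale:

```python
import socket
import ssl

def provider_certificate(host, port=443):
    """Fetch an API provider's TLS certificate, validated against the
    system's root CA store - one weak but automatable trust signal."""
    context = ssl.create_default_context()  # system root CAs, hostname check
    with socket.create_connection((host, port)) as sock:
        # The handshake raises ssl.SSLError if the chain is untrusted.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()

# Hypothetical provider hostname; 'subject' and 'notAfter' come straight
# from the validated certificate.
cert = provider_certificate("api.example.com")
print(cert["subject"], cert["notAfter"])
```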

Reuse

Component-based, modular, reusable software is frequently thought of as the holy grail of software engineering. APIs are arguably just the latest iteration of this dream – the hope being that a sea of available services can act as sufficiently simple building blocks for new complex applications (and, in turn, new APIs).

With today’s technology, however, we remain as far from this dream for APIs as we have been in previous technology generations (arguably further, since service descriptions remain so limited in many cases) – yet for APIs to succeed, such reusability is more necessary than ever before.

For current REST APIs, semantics are based to some extent on HTTP, REST principles and data type descriptions (which all provide a baseline), but primarily on a very large amount of developer interpretation. The operational semantics of the interactions are effectively hard-coded into client-side software – making applications brittle and APIs difficult to reuse in any automated way. Arguably this is a little like writing a new browser for every website – a fun exercise for some, but unsustainable at scale.
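To illustrate the brittleness, here is a sketch of a typical hand-written client today – every URL, field name and assumption below is hypothetical and frozen into the application:

```python
import requests

# A sketch of today's typical hand-written client: URL structure, field
# names and semantics (all hypothetical here) are baked into the app.
# If the provider changes any of them, every deployed copy breaks.
response = requests.get("https://api.example.com/v1/orders?status=open")
for order in response.json()["orders"]:
    print(order["id"], order["total"])
```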

Luckily there are two important trends which will likely help ease this problem as the number of APIs scales: 1) richer semantic specifications and 2) convergence of API interfaces themselves.

In the first area, HATEOAS principles (for example) may help build clients in more flexible ways, so that API structure can be dynamically interpreted. Semantic service descriptions such as OWL-S or WSMO may further help define the meaning of interfaces more precisely, and in general data dictionaries / ontologies in OWL / RDFS or some other format may help share common data models. These trends are all positive, but will likely take time to reach forms in which they are widely reusable and efficient. One could characterize this approach as “top down”: deriving all possible interfaces from sets of agreed principles and rules.
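As a contrast to the hard-coded client above, the sketch below shows the kind of client HATEOAS principles point towards: it follows named link relations advertised by the API rather than baked-in URLs. The link format shown ("links" as a list of rel/href pairs) is an assumption for illustration, not a universal convention:

```python
import requests

def follow(entry_url, rel):
    """Follow a named link relation instead of a hard-coded URL."""
    doc = requests.get(entry_url).json()
    for link in doc.get("links", []):
        if link["rel"] == rel:
            return requests.get(link["href"]).json()
    raise LookupError("API does not advertise link relation %r" % rel)

# Hypothetical usage: the client discovers at runtime where "orders"
# lives, so the provider can reorganize URLs without breaking clients.
orders = follow("https://api.example.com/", "orders")
```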

While the top-down approach will take a significant amount of time, it seems likely that in parallel a “bottom up” process will bear more fruit in the short and medium term: “convergence” between API interfaces in form and structure. It seems unlikely that there are 50 (sensible) ways to design an email-send API (to pick an example), and hence there are already recognizable similarities between APIs. What seems likely to occur is the emergence of increasing numbers of reusable conventions and models that reduce the number of distinct interface types even as the number of APIs increases. While copyright issues (see commentary on the recent court case between Oracle and Google) are obviously an important concern here, and companies will often wish to preserve competitive advantage by offering API functionality others do not, there seems to be potential for core concepts to become relatively widely shared – reducing integration cost for everybody.

The challenge in this area would seem to be finding ways to encourage sensible conventions and constructs which reduce the cost of implementing against new APIs as they arrive. The hope is perhaps that a type of “interface commons” can be established that embodies core shared interfaces that are widely re-used, yet allows enough extension mechanisms for companies to differentiate in specialist areas. The alternative of millions of APIs each with radically different interfaces requiring bespoke code is unlikely to provide very much value.

Engagement

Today’s workflow for engagement with an API is typically something akin to: locating the developer portal (developer.xyz.com), signing up and agreeing to terms, obtaining credentials, reading documentation, and then writing custom code to call the API. This process can be better or worse depending on the portal, documentation and workflows in place, but it still requires a developer to jump over a significant number of hurdles before getting any real work done.

In a world of millions of APIs, this process would impose impossible overheads. The development of millions of API portals alone – portals that would require tens or hundreds of millions of developer hours to explore and interact with – seems highly implausible. In reality, the engagement process needs to evolve radically into something much more automated. Upon discovery of an API, it likely needs to be possible to:

  1. Automatically locate code to plug the API into your current application stack,
  2. Identify yourself and the provider,
  3. Engage in a service contract,
  4. Generate access credentials,
  5. Begin making calls,

All without the need for any human intervention – apart, possibly, from approving the process. While this may seem like science fiction, and much of it will not be possible until better description and reuse technologies emerge, it seems clear that API engagement will need to be radically simpler and more automated than it is today.
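A sketch of what such an automated engagement flow might look like is shown below. Every endpoint and field name is hypothetical, since no common signup or contract protocol exists today – which is precisely the point:

```python
import requests

def engage(description_url):
    """Sketch of a fully automated engagement flow; every endpoint and
    field below is hypothetical - no such protocol exists today."""
    desc = requests.get(description_url).json()        # 1. fetch description
    signup = requests.post(desc["signup_url"], json={  # 2-3. identify, agree terms
        "consumer": "com.example.myapp",
        "accepted_terms": desc["terms_version"],
    }).json()
    key = signup["api_key"]                            # 4. credentials issued
    return requests.get(desc["base_url"] + "/status",  # 5. first call
                        headers={"Authorization": "Bearer " + key})
```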

The challenges in achieving this depend on descriptions, code libraries, and the emergence of automated ways to establish the identities of the parties and set up a service contract. Today’s API developer portals are certainly improving and becoming more automated, with providers like ourselves and others pushing the envelope; however, there remains a significant way to go before reaching such levels of automation.

Monitoring and SLAs

With today’s Web, the main protagonists in its day-to-day use are human users – this drives both UI design and our approach to reliability. While site downtime is already a major risk for many Web businesses, human users are in general adaptive and can work around minor issues or changes. This changes radically for APIs, since the data / service consumers are software systems, often with highly specific implementations and often beyond the control of their originators (think of 100,000 iPhone app installs, all of which would need an update if anything changes). This scenario puts a much higher premium on reliability, stability and change management than the current Web requires.

Early APIs were characterized by SLAs that essentially said “Good luck” and little else. While today’s API SLAs are becoming stronger and more sophisticated (see here for a recent post on useful terms for your SLA), developers are often still wary of change. More generally, every use of an external service creates a significant external dependency that must be managed. We are arguably just at the beginning of understanding how to do this at large scale – the interconnected nature of a Web based on APIs is very positive for innovation, but can also lead to complex failure modes.

Monitoring APIs for uptime, downtime and change is inherently sensible, but it is also a significant undertaking for an application developer. The challenge in this area is arguably how to scale monitoring and make it detailed enough – at the level of individual API process steps – to provide validation as strong as that produced by now-standard code test frameworks. Companies such as Web Metrics already provide step-based API monitoring, and multiple vendors provide API uptime monitors, but these will all likely need to evolve over time.
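As a flavor of what step-based monitoring involves, here is a minimal sketch that runs a sequence of checks against hypothetical endpoints and reports status and latency. A production monitor would also validate payloads, latency budgets and schema changes:

```python
import time
import requests

# Hypothetical step definitions: (name, HTTP method, URL, expected status).
STEPS = [
    ("list items",  "GET",  "https://api.example.com/items", 200),
    ("create item", "POST", "https://api.example.com/items", 201),
]

def run_checks():
    for name, method, url, expected in STEPS:
        started = time.time()
        try:
            status = requests.request(method, url, timeout=10).status_code
        except requests.RequestException:
            status = None  # network failure counts as a failed step
        elapsed = time.time() - started
        verdict = "OK" if status == expected else "FAIL"
        print("%-12s %-4s %s (status=%s, %.2fs)" %
              (name, method, verdict, status, elapsed))

run_checks()
```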

Current APIs are very much unique and diverse; however, as the number of APIs grows, it is very likely we’ll need automated means to discover and make use of them.

No doubt some of these thoughts (such as automated engagement of APIs) seem highly speculative. They may indeed take a long time to come to pass; however, what seems certain is that, at a minimum, the sheer volume of APIs will force us to rethink our infrastructure.

Some of the challenges involved are likely to be competitive – with infrastructure providers, open source projects and others striving to find the best solution – but others seem to be largely collaborative, with value coming from consensus and convention in arriving at good shared approaches. We look forward to contributing where we can, and to seeing what solutions emerge!
