How ESPN Quintupled Efficiency in Its Massive, Real-Time API Program

Some of the most exciting, and sometimes most disappointing, APIs are popping up in media. As traditional television looks to reach its new multi-device, multi-platform, commercial-hating, social media-addicted audience, the API gives networks a way to reuse and recycle content that broadens the customer base. Operating in an area completely disrupted by Netflix and its APIs, along with competitors like Hulu and Amazon Fire, broadcast companies are now leveraging next-generation APIs to fight back against these challengers and raise the bar on the competition.

Top global sports enterprise ESPN is one such company that brings together a TV channel, an enormous website, and localized apps for many devices, all in response to a wide range of data caps and bandwidths for mobile.

Senior Director of APIs at ESPN Manny Pelarinos, speaking at last year’s QCon, explains how the company works to leverage APIs “to push data at extreme scale to all of our fans.” We’re talking about millions of concurrently connected fans, with about half a billion API calls each day and about 20,000 requests per second (RPS) during peak times. After all, it all comes down to ESPN’s mission of serving sports fans, anytime, anywhere. They facilitate this using APIs that enable:

  1. Personalization by morphing the website into what you as the viewer find interesting. Pelarinos admitted that this makes caching that potentially massive data a particular challenge.
  2. Globalization, with localization for fans in countries like the U.S., U.K., and India. The ESPN APIs have to support different languages, time zones, and regions.
  3. Product-first design and architecture.

ESPN’s API program started out with a handful of partners for whom it built custom feeds. As demand grew, this approach did not scale. About seven years ago, ESPN released its Version 1.0 REST API, which opened up to around 500 partners. This API was built around reusability: the event API and the content API each worked the same way for every partner. But this method led to relatively large APIs stuffed to the gills with data.

As Pelarinos said, V1 wasn’t truly customizable, and while ESPN was offering everything to its customers, it wasn’t consuming these APIs internally.

So the ESPN Web team requested a different API paradigm, a Version 2.0 that balanced customization with reusability, all built on REST.

When we last spoke about the ESPN API on ProgrammableWeb, we were talking about the decision to deprecate the public API rather than retire it right away, which allowed for backwards compatibility while telling consumers that ESPN would no longer be supporting it. This was back in 2014, when ESPN realized that third-party apps were profiting off its data. Since then the company has made a transition to strategic partnership APIs, which allow it to retain the potential for Big Data monetization. Netflix shut down its public API in a similar way at around the same time, perhaps foreseeing the decline of public APIs as a business strategy.

But the ESPN API platform didn’t stop there.

How ESPN Evolved Its Massive API Platform

The result of this change was the Binder API platform, which split everything into one of two distinct service tiers: core APIs and product APIs, with the latter sitting on top of the former. As you can see in the diagram below, the core APIs sit on top of a data layer. From here, they have a series of APIs for different functions and data collections, like players and teams, that pull from each other. Then, within the product API layer, any developer can build custom APIs that allow for groupings of calls.

“We built a [domain-specific language] DSL and a server that makes it extremely easy for our developers — and not just our API developers, it can have other developers like iOS folks, Android engineers, etc. — in the company to build their own product APIs, so they can do easy things like compositing. They can group together multiple, random API calls, so we can make an API call to the Twitter API and join it with events,” Pelarinos said.
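The compositing Pelarinos describes might look roughly like the sketch below. This is not ESPN's DSL, which the talk does not show. It is a minimal Python illustration in which `fetch_events`, `fetch_tweets`, `team_feed`, and every payload field are hypothetical stand-ins for a core events call and a Twitter call.

```python
import asyncio

# Hypothetical stand-ins for two upstream calls. The real services,
# URLs, and payload shapes are not described in the talk.
async def fetch_events(team_id):
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"id": "e1", "team": team_id, "name": "Season opener"}]

async def fetch_tweets(team_id):
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"event_id": "e1", "text": "Kickoff in ten minutes"}]

async def team_feed(team_id):
    """Composite 'product API': run both calls concurrently, then join them."""
    events, tweets = await asyncio.gather(
        fetch_events(team_id), fetch_tweets(team_id)
    )
    # Index tweets by the event they mention so the join is a dict lookup.
    tweets_by_event = {}
    for tweet in tweets:
        tweets_by_event.setdefault(tweet["event_id"], []).append(tweet["text"])
    # Attach the matching tweets (or an empty list) to each event.
    return [
        dict(event, tweets=tweets_by_event.get(event["id"], []))
        for event in events
    ]

feed = asyncio.run(team_feed("nyy"))
```

Each event in `feed` carries its own fields plus a `tweets` list, which is also how a composite response starts to grow large.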

However, when you composite things, “you’re likely to end up with a big API like we did with V1,” he said, referring to their first API version.

In the case of the ESPN API platform, they got around this by making it easy to trim out the unnecessary fat. They use a DSL to strip out anything the API consumer isn’t interested in calling, allowing for greater developer productivity and platform efficiency. They accomplish this by allowing the API consumers to include specific parameters — like if they only wanted ID, name and date — which by default excludes everything else. Of course, for a larger API and larger chunks of data, the users can just exclude what they don’t want.
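As a rough illustration of that include/exclude behavior, here is a minimal Python sketch. The function names, the comma-separated parameter format, and the sample event are all assumptions made for the example. The article does not say how the Binder DSL actually expresses field selection.

```python
def parse_fields(value):
    """Turn a comma-separated parameter such as 'id,name,date' into a set."""
    return {name.strip() for name in value.split(",") if name.strip()}

def select_fields(record, include=None, exclude=None):
    """Trim a response record.

    If `include` is given, only those fields survive and everything else
    is dropped by default. Otherwise any fields in `exclude` are removed.
    """
    if include:
        return {k: v for k, v in record.items() if k in include}
    if exclude:
        return {k: v for k, v in record.items() if k not in exclude}
    return dict(record)

# Hypothetical event payload, for illustration only.
event = {
    "id": 401,
    "name": "Final",
    "date": "2017-06-12",
    "venue": "Arena",
    "odds": {"home": 1.8, "away": 2.1},
}

trimmed = select_fields(event, include=parse_fields("id,name,date"))
slim = select_fields(event, exclude=parse_fields("odds"))
```

Here `trimmed` keeps only the three requested fields, while `slim` keeps everything except `odds`, matching the two modes described above.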

How can this be used? While Pelarinos admits much of the ESPN data is cacheable, the essential personalization part is not. Since caching every fan's preferences would be impossible, the Binder platform makes a single call to the personalization sub-platform to pull the IDs for an individual user. ESPN has broken the request up so that the fan call comes first, followed by potentially millions of asynchronous calls, because within the five- to 15-second caching period, someone else with overlapping preferences will likely have already requested the same golf results or a certain team’s results. This brief caching period doesn’t put a strain on servers, while still being more than long enough considering the number of calls the ESPN APIs handle.
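The pattern can be sketched in Python as an uncached per-fan lookup followed by concurrent calls that share a short-lived cache. Everything named here is hypothetical: the function names, the sample preferences, and the 10-second TTL (picked from inside the five- to 15-second window mentioned above) are assumptions, not details of Binder.

```python
import asyncio
import time

CACHE_TTL_SECONDS = 10  # assumed value within the 5-15 second window
_cache = {}             # content key -> (expiry time, payload)
upstream_calls = []     # records cache misses, for illustration only

async def fetch_preferences(fan_id):
    # Hypothetical personalization lookup: unique per fan, so never cached.
    await asyncio.sleep(0)
    return {"alice": ["golf", "team:nyy"], "bob": ["golf"]}.get(fan_id, [])

async def fetch_content(key):
    # Hypothetical core API call; the same content is shared by many fans.
    upstream_calls.append(key)
    await asyncio.sleep(0)
    return {"key": key, "headline": f"Latest for {key}"}

async def cached_content(key):
    now = time.monotonic()
    entry = _cache.get(key)
    if entry and entry[0] > now:
        return entry[1]  # still fresh: served without an upstream call
    payload = await fetch_content(key)
    _cache[key] = (now + CACHE_TTL_SECONDS, payload)
    return payload

async def personalized_page(fan_id):
    keys = await fetch_preferences(fan_id)  # the single "fan call"
    # Fan out one asynchronous, cacheable call per followed item.
    return await asyncio.gather(*(cached_content(k) for k in keys))

async def demo():
    first = await personalized_page("alice")
    second = await personalized_page("bob")
    return first, second

alice_page, bob_page = asyncio.run(demo())
```

In this run the second fan's golf request is answered from the cache filled by the first fan, so only two upstream content calls are made for three requested items.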

Pelarinos also put forth the example of the ESPN Now API, which sorts relevant and recent content from disparate sources like social media, news sections, and blogs. In this situation, consumers can make one call to the search engine core API, followed by asynchronous calls for the specific content that they want to pull together. This prevents ESPN from having to cache countless combinations at scale.

The key components of the ESPN API platform based on microservices are as follows:

  • Caching, in three tiers, set up to prevent cache stampedes
  • Asynchronous Programming
  • DSL + Groovy, the latter of which Pelarinos said originally had some performance issues, but now caching mitigates this
  • Tools and Dashboards, including Trace for tracing binder requests for debugging core calls, HAR [HTTP archive utility] to visualize the Trace data, and a combination of Grafana and OpenTSDB for customized metrics data visualization.
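The article does not say how ESPN's cache tiers avoid stampedes. One common technique is request coalescing, where concurrent misses for the same key share a single upstream load. The Python sketch below shows that technique only. The `SingleFlightCache` class and `slow_loader` function are hypothetical names, and expiry and the three tiers are left out for brevity.

```python
import asyncio

class SingleFlightCache:
    """Coalesce concurrent misses so each key is loaded once at a time."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}     # key -> cached value (no expiry in this sketch)
        self._in_flight = {}  # key -> task currently loading that key

    async def get(self, key):
        if key in self._values:
            return self._values[key]
        task = self._in_flight.get(key)
        if task is None:
            # First miss for this key: start the load and publish the task
            # so later callers wait on it instead of hitting the backend.
            task = asyncio.create_task(self._load(key))
            self._in_flight[key] = task
        return await task

    async def _load(self, key):
        try:
            value = await self._loader(key)
            self._values[key] = value
            return value
        finally:
            self._in_flight.pop(key, None)

load_count = 0

async def slow_loader(key):
    # Hypothetical slow backend call; counts how often it is really invoked.
    global load_count
    load_count += 1
    await asyncio.sleep(0.01)
    return f"scores for {key}"

async def demo():
    cache = SingleFlightCache(slow_loader)
    # 100 simultaneous requests for the same cold key.
    return await asyncio.gather(*(cache.get("nba") for _ in range(100)))

results = asyncio.run(demo())
```

All 100 callers receive the same value while the backend is hit once, which is the property an anti-stampede setup is meant to guarantee.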

Finally, as Pelarinos mentioned, ESPN has achieved this entire API transformation by auto-scaling with Amazon Web Services (AWS), so the company is only paying for the infrastructure it needs.

What other API transformations have you seen that we should be writing about? Tell us in the comments below.
