Back in 2008, a skeptical public looked on as Netflix announced plans to move to the cloud. Adrian Cockcroft, Netflix's cloud architect, described the general reaction: "everyone said we were completely crazy [...] They didn't believe we were actually doing that, they thought we were just making stuff up." Despite public opinion, Cockcroft understood that Netflix needed to move away from monolithic, vertically scaled datacenters and toward a cloud-based microservices architecture. The potential benefits of a structural overhaul far outweighed any foreseeable disadvantages. Fast forward roughly eight years, and Netflix has become one of the first major corporations to run entirely in the public cloud. This article from SmartBear looks at how the architectural shift brought significant improvements in performance, development, and scalability, resulting in a remarkable period of growth. Netflix's success with a cloud-based microservices architecture has been so pronounced that, in hindsight, it's hard to imagine Netflix moving in any other direction.
After a single missing semicolon led to a major database corruption in 2008, Netflix understood they had to change their approach. Their vertical, monolithic structure meant that separate sections of code were tightly interwoven; an error in one section could cause problems for the whole system. By segmenting their data into a network of horizontally connected API services, Netflix could reduce both the frequency and the severity of code errors and related issues. This is known as separation of concerns, or as Cockcroft calls it, "Loosely coupled service oriented architecture with bounded contexts". He admits that having more systems means spending more time managing them, but the approach translates into greater system-wide stability. Cockcroft explains that these API services would break at different times, causing smaller, localized problems rather than losing "the whole system at once". This, in turn, meant less manual intervention for Netflix engineers.
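The fault-isolation idea described above can be illustrated with a minimal Python sketch. It is not Netflix's implementation: the service names (`playback`, `billing`, `recommendations`) and the `call_all` helper are hypothetical, and real services would run as separate processes behind network APIs rather than as functions in one program. The sketch only shows the principle that each service is called behind its own boundary, so one failure is contained instead of taking down everything.

```python
from typing import Callable, Dict


def playback() -> str:
    # Hypothetical service: healthy.
    return "playback ok"


def billing() -> str:
    # Hypothetical service: healthy.
    return "billing ok"


def recommendations() -> str:
    # Hypothetical service: simulate a localized failure.
    raise RuntimeError("recommendations backend unavailable")


def call_all(services: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Call every service independently and record a per-service result.

    Each call sits inside its own try/except, so an exception in one
    service is reported as 'degraded' without stopping the others.
    """
    results: Dict[str, str] = {}
    for name, handler in services.items():
        try:
            results[name] = handler()
        except Exception as exc:
            results[name] = f"degraded: {exc}"
    return results


SERVICES = {
    "playback": playback,
    "billing": billing,
    "recommendations": recommendations,
}

status = call_all(SERVICES)
for name, state in status.items():
    print(f"{name}: {state}")
```

In this sketch the recommendations failure is reported as degraded while playback and billing still return normally, which mirrors the "smaller, localized problems" Cockcroft describes.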
This compartmentalized structure also made the development process less intrusive. Netflix was able to build and test new services with relatively low risk of disrupting their existing service. As a result, Netflix was free to diversify their development portfolio, creating 30+ separate engineering teams capable of independently developing and deploying new services on a variety of new platforms. To put that in perspective, prior to the architectural shift, Netflix would "have at least 10 minutes of downtime to put in the new schema," every two weeks.
Yet the cloud migration and horizontal shift offered Netflix much more than stability and flexibility. By using Amazon Web Services (AWS) for their cloud computing, Netflix was able to make major gains in scalability. Previously, Netflix datacenters had proved bulky and slow to adapt. Changes in capacity could take days, and different components could not scale at different rates. Additionally, Netflix was expanding so rapidly that they struggled to build data centers quickly enough to meet user demand. AWS, however, gave Netflix the ability to change capacity within minutes, scale components at different rates, and expand unencumbered. "We don't have to plan capacity in advance, we don't need to ask permission of other people to build things for us, and we don't worry about running out of space or power", Cockcroft explains.
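To make "scaling components at different rates" concrete, here is a small Python sketch of per-component capacity planning. Everything in it is an assumption for illustration: the component names, the load figures, the per-instance capacities, and the 20% headroom are invented, and it does not call any AWS API. It only shows that when each component is sized from its own load, a heavily used component can grow without forcing the others to grow with it.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Component:
    name: str
    requests_per_second: float    # current load (made-up figure)
    capacity_per_instance: float  # load one instance can serve (assumed)


def desired_instances(component: Component, headroom: float = 0.2) -> int:
    """Return how many instances this component needs on its own.

    The load is padded by a headroom fraction, divided by what a single
    instance can serve, and rounded up. At least one instance is kept.
    """
    padded_load = component.requests_per_second * (1 + headroom)
    needed = padded_load / component.capacity_per_instance
    return max(1, math.ceil(needed))


def scaling_plan(components: List[Component]) -> Dict[str, int]:
    # Each component is sized independently of the others.
    return {c.name: desired_instances(c) for c in components}


components = [
    Component("streaming", requests_per_second=9000, capacity_per_instance=500),
    Component("search", requests_per_second=1400, capacity_per_instance=300),
    Component("signup", requests_per_second=40, capacity_per_instance=200),
]

plan = scaling_plan(components)
print(plan)
```

With these made-up numbers the plan comes out to 22 streaming instances, 6 search instances, and 1 signup instance. In a monolith, by contrast, the whole application would have to be sized for its busiest part.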
In early 2016, Netflix announced the expansion of its service to over 130 new countries. This rapid expansion is due, in no small part, to the very nature of their cloud-based microservices architecture. Although the migration took the better part of a decade, it offered lasting solutions to long-standing issues and has paved the way for other major corporations to follow suit.