gRPC in the Real World: The Kubernetes Container Runtime Interface

In previous installments of this series, we looked at the historical events that led to the creation of gRPC as well as the details that go with programming using gRPC. We discussed the key concepts of the gRPC specification. We took a look at the demonstration application we created especially for this series. And, we examined how to use protoc, the auto-generation tool provided by gRPC, to create boilerplate code in a variety of programming languages to speed gRPC development. We also talked about how to bind statically and dynamically to a protobuf when programming under gRPC. In addition, we created a number of lessons on Katacoda's interactive learning environment that support the concepts and practices we covered in the introductory articles.

Having presented the basics required to understand what gRPC is and how it works, we're now going to do a few installments about how gRPC is used in the real world. In this installment, we're going to look at how gRPC is used by Kubernetes in its Container Runtime Interface (CRI) technology.

But, before we go into the particulars about gRPC and Kubernetes CRI, we need to answer a question asked by many who are coming to gRPC for the first time: Why aren't we seeing that much gRPC on the front-end? It's a question that's been asked by many and well worth answering. So, let's.

Where's gRPC?

Since its release as an open-source project in 2015, gRPC has enjoyed growth in enterprises large and small. Yet, for all its popularity as a server-side technology, gRPC has little presence in public-facing APIs. This is due mostly to two reasons. First, gRPC relies on HTTP/2 as its transport protocol. While the major client-side browsers have supported HTTP/2 since 2015, as of July 2020 less than half of the websites on the Internet supported the protocol on the server side. The traction for using gRPC between clients and web servers just isn't there yet.

The second reason why public-facing adoption of gRPC has been slow to take hold is that clients using a particular gRPC-based API need to have access to the same schema definition used by the server. (The schema definition for a given gRPC API is stored in a protobuf file.)

Having to share a common protobuf file is a significant constraint when compared to an API format such as REST, which uses HTTP/1.1 and requires consuming clients to have no foreknowledge of the data structures provided by the API. With REST, you simply call a URL and some data is returned in a self-describing data format such as JSON, XML, or YAML.
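To make the contrast concrete, here is a minimal Python sketch. The JSON payload and its field names are hypothetical, invented purely for illustration; the point is that a REST client can discover the fields at runtime, while a gRPC client cannot interpret the binary protobuf wire format without being built against the server's .proto file.

```python
import json

# A REST response is self-describing: field names travel with the data,
# so the client needs no schema compiled in ahead of time.
rest_payload = '{"id": 42, "name": "widget", "price": 9.99}'
item = json.loads(rest_payload)
print(item["name"])  # the client discovers fields at runtime

# A gRPC response, by contrast, arrives as binary protobuf. Without the
# same .proto schema the server was built from, the bytes are opaque.
```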

In short, the complexity of gRPC makes it challenging to adopt for standard, commercial websites and public APIs. However, the technology is flourishing on the server side. As Matt Debergalis, CTO at Apollo GraphQL, one of the leading companies that publish tools and servers for GraphQL, says:

"A lot of customers build the data graph on top of gRPC. In a typical company, you've got hundreds of services now, and gRPC is the best technology for the APIs for those microservices because it's so focused; it's so efficient. It's designed for that "inside the data center" use case, but it's not the right technology for connecting to your applications."

So there you have it. gRPC is indeed used a lot, but for the most part it's hidden from public view. It's used to facilitate lightning fast, efficient communication between backend services on the server side, and very often in cases where datacenter resources are auto-scaling up and down in response to loads that are fluctuating in real-time.

And, one of the prime examples of how gRPC is put to use in the real world is in the Kubernetes Container Runtime Interface (K8S CRI), a technology that's virtually synonymous with such auto-scaling. One of the key features of Kubernetes is container orchestration. The K8S CRI is a critical component for managing containers under Kubernetes. And, gRPC is woven into the fabric of the orchestration technology. Let's take a look.

Kubernetes: Using gRPC in the Container Runtime Interface

In order to understand how gRPC is used as the communication mechanism for the Container Runtime Interface (CRI) you need to have a high-level understanding of how Kubernetes works, particularly about the role that containers play in its architecture.

Kubernetes is a service management and container orchestration technology that's intended to support distributed applications that run at web-scale. The essential logic behind the Kubernetes architecture is that an application's or API's functionality is represented in Kubernetes by a resource called a service. A service is an abstraction of the application on the network. The actual logic that the given service represents resides in another abstract resource called a pod.

Understanding Kubernetes Services and Pods

Figure 1 below shows an example of three services that exist in an application. One service provides access functionality. Another provides catalog information and the third provides buy functionality. Each of these services can be identified on the network by an IP address or DNS name. Hence, consumers using the application will call the service on the network accordingly. However, the service has no functionality of its own. Rather a service's functionality is provided by logic that resides in one or many pods to which a service is bound. (To learn more about Kubernetes service binding take a look at ProgrammableWeb's interactive lesson on Katacoda found here.)

Figure 1: In Kubernetes, application logic resides in pods that are represented on the network by services

As mentioned above, a pod is an abstract resource. A pod is an organizational unit for hosting Linux containers. A container is a mechanism for encapsulating and isolating a process that executes programming logic. (See Figure 2, below.)

Figure 2: A pod is an abstract organization unit for hosting one or many Linux containers

Examples of processes that run in a container are web servers, message brokers, databases, and other types of executable binaries. A pod can host one or many containers in which each container's functionality is unique. In other words, it's quite possible to have a pod that hosts both a web-server container and a database container. However, be advised that configuring a pod is not a matter of just including a random number of containers to host. Defining the structure of a pod with multiple containers is a complex undertaking that requires experience in implementing Kubernetes architectures.

The important thing to know is this: In Kubernetes, a service represents functionality to the network. That functionality resides in pods. The implementation of the functionality in a given pod is executed in the containers that are hosted in the pod.
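The service-to-pod-to-container relationship described above can be sketched as a simple object model. The classes and names below are illustrative only; they are not actual Kubernetes API types.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Container:
    # A container encapsulates one process, e.g. a web server or a database.
    image: str

@dataclass
class Pod:
    # A pod is an organizational unit hosting one or many containers.
    containers: List[Container] = field(default_factory=list)

@dataclass
class Service:
    # A service is the network-facing abstraction; its functionality
    # actually lives in the pods it is bound to.
    name: str
    pods: List[Pod] = field(default_factory=list)

# A pod hosting both a web-server container and a database container,
# bound to a "catalog" service.
catalog = Service(
    name="catalog",
    pods=[Pod(containers=[Container(image="web-server"),
                          Container(image="database")])],
)
print(len(catalog.pods[0].containers))  # 2
```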

Which brings us to containers. Containers do not appear in Kubernetes by magic. They need to be made and they need to be made in an ephemeral manner. Kubernetes is a dynamic technology. It can scale its resources up and down to meet the needs of the moment. This includes creating and destroying containers on demand.

Guaranteeing the state of a container

There is an abstract resource in Kubernetes called a deployment. A deployment's job is to guarantee that all the containers that are supposed to be running in the cluster are indeed running. This is important because Kubernetes guarantees that the state defined for the cluster will always be maintained.

State guarantee is a very powerful feature of Kubernetes. Also, as mentioned above, it requires a good deal of control over container management.
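At its heart, this state guarantee is a reconciliation loop: compare the desired state against the actual state and correct the difference. The sketch below is a deliberate simplification; a real Kubernetes controller reconciles pods through the API server rather than manipulating containers directly.

```python
from typing import List

def reconcile(desired: int, actual: List[str]) -> List[str]:
    """Bring the actual set of containers to the desired count."""
    actual = list(actual)
    while len(actual) < desired:
        actual.append("container")   # ask the runtime to create a container
    while len(actual) > desired:
        actual.pop()                 # ask the runtime to destroy a container
    return actual

# One container is running but three are desired, so two get created.
state = reconcile(desired=3, actual=["container"])
print(len(state))  # 3
```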

This is where the Container Runtime Interface comes into play. While Kubernetes is in charge of binding services to pods and also guaranteeing that the pods that are supposed to be running are, indeed running, it's the container runtime that does the work of actually making the container(s) that a pod needs.

The mechanics of container realization

Before we go into the container runtime and the role that gRPC plays in the container realization process, it's useful to know the mechanics behind container realization.

In Kubernetes, a machine (virtual or physical) is called a node. A Kubernetes cluster is composed of a controller node that controls the activities in an array of constituent worker nodes. In short, the controller node is the boss and the worker nodes do the work. (See Figure 3, below.)

Figure 3: The organizational hierarchy of a Kubernetes Cluster

One of the activities that the controller node coordinates with the worker nodes is the creation and destruction of the Linux containers associated with pods.

Each worker node in a Kubernetes cluster has an agent named kubelet. You can think of kubelet as the node's foreman. It takes orders from the control plane to do some work on its node and then makes sure that the work gets done. One of kubelet's jobs is to create and destroy containers on its worker node.

However, kubelet does not do this work itself. (Remember, kubelet is a foreman.) Rather, it tells the Container Runtime Interface (CRI) to do the work. (See Figure 4, below.)

Figure 4: The kubelet instance running in each Kubernetes worker node tells the CRI to create containers in response to a notification from the API server running on the Kubernetes Controller node

gRPC and the CRI

The way that kubelet tells the CRI what to do is by interacting with a gRPC server that's embedded within the CRI. (See Figure 5, below.)

Figure 5: kubelet interacts with the Container Runtime Interface using gRPC to create and destroy containers on a worker node

When it's time to create or destroy a container on a node, kubelet sends a message to the gRPC server running on the node's CRI instance to do the deed. The CRI then interacts with the container runtime engine installed on the worker node to do what's necessary.

For example, when kubelet wants to create a container, it uses its gRPC client to send a CreateContainerRequest message to the RPC (remote procedure call) function CreateContainer() that's hosted on the CRI component. The CreateContainer function and CreateContainerRequest are shown below in Listing 1.

// CreateContainer creates a new container in specified PodSandbox
rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}

message CreateContainerRequest {
    // ID of the PodSandbox in which the container should be created.
    string pod_sandbox_id = 1;
    // Config of the container.
    ContainerConfig config = 2;
    // Config of the PodSandbox. This is the same config that was passed
    // to RunPodSandboxRequest to create the PodSandbox. It is passed again
    // here just for easy reference. The PodSandboxConfig is immutable and
    // remains the same throughout the lifetime of the pod.
    PodSandboxConfig sandbox_config = 3;
}

Listing 1: The gRPC function and message type used to create a container using the Kubernetes Container Runtime Interface

The CRI, in turn, sends the creation request to the actual container runtime installed on the node. The container runtime creates the container.

Kubernetes allows you to install one of a variety of container runtimes on a node. You can install the tried-and-true Docker runtime, but there are other runtimes that can be installed, for example, containerd, rkt, or CRI-O (pronounced cree-oh). (Choosing the container runtime best suited to the given Kubernetes installation provides an added degree of flexibility when customizing a cluster.)

Once container creation is completed, the CRI returns a CreateContainerResponse message as defined in the protobuf file that is shared by both gRPC client and server. The definition of CreateContainerResponse is shown below in Listing 2.

message CreateContainerResponse {
    // ID of the created container.
    string container_id = 1;
}

Listing 2: The CRI gRPC server returns a CreateContainerResponse message that has the unique identifier of the container created.
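Putting Listings 1 and 2 together, the round trip can be sketched as follows. This sketch models the protobuf messages as plain Python classes and stubs out the runtime entirely; a real kubelet makes this call through generated gRPC stubs, and the `config` and `sandbox_config` dictionaries here merely stand in for the actual ContainerConfig and PodSandboxConfig message types.

```python
import uuid
from dataclasses import dataclass

@dataclass
class CreateContainerRequest:
    pod_sandbox_id: str   # ID of the PodSandbox hosting the new container
    config: dict          # stands in for the ContainerConfig message
    sandbox_config: dict  # stands in for the PodSandboxConfig message

@dataclass
class CreateContainerResponse:
    container_id: str     # unique ID of the created container

def create_container(request: CreateContainerRequest) -> CreateContainerResponse:
    # A real CRI implementation would hand the request to the container
    # runtime (containerd, CRI-O, ...) and return the runtime's container ID.
    return CreateContainerResponse(container_id=uuid.uuid4().hex)

# kubelet's side of the exchange: build the request, receive the response.
req = CreateContainerRequest(
    pod_sandbox_id="sandbox-123",
    config={"image": "nginx:latest"},
    sandbox_config={},
)
resp = create_container(req)
print(resp.container_id)
```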

Creating and destroying containers are but two of the activities executed from the Container Runtime Interface. There are others, such as stopping a container, starting it up again, listing the containers in a pod, and updating a container's configuration information, to name a few.

gRPC drives all the message exchanges that happen between kubelet and the CRI. Keep in mind that exchanging messages between kubelet and the CRI needs to happen lightning fast, often in a matter of milliseconds or less. A typical Kubernetes cluster running at web-scale can have tens of thousands of containers running among dozens, maybe hundreds, of nodes. Hence, speed and efficiency are critical in the communication pipeline. gRPC fits the bill and then some.

Putting It All Together

When it comes to using gRPC in the real world, Kubernetes is the 800-lb. gorilla. Both Kubernetes and gRPC started out at Google, so it's natural that both technologies should loom large on the technology landscape. But, as mentioned at the beginning of this piece, adoption is growing steadily, and many other companies use gRPC in their tech stacks. There's a reason: gRPC is fast, efficient, and reliable. Mission-critical applications that drive a data center need this type of power. As mentioned above, gRPC fits the bill and then some.

Be sure to read the next API Design article: How Kubernetes Exemplifies A Truly API Driven Application