GraphQL APIs for Everyone: An In-Depth Tutorial on How GraphQL Works and Why It's Special

This is Part 2 of the ProgrammableWeb API University Guide to GraphQL: Understanding, Building and Using GraphQL APIs.

In the first installment of this series we introduced you to GraphQL from a historical perspective. We looked at GraphQL in terms of the evolution of the internet, from the early days of publishing web pages under static HTML, then onto dynamically data-driven web pages and from there to desktop and mobile applications that used APIs as the primary way to work with web-based data. Finally, we talked about the arrival of GraphQL, first created by Facebook and then introduced to the worldwide programming community.

At a high level, GraphQL was created to allow web programmers working in both browser-based and native mobile formats to write queries that define exactly the data needed from an extensive graph of data in a single response. Also, whereas other APIs, particularly those that are RESTful, require executing queries over multiple trips to the network in order to create a dataset that is complete and useful to the consuming application, creating rich datasets in GraphQL can be accomplished in a single query.

GraphQL is concise and powerful. Also, its use of the object graph as its underlying data framework makes it very adaptable to the semantic web's promise of unifying all the data available on the internet in a meaningful, navigable way.

Now that we've laid the general groundwork for understanding GraphQL, it's time to take a closer look at the details of working with the technology. We're going to cover the basics of the GraphQL query language. We'll look at the root operations supported by GraphQL: query, mutation, and subscription. We'll cover GraphQL's schema discovery feature, introspection. Finally, we'll look at how GraphQL implements the spirit of object inheritance using interfaces and the extend keyword. Throughout this series and for the sake of consistency, wherever examples are needed to help make a point, we stick to the same fundamental use case involving movies, actors, and a simple graph that they could be a part of.

Understanding The GraphQL Query Language

The structure of the GraphQL query language is a declarative format that looks something like a cross between JSON and Python. The query language uses the curly bracket syntax to define a set of fields within an object (aka entity). But, unlike the way JSON uses commas to delimit a field, a GraphQL query uses line breaks or white spaces. Listing 1 below shows an example of a GraphQL query and the result of that query.

Query
{
  movie(id: "6fceee97-6b03-4758-a429-2d5b6746e24e"){
    title
    releaseDate
    directors{
      firstName
      lastName
      dob
    }
    actors{
      firstName
      lastName
      roles{
        character
      }
    }
  }
}

Result
{
  "data": {
    "movie": {
      "title": "The Man Who Fell to Earth",
      "releaseDate": "1976-04-18",
      "directors": [
        {
          "firstName": "Nicholas",
          "lastName": "Roeg",
          "dob": "1928-08-15"
        }
      ],
      "actors": [
        {
          "firstName": "David",
          "lastName": "Bowie",
          "roles": [
            {
              "character": "Thomas Jerome Newton"
            }
          ]
        },
        {
          "firstName": "Rip",
          "lastName": "Torn",
          "roles": [
            {
              "character": "Nathan Bryce"
            }
          ]
        },
        {
          "firstName": "Candy",
          "lastName": "Clark",
          "roles": [
            {
              "character": "Mary-Lou"
            }
          ]
        },
        {
          "firstName": "Buck",
          "lastName": "Henry",
          "roles": [
            {
              "character": "Oliver Farnsworth"
            }
          ]
        }
      ]
    }
  }
}

Listing 1: The GraphQL query (shown first) and the result it returns (shown second)

The meaning behind the query in Listing 1 is as follows: "Show me information about a movie according to the unique id, 6fceee97-6b03-4758-a429-2d5b6746e24e. The information to return is the movie title and release date. Also show me the directors of the movie, according to firstName, lastName, and dob. And, return the collection of actors in the movie according to the firstName, lastName and the role or roles the actor played."

The result of the query in Listing 1 is shown immediately after the query in the listing.

The nice thing about GraphQL is that the syntax requires no special knowledge other than understanding how to define query parameters and how to organize the fields to display in a given data entity and its subordinates. There are no special keywords such as SELECT, FROM, GROUP BY, or JOIN that you typically find in SQL. GraphQL is simply about defining the data you want from an object graph according to a parent object and its subordinates. Let's take a closer look at the details.

Applying a GraphQL Query to an Object Graph

Object graphs are similar to relational databases in that they both describe relationships between entities. However, relational databases describe entity relationships by using SQL queries to join tables of data together according to a key field while also describing the columns to display from each table. Thus, SQL queries can be quite long and complicated.

Figure 1 below shows a traditional approach to defining relationships in a relational database.

Figure 1: A traditional way to describe one-to-many relationships in a relational database

In order to determine a movie's director(s) according to the tables in a relational database as shown above, we'd need to write a SQL query that joins the Person and MovieDirector tables to the Movie table. And to determine the actors in a movie, we'd need to join the Person table to the MovieActor table, join the MovieActor table to the MovieRole table and then join the MovieRole table to the Movie table. That's a lot of joining.
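To make the amount of joining concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names come from the paragraph above, but every column name (movieId, personId, movieActorId, characterName) is an assumption made for illustration, since the columns in Figure 1 are not reproduced here.

```python
import sqlite3

# In-memory database; all column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
CREATE TABLE Movie (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE MovieDirector (movieId INTEGER, personId INTEGER);
CREATE TABLE MovieActor (id INTEGER PRIMARY KEY, personId INTEGER);
CREATE TABLE MovieRole (movieId INTEGER, movieActorId INTEGER, characterName TEXT);

INSERT INTO Person VALUES (1, 'Nicholas', 'Roeg'), (2, 'David', 'Bowie');
INSERT INTO Movie VALUES (1, 'The Man Who Fell to Earth');
INSERT INTO MovieDirector VALUES (1, 1);
INSERT INTO MovieActor VALUES (1, 2);
INSERT INTO MovieRole VALUES (1, 1, 'Thomas Jerome Newton');
""")

# Directors: Movie -> MovieDirector -> Person (two joins).
directors = conn.execute("""
    SELECT p.firstName, p.lastName
    FROM Movie m
    JOIN MovieDirector md ON md.movieId = m.id
    JOIN Person p ON p.id = md.personId
    WHERE m.id = ?
""", (1,)).fetchall()

# Actors and roles: Movie -> MovieRole -> MovieActor -> Person (three joins).
actors = conn.execute("""
    SELECT p.firstName, p.lastName, mr.characterName
    FROM Movie m
    JOIN MovieRole mr ON mr.movieId = m.id
    JOIN MovieActor ma ON ma.id = mr.movieActorId
    JOIN Person p ON p.id = ma.personId
    WHERE m.id = ?
""", (1,)).fetchall()
```

Even in this small sketch, two separate statements and five joins are needed to gather what the single GraphQL query in Listing 1 asks for.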

The GraphQL query language takes a simpler approach. Take a look at Figure 2, which is an object graph that describes movies, actors, directors and roles.

Figure 2: An object graph that describes the relationships between movies, actors, roles and directors

Each entity (Director, Actor, Movie, and Role) is a node. The relationship between two nodes, as shown in Figure 2, is an edge. (Node and edge are terms used in discrete mathematics to describe the parts of a graph.) In this case there is only one kind of edge, has, which describes a single relationship.
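As a rough illustration of nodes and edges, the sketch below models a tiny piece of such a graph in Python. The node ids (m1, d1, a1, r1), the direction of each edge, and the neighbors helper are all assumptions made for this example; they are not taken from Figure 2.

```python
# Nodes keyed by a hypothetical short id; each one records its entity type.
nodes = {
    "m1": {"type": "Movie", "title": "The Man Who Fell to Earth"},
    "d1": {"type": "Director", "lastName": "Roeg"},
    "a1": {"type": "Actor", "lastName": "Bowie"},
    "r1": {"type": "Role", "character": "Thomas Jerome Newton"},
}

# Every edge uses the single relationship "has": (source, relationship, target).
edges = [
    ("m1", "has", "d1"),
    ("m1", "has", "a1"),
    ("a1", "has", "r1"),
]

def neighbors(node_id, node_type):
    """Return the nodes of the given type that node_id has an edge to."""
    return [
        nodes[target]
        for source, relationship, target in edges
        if source == node_id
        and relationship == "has"
        and nodes[target]["type"] == node_type
    ]
```

Calling neighbors("m1", "Actor") walks from the movie to its actors, and neighbors("a1", "Role") walks from an actor to the roles played, which is the same path a nested GraphQL query follows.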

The GraphQL query we'd write to display the title, releaseDate, directors, actors and roles for each actor for a given movie, according to a unique identifier, is shown below. It is the same query that appears in Listing 1.

{
  movie(id: "6fceee97-6b03-4758-a429-2d5b6746e24e"){
    title
    releaseDate
    directors{
      firstName
      lastName
      dob
    }
    actors{
      firstName
      lastName
      roles{
        character
      }
    }
  }
}

(Please be advised that using the name id for the movie identifier parameter is conventional, yet arbitrary. The unique identifier for the movie could just as easily have been named movieId. It all depends on the pattern the system designer chooses for naming the unique identifier of an object.)

The query shown above will return all the directors and actors for the given movie from the underlying data storage technology used by the GraphQL API. Notice that the fields for each implicit director in the directors collection of a movie are defined within a set of curly brackets. Also, notice that the fields for each implicit actor in the actors collection are also defined between curly brackets. Finally, notice that the field roles is also a collection that is part of each actor, implicitly. (Actors can play a number of roles in a single movie, as did Peter Sellers in the movie, Dr. Strangelove.) The query displays the character field for each role the actor played in the movie, again defining the field to display between the curly brackets associated with the collection.

As we've mentioned a number of times, the GraphQL query language allows you to easily declare exactly the fields for the data you want to show. Thus, in order to see the particular movie according to title and releaseDate, we write:

{
  movie(id: "6fceee97-6b03-4758-a429-2d5b6746e24e"){
    title
    releaseDate
  }
}

The query above will return only the title and release date for the movie according to the id, 6fceee97-6b03-4758-a429-2d5b6746e24e. (In the GraphQL query language, query parameters are defined within parentheses.) Should we want to see the title and releaseDate of all the movies on record, we'll write the following GraphQL query:

{
  movies {
    title
    releaseDate
  }
}

If we wanted to see only the title and the actors by first and last name for a particular movie, we write:

{
  movie(id: "6fceee97-6b03-4758-a429-2d5b6746e24e"){
    title
    actors{
      firstName
      lastName
    }
  }
}

Should we want to see the title and releaseDate for all the movies in the datastore, along with all the directors for each movie, displaying the firstName, lastName and dob of each director returned, the GraphQL query we write is:

{
  movies {
    title
    releaseDate
    directors {
      firstName
      lastName
      dob
    }
  }
}

The important thing to understand is that GraphQL allows developers to define queries according to the nodes in the underlying object graph. Also, the developer can define exactly the fields to display per particular node or collection of nodes.
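The idea that the response mirrors the shape of the query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the project function and the nested-dictionary encoding of a selection are hypothetical, and a real GraphQL server gathers each field through resolvers rather than by filtering a prebuilt dictionary.

```python
def project(value, selection):
    """Keep only the selected fields, recursing into nested objects and lists."""
    if isinstance(value, list):
        return [project(item, selection) for item in value]
    result = {}
    for field, sub_selection in selection.items():
        if sub_selection is None:
            # Leaf field: copy the value as-is.
            result[field] = value[field]
        else:
            # Nested selection: apply it to the child object or collection.
            result[field] = project(value[field], sub_selection)
    return result

movie = {
    "title": "The Man Who Fell to Earth",
    "releaseDate": "1976-04-18",
    "actors": [
        {"firstName": "David", "lastName": "Bowie",
         "roles": [{"character": "Thomas Jerome Newton"}]},
        {"firstName": "Rip", "lastName": "Torn",
         "roles": [{"character": "Nathan Bryce"}]},
    ],
}

# Mirrors the query: { movie(id: ...) { title actors { firstName lastName } } }
selection = {"title": None, "actors": {"firstName": None, "lastName": None}}
result = project(movie, selection)
```

Here result contains the title and each actor's first and last name, while releaseDate and roles are left out because the selection never asked for them.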

Another important thing to know is that, while we used a singular and plural naming convention to distinguish an object from a collection of objects (movie vs. movies), query naming in GraphQL is custom and arbitrary. Naming a query does not magically implement behavior. Query behavior must be created in the API implementation. (We'll discuss how to define and implement queries later in this article.)

In a real-world implementation of GraphQL, a developer will need to declare a query named movie and a query named movies. That developer will also need to define the parameters that go with each query, if any, as well as the structure of the data returned by the query. Finally, the developer will need to program the behavior that the query requires. This is a lot of work, but it is not a make-it-up-as-you-go-along endeavor.

The GraphQL specification describes exactly how to define queries along with the data structures they return. Also, GraphQL defines the mechanism for implementing a query's behavior. This mechanism is called the resolver. A resolver is a function that gets written in the particular language of the GraphQL implementation. For example, given that the Apollo GraphQL solution is written in Node.js, resolvers intended to work with an Apollo-based GraphQL API would be written in Node.js as well. Figure 3 below shows the relationship between the query, movies, and the resolver that implements the behavior for the query.

Figure 3: A resolver is a function that provides the behavior for a given query
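The relationship between a query name and its resolver can be sketched as a simple lookup table of functions. The example is written in Python purely for illustration (resolvers for an Apollo-based API would be written for Node.js), and the MOVIES list, the resolve_movies and resolve_movie functions, the execute helper, and the second movie id are all hypothetical names invented for this sketch.

```python
# Hypothetical in-memory datastore standing in for a real database.
MOVIES = [
    {"id": "6fceee97-6b03-4758-a429-2d5b6746e24e",
     "title": "The Man Who Fell to Earth"},
    {"id": "hypothetical-id-2", "title": "Dr. Strangelove"},
]

def resolve_movies():
    """Behavior for the query named movies: return every movie."""
    return MOVIES

def resolve_movie(id):
    """Behavior for the query named movie: look up one movie by its id."""
    return next((movie for movie in MOVIES if movie["id"] == id), None)

# A query name has behavior only because it is mapped to a function here.
resolvers = {"movies": resolve_movies, "movie": resolve_movie}

def execute(query_name, **arguments):
    """Find the resolver registered for query_name and call it."""
    return resolvers[query_name](**arguments)
```

With this mapping, execute("movies") returns the whole collection and execute("movie", id="...") returns a single movie. Renaming a query without updating the mapping would leave it with no behavior at all, which is the point made above about names not implementing anything by themselves.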

Queries and resolvers, as well as mutations, subscriptions, and custom types are all part of the GraphQL type system and the root operation types, which is what we'll look at next.

Representing Data Models as Object Types

The way that data structures are defined in GraphQL is according to an object type. Object types are described using a description format special to GraphQL. The structure of the format is as follows:

type TypeName {
  fieldName: fieldType
  fieldName: fieldType
  fieldName: fieldType
}

The following describes the various parts of the type declaration structure:

type is a GraphQL reserved word

TypeName is the name of the type. This name can be a GraphQL operation such as Query, Mutation or Subscription. Also, similar to a JSON object, the TypeName can name a custom object type, for example Actor or Movie.

fieldName is the name of a field in the object (for example id, firstName or lastName). If the containing type of the fieldName is a Query, each fieldName will describe a particular query published by the API. If the containing type is a Mutation, each fieldName will describe a mutation published by the API. If the containing type is a Subscription, each fieldName will describe the behavior for evented message transmission to external parties subscribed to the event.

fieldType is the type of the value the field holds or returns. It can be a scalar type, an object type, or an array of either, as described next.

In addition to supporting specific and custom type objects, the GraphQL specification supports the scalar types, String, Int, Float, Boolean and ID. ID denotes a unique identifier. Also, the specification supports arrays of scalar values and object types. For example, [String] indicates an array of the scalar value, String. [Actor] indicates an array of the custom object type, Actor. Please note that in GraphQL an array is defined by putting a type or scalar value between opening and closing square brackets.

All values in GraphQL are declared explicitly according to type. GraphQL does not support implicit type declaration. This is a key difference between GraphQL and HTTP-based RESTful APIs. Whereas RESTful HTTP APIs allow for payloads to be serialized according to formats supported by HTTP (some of which allow implicit typing), GraphQL payloads are almost always serialized as JSON (the specification does not mandate a serialization format, but JSON is by far the most common), and explicit typing is required.

Listing 2 below shows a declaration of the custom object type, Person.

type Person {
   id: ID
   firstName: String
   lastName: String
   dob: Date
   knowsConnection: [Person]
   likesConnection: [Person]
   marriedToConnection: [Person]
   divorcedFromConnection: [Person]
}

Listing 2: The type, Person is an example of the custom object type described in GraphQL's type definition format

Let's take a look at the details of the declaration of the type, Person. The type, Person publishes eight fields: id, firstName, lastName, dob, knowsConnection, likesConnection, marriedToConnection, and divorcedFromConnection. The field, id is of type, ID. ID is a built-in scalar type special to GraphQL. ID is intended to describe a unique identifier. An ID is serialized as a string, but it is not intended to be human readable; a UUID is a common choice. GraphQL implementations such as Apollo are not obligated to auto-generate unique IDs for the ID field. That responsibility lies with the API provider.

The fields, firstName and lastName are of scalar type String. String is another one of the types built into GraphQL. The field, dob is of type Date. Date is a custom scalar. GraphQL allows you to define custom scalar types. Custom scalars are useful in situations in which a single value with special validation and parsing rules needs to be supported. The fields knowsConnection, likesConnection, marriedToConnection, and divorcedFromConnection are arrays of Person types, as denoted by the square brackets.
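To give a feel for the "special validation and parsing rules" behind a custom scalar such as Date, here is a minimal Python sketch. It assumes dates travel as ISO 8601 strings such as "1928-08-15", which matches the sample data in Listing 1 but is not stated by the article; the parse_date and serialize_date names are hypothetical and do not belong to any GraphQL library.

```python
from datetime import date

def parse_date(value):
    """Turn an incoming string such as "1928-08-15" into a date.

    Raises ValueError when the value is not a string or not a valid ISO date.
    """
    if not isinstance(value, str):
        raise ValueError("Date must be supplied as a string")
    return date.fromisoformat(value)

def serialize_date(value):
    """Turn a date back into the string form sent in a response."""
    return value.isoformat()
```

A GraphQL implementation would call functions like these at the edges of the API: one when a Date arrives as an argument, the other when a Date field such as dob is written into the result.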

The concept of connections is one that is evolving in GraphQL. Conceptually you can think of a connection as an association between two objects in an object graph. (The term, edge is used in discrete mathematics to indicate a connection between two nodes.) A convention is developing among GraphQL developers in which a category of an edge that exists between two nodes is called a connection, with the naming convention being, categoryConnection, hence knowsConnection, indicating that the connection between two nodes is that one node knows the other.

We're going to take an in-depth look at connections, as well as pagination techniques for controlling large lists associated with a connection, in Part 3 of this series, How To Design, Launch, and Query a GraphQL API Using Apollo Server.

GraphQL Operations and Resolvers

Whereas a RESTful API's operations depend on the underlying protocol's verbs (e.g., HTTP and its verbs such as GET, PUT, and POST), GraphQL eschews HTTP's command set and supports three root operation types: Query, Mutation, and Subscription. The sections that follow provide examples of these root operation types both in terms of declaration and execution using the GraphQL query language.

Query

A Query is, as the name implies, an operation type that has fields that describe how to get data from a GraphQL API. For those who are familiar with HTTP-based APIs, a GraphQL query most closely correlates to an HTTP GET. Listing 3 below shows an example of a Query type declared in GraphQL's type definition format.

type Query {
   persons: [Person]
   person(id: ID!): Person
   movies: [Movie]
   movie(id: ID!): Movie
   triples: [Triple]
   triplesByPredicate(predicate: Predicate!): [Triple]
}

Listing 3: Each field in a Query type describes a query for getting data from the GraphQL API.

You'll notice that in Listing 3 above there are a number of fields defined within the type, Query. Each field in the Query operation type describes a particular query supported by the API. The field persons defines a query literally named persons that returns an array of Person objects. (Again, an array is indicated by the square brackets.) The field person(id: ID!) indicates a query named person that has a parameter, id of type ID. The exclamation symbol means that a value must be provided for the parameter. The query will use the value assigned to id to do a lookup for the particular person. (Please be advised that naming the unique identifier parameter id is a matter of convention used by developers implementing GraphQL types. That the field name for the unique identifier happens to be a lowercase version of the name of the GraphQL scalar type, ID, is purely coincidental.)

Defining query parameters in the way shown above is part of the GraphQL specification. Later on, in Part 3 of this series, we'll take a look at how the Apollo Server implementation of GraphQL passes query parameter values onto actual query behavior. The important thing to understand now is that you declare parameters by name and type within a set of parentheses in the field definition of a particular query.
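The effect of the exclamation symbol can be illustrated with a small Python sketch that checks supplied arguments against declared ones. It is a simplification under stated assumptions: the QUERY_ARGUMENTS table and the check_arguments function are hypothetical, and a real GraphQL implementation performs this validation for you from the schema.

```python
# Hypothetical summary of the arguments declared in Listing 3.
# A trailing "!" marks an argument as required, as in the type definition format.
QUERY_ARGUMENTS = {
    "persons": {},
    "person": {"id": "ID!"},
    "movies": {},
    "movie": {"id": "ID!"},
}

def check_arguments(query_name, supplied):
    """Raise ValueError for unknown arguments or missing required ones."""
    declared = QUERY_ARGUMENTS[query_name]
    for name in supplied:
        if name not in declared:
            raise ValueError(f"Unknown argument '{name}' on query '{query_name}'")
    for name, type_name in declared.items():
        if type_name.endswith("!") and supplied.get(name) is None:
            raise ValueError(f"Argument '{name}' of type {type_name} is required")
    return True
```

Under this sketch, check_arguments("movie", {"id": "abc"}) passes, while check_arguments("movie", {}) raises an error because id is declared as ID! and no value was supplied.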

As you can see, the pattern for declaring a query that returns an array and a query that returns an object also applies to other fields in the Query definition. The query movies returns an array of Movie objects. The query movie(id: ID!) returns a particular Movie.

However, notice that while the query, triples supports the "plural" pattern by returning an array of Triple objects, the query triplesByPredicate(predicate: Predicate!) is different for two reasons. (A Triple is a custom object we created for the demonstration application that accompanies this series. Triple is not a reserved keyword in GraphQL.) First, the name of the query triplesByPredicate differs from the convention we've seen thus far. The usual pattern for query naming is plural and singular according to type: movies and movie, for example. Yet, triplesByPredicate violates this convention. This is OK because there will come a time when some queries will need to be quite specific. There is nothing in GraphQL that dictates how queries need to be named. The plural/singular pattern is conventional.

Continue on page 2.
