GraphQL (GQL) is a data query language commonly used in modern web and mobile applications as a key part of the technology stack. GQL simplifies fetching data from a server to a client via an API call. This article recaps a post by carvesystems.com, covering the five most common GraphQL vulnerabilities, how to use a GQL “goat” (a deliberately vulnerable demo application) to demonstrate them, and some tooling for evaluating GQL implementations.
Understanding a little more about what GQL is will help clarify the concepts in this article. CarveSystems contributor Aiden elaborates:
"GraphQL is a standardized language for describing and making queries to APIs. Originally built by Facebook in 2015 for use in their mobile applications, GraphQL provides a number of benefits to application developers when compared to a traditional REST API":
- "Client applications are able to request only the information they need, minimizing the amount of data sent."
- "GraphQL allows for more complicated queries to be represented, reducing the number of API requests that must be made."
- "All input data is type-checked against a schema defined by the developer, assisting with data validation."
The benefits of GQL come with a corresponding level of complexity and associated security vulnerabilities.
Explore a Demo API
To illustrate these vulnerabilities, Carve has built a sample API with the issues deliberately built in. It is a minimal notes service that lets API clients create their own public and private notes and see other users' public posts. Carve publishes the source code for the sample API. You can run your own instance locally with Node, or with a Docker container (available as carvesystems/vulnerable-graphql-api on Docker Hub). "The demo application exposes a webserver with an instance of the GraphiQL IDE for experimentation, which is available on port 3000."
Vulnerability 1: Inconsistent Authorization Checks
The most commonly found issue in GraphQL-based applications is flawed authorization logic. Carve elaborates, “While GraphQL helps implement proper data validation, API developers are left on their own to implement authentication and authorization methods on top. Worse, the “layers” of resolvers typical to a GraphQL API make doing this properly more complicated – authorization checks have to be present not only at the query-level resolvers, but also for resolvers that load additional data (for example, to load all of the posts for a given user).”
Authorization flaws in GQL APIs usually come in one of two forms. Carve explains, “The first, and more common, occurs when authorization functionality is handled directly by resolvers at the GraphQL API layer. When this is done, authorization checks must be performed separately in each location, and any instance where this is forgotten could lead to an exploitable authorization flaw. The likelihood of this occurring increases as the complexity of the API schema increases, and there are more distinct resolvers responsible for controlling access to the same data.”
The demo API created by Carve offers several ways to retrieve Post objects: a client can retrieve a list of users and then all their public posts, or simply retrieve a post by its numeric ID. One flaw in the demo API is that retrieving a post by ID performs no authorization checks. Flaws of this kind are simple and commonly found in real-world GQL deployments.
GraphQL design guides recommend performing authorization in the business-logic layer rather than in individual resolvers, so that the same constraints are applied consistently no matter which path a client takes to reach the data.
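As a minimal sketch of that advice, the Python below keeps the "may this viewer read this post?" decision in one service class, and both retrieval paths delegate to it. The names `Post`, `PostService`, `resolve_post`, and `resolve_user_posts` are hypothetical and are not taken from Carve's demo API; a real application would wire the resolver functions into whatever GraphQL library it uses.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Post:
    id: int
    author_id: int
    public: bool
    body: str


class PostService:
    """Hypothetical business-logic layer: the only place that decides who may read a post."""

    def __init__(self, posts: List[Post]) -> None:
        self._posts: Dict[int, Post] = {p.id: p for p in posts}

    def _can_read(self, viewer_id: Optional[int], post: Post) -> bool:
        # Public posts are readable by anyone; private posts only by their author.
        return post.public or post.author_id == viewer_id

    def get_post(self, viewer_id: Optional[int], post_id: int) -> Optional[Post]:
        post = self._posts.get(post_id)
        if post is None or not self._can_read(viewer_id, post):
            return None  # same answer for "missing" and "forbidden"
        return post

    def posts_by_author(self, viewer_id: Optional[int], author_id: int) -> List[Post]:
        return [
            p for p in self._posts.values()
            if p.author_id == author_id and self._can_read(viewer_id, p)
        ]


# Hypothetical resolvers: they contain no authorization logic of their own.
def resolve_post(service: PostService, viewer_id: Optional[int], post_id: int) -> Optional[Post]:
    return service.get_post(viewer_id, post_id)


def resolve_user_posts(service: PostService, viewer_id: Optional[int], author_id: int) -> List[Post]:
    return service.posts_by_author(viewer_id, author_id)
```

Because neither resolver makes its own decision, adding a third way to reach a post cannot silently skip the check, as long as it also goes through the service.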
Vulnerability 2: Flimsy REST Proxy Layers
When an existing REST API is adapted for GraphQL clients through a proxy layer, a query for a user can be “implemented in the GraphQL proxy layer by making a request to GET /api/users/1 on the backend API. If implemented unsafely, an attacker may be able to modify the path or parameters passed to the backend API, presenting a limited form of server-side request forgery.”
Properly URL-encoding and validating parameters passed to another service protects against this kind of vulnerability. Using the GraphQL schema to require a number for the file name would be one way to mitigate it. An alternative is to validate the format of input values. “GraphQL will validate the types for you, but leaves format validation to you. A custom scalar type (for example, a AssetId scalar) could be used to consistently apply any custom validation rules that apply for a commonly-used type.”
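The following Python sketch shows both defenses for the GET /api/users/1 example: the identifier is checked against a strict format, and anything placed in the path is percent-encoded as a single segment. `BACKEND_BASE`, `encode_path_segment`, and `backend_user_url` are hypothetical names chosen for illustration; only `urllib.parse.quote` and `re` come from the standard library.

```python
import re
from urllib.parse import quote

# Hypothetical backend address, for illustration only.
BACKEND_BASE = "http://backend.internal/api"

# ASCII digits only; fullmatch() rejects trailing newlines and partial matches.
_NUMERIC_ID = re.compile(r"[0-9]{1,18}")


def encode_path_segment(value: str) -> str:
    """Percent-encode a value so it stays a single path segment.

    safe="" makes quote() encode "/" as well, which it leaves alone by default.
    """
    return quote(value, safe="")


def backend_user_url(user_id: str) -> str:
    """Build the backend URL for a user, refusing anything that is not a plain number."""
    if not _NUMERIC_ID.fullmatch(user_id):
        raise ValueError("user id must be a non-negative integer")
    return f"{BACKEND_BASE}/users/{encode_path_segment(user_id)}"
```

With this in place, an input such as `1/../admin` is rejected by the format check, and even a value that was only encoded (not validated) would reach the backend as `1%2F..%2Fadmin` rather than as extra path segments.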
Vulnerability 3: Skipping Custom Scalar Validation
Raw data in GQL is represented with a scalar type, and is ultimately passed as input data or returned as output. Carve breaks it down, explaining that there are five built-in scalar types – Int, Float, Boolean, String, and ID (which is really just a string). This basic set of scalar types is sufficient for many simple APIs, but for scenarios where additional raw datatypes are useful, GraphQL includes support for application developers to define their own scalar types. For example, an API might include its own DateTime scalar type, or an extended scalar type that provides extended input validation, such as “odd integers” or “alphanumeric strings.”
A developer who implements a custom scalar type is then responsible for its sanitization and type validation. Carve's demo application illustrates this with the graphql-type-json library.
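To make that responsibility concrete, here is a minimal Python sketch of the parsing functions that the “odd integers” and “alphanumeric strings” scalars mentioned above would need. The function names and the 64-character limit are assumptions made for illustration; how such functions are registered as a scalar depends on the GraphQL library in use.

```python
import re

# Assumed limits for illustration: ASCII letters and digits, 1 to 64 characters.
_ALNUM = re.compile(r"[A-Za-z0-9]{1,64}")


def parse_odd_int(value: object) -> int:
    """Validate input for a hypothetical OddInt scalar."""
    # bool is a subclass of int in Python, so it has to be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("OddInt must be an integer")
    if value % 2 == 0:
        raise ValueError("OddInt must be odd")
    return value


def parse_alphanumeric(value: object) -> str:
    """Validate input for a hypothetical AlphaNumeric scalar."""
    if not isinstance(value, str):
        raise TypeError("AlphaNumeric must be a string")
    if not _ALNUM.fullmatch(value):
        raise ValueError("AlphaNumeric must be 1-64 ASCII letters or digits")
    return value
```

The point of the sketch is that every check here is something the GraphQL type system does not do for a custom scalar: if the parsing function simply returns its input, any value a client sends is accepted.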
Vulnerability 4: Disorganized Rate-Limiting
Implementing rate limiting and other denial-of-service protections for GraphQL APIs is just as complex and difficult. The number of actions a single GQL query triggers is variable by nature, so the server resources it consumes are unpredictable. This is why rate-limiting techniques designed for REST APIs, which typically count requests, are insufficient on their own for GQL APIs.
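One commonly used alternative to counting requests is to estimate the cost of each query before running it and to reject or throttle queries above a budget. The Python sketch below illustrates the idea on a simplified, hand-built representation of a query's selection set; it is not taken from Carve's post, and the dictionary layout, `estimate_cost`, and `within_budget` are all hypothetical. A real implementation would walk the parsed query produced by its GraphQL library.

```python
from typing import Any, Dict

# Hypothetical simplified selection set: field name -> options.
#   "list":   True if the field returns a list
#   "first":  requested page size for list fields
#   "fields": nested selection set
Selection = Dict[str, Dict[str, Any]]


def estimate_cost(selection: Selection, default_page: int = 10) -> int:
    """Estimate how many field resolutions a query could trigger."""
    total = 0
    for field in selection.values():
        children = field.get("fields") or {}
        if not children:
            total += 1  # leaf field: resolved once
            continue
        # A list field resolves its children once per returned item.
        fan_out = field.get("first", default_page) if field.get("list") else 1
        total += 1 + fan_out * estimate_cost(children, default_page)
    return total


def within_budget(selection: Selection, budget: int) -> bool:
    return estimate_cost(selection) <= budget


# Two queries that each arrive as a single HTTP request.
cheap = {"user": {"fields": {"name": {}}}}
expensive = {
    "users": {
        "list": True,
        "first": 100,
        "fields": {
            "name": {},
            "posts": {"list": True, "first": 50, "fields": {"title": {}}},
        },
    }
}
```

Here `estimate_cost(cheap)` is 2 while `estimate_cost(expensive)` is 5201, even though a request-counting limiter would treat the two identically.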
Vulnerability 5: Introspection Feature Unmasks Public Data
Adding hidden features to API endpoints is an appealing prospect for developers who want a bit of functionality kept out of public view. These features might be shielded with admin-access protection, or placed on another API endpoint. Carve advises that “a GraphQL feature called introspection makes discovery of hidden endpoints trivially easy. As part of an effort to be developer-friendly, the introspection feature, which is enabled by default in most GraphQL implementations, allows API clients to dynamically query information about the schema, including documentation and the types for every query and mutation defined in the schema. This is used by development tools, like the GraphiQL IDE, to dynamically retrieve the schema if not provided one.” The demo API developed by Carve further illustrates these ideas with a hidden mutation.
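To show how little effort this discovery takes, the Python sketch below builds a small introspection query that asks for the name of every mutation and extracts the names from a response. The `__schema`, `mutationType`, and `fields` names are part of GraphQL's introspection system; the helper functions are hypothetical, and the sketch assumes the usual convention of sending the query as a JSON body with a `query` key. It performs no network calls.

```python
import json
from typing import Any, Dict, List

# Introspection query listing every mutation defined in the schema.
INTROSPECTION_QUERY = "{ __schema { mutationType { fields { name } } } }"


def build_request_body(query: str) -> str:
    """Serialize a query as the JSON body conventionally sent to a GraphQL endpoint."""
    return json.dumps({"query": query})


def mutation_names(response: Dict[str, Any]) -> List[str]:
    """Pull mutation names out of an introspection response, tolerating missing parts."""
    schema = (response.get("data") or {}).get("__schema") or {}
    mutation_type = schema.get("mutationType") or {}
    return [field["name"] for field in mutation_type.get("fields") or []]
```

If introspection is disabled, or the schema defines no mutations, `mutation_names` returns an empty list; otherwise every mutation appears in the result, whether or not any client-facing documentation mentions it.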
Using a more complex solution like GraphQL comes with more complex problems. That said, it solves many of the fundamental data validation issues commonly found in REST APIs. The vulnerabilities described here are available for exploration in the Carve demo API.