Stream Catalog GraphQL API Usage and Examples on Confluent Cloud

Stream Catalog uses GraphQL under the hood and also exposes the Stream Catalog GraphQL API for use in your deployments.

Overview

The following sections provide an overview of GraphQL and explain how it is used with the Stream Catalog.

What is it?

GraphQL is a query language for APIs. At its core, it enables declarative data fetching, which lets clients specify exactly the data they need from an API. It is an API standard that provides a more efficient, powerful, and flexible alternative to REST. The emphasis is on alternative: GraphQL is not a replacement for REST and normally coexists side by side with it.

Why is it important?

The Confluent Stream Catalog provides a centralized metadata repository for customers in cloud environments. The GraphQL API allows users to take advantage of the graph nature of the Stream Catalog, which is modeled as a graph of entities and relationships, and provides them with a more natural, efficient, and productive way of exploring the catalog.

When to use REST API and when to use GraphQL API

The GraphQL API supports only search, so the choice is between the REST /search API and GraphQL. The only capability that the REST /search API has over GraphQL search is searching by business metadata attributes, which the GraphQL API does not currently support. Otherwise, GraphQL search is preferred because it can search across relationships.

This blog post explains the power of the Stream Catalog GraphQL API in more detail: How to Find, Share, and Organize Your Data Streams with Stream Catalog

Getting started

The Confluent Stream Catalog provides a centralized repository of schemas and other metadata entities within an environment, as well as the relationships between them. When querying for one or more related metadata entities, GraphQL can be used to return all requested metadata entities within a single response.

The Stream Catalog GraphQL API is read-only: it supports queries, but not mutations or subscriptions.

GraphQL endpoint

The Stream Catalog GraphQL endpoint is https://<SR ENDPOINT>/catalog/graphql. This endpoint supports both POST and GET requests per recommended practices for GraphQL implementations.
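
As a minimal sketch of the two request styles, the following Python builds (but does not send) a POST request that carries the query in a JSON body, and an equivalent GET URL that carries it in a query parameter, following the common GraphQL-over-HTTP convention. The host name, the helper names, and the use of HTTP Basic authentication with an API key and secret are assumptions made for illustration; this page does not confirm them.

```python
import base64
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; substitute your own Schema Registry endpoint.
GRAPHQL_URL = "https://sr.example.invalid/catalog/graphql"


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Build an HTTP Basic Authorization header value (assumed auth scheme)."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_post_request(query: str, api_key: str, api_secret: str) -> urllib.request.Request:
    """POST style: the query travels in a JSON body under the "query" key."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": basic_auth_header(api_key, api_secret),
        },
    )


def build_get_url(query: str) -> str:
    """GET style: the query travels URL-encoded in a "query" parameter."""
    return GRAPHQL_URL + "?" + urllib.parse.urlencode({"query": query})
```

Passing the request object to urllib.request.urlopen would send it. GraphQL servers conventionally return a JSON document with the results under a "data" key.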

GraphQL schema

The GraphQL schema can be introspected using any number of GraphQL tools. The schema can also be seen here.
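
For readers without a dedicated GraphQL tool, introspection can also be done with an ordinary query. The sketch below uses the standard __schema introspection field and a small helper (a hypothetical name) that lists the object types found in a response. The sample response is made up to show the expected shape; real output depends on the live schema.

```python
# Standard GraphQL introspection query listing every type in the schema.
INTROSPECTION_QUERY = """
query {
  __schema {
    types {
      name
      kind
    }
  }
}
"""


def object_type_names(response: dict) -> list[str]:
    """Return sorted names of OBJECT types, skipping built-in "__" types."""
    types = response["data"]["__schema"]["types"]
    return sorted(
        t["name"]
        for t in types
        if t["kind"] == "OBJECT" and not t["name"].startswith("__")
    )


# Illustrative response shape only, not real API output.
sample = {
    "data": {"__schema": {"types": [
        {"name": "sr_field", "kind": "OBJECT"},
        {"name": "__Type", "kind": "OBJECT"},
        {"name": "String", "kind": "SCALAR"},
    ]}}
}
print(object_type_names(sample))
```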

Entity queries

Fetch list of entities

You can fetch a single entity or multiple entities of the same type using a simple query.

Example: Fetch a list of fields:

query {
  sr_field {
    name
  }
}

Example: Fetch a list of Unified Stream Manager topic entities:

query {
  usm_kafka_topic {
    name
  }
}

Fetch nested entities using relationships

You can fetch a single entity and its related entities by specifying the desired relationships.

Example: Fetch a list of fields and the name of each field’s record, schema, and subject_version:

query {
  sr_field {
    name
    record {
      name
      schema {
        name
        subject_versions {
          name
        }
      }
    }
  }
}

Example: Fetch a list of subject_version entities and the name of each field in the corresponding schemas.

Tip

This example uses an inline fragment with a type condition of sr_record, since a schema can contain several types besides records.

query {
  sr_subject_version(where: {name: {_starts_with: "my_subject"}}) {
    name
    schema {
      id
      types {
        ... on sr_record {
          name
          fields {
            name
          }
        }
      }
    }
  }
}
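
When a query uses an inline fragment, entries whose type does not match the type condition come back as empty objects, so client code should tolerate them. The following sketch flattens a response to the query above into a mapping from subject version name to field names. The response shape follows standard GraphQL conventions; the function name and the sample values are hypothetical.

```python
def field_names_by_subject(response: dict) -> dict[str, list[str]]:
    """Map each subject version name to the field names of its record types."""
    result: dict[str, list[str]] = {}
    for subject_version in response["data"]["sr_subject_version"]:
        names: list[str] = []
        schema = subject_version.get("schema") or {}
        for schema_type in schema.get("types") or []:
            # Types that are not sr_record did not match the inline
            # fragment, so they arrive as empty objects with no "fields".
            for field in schema_type.get("fields") or []:
                names.append(field["name"])
        result[subject_version["name"]] = names
    return result


# Made-up response for illustration; the second type is a non-record type.
sample = {
    "data": {"sr_subject_version": [
        {
            "name": "my_subject-value",
            "schema": {
                "id": 1,
                "types": [
                    {"name": "Order", "fields": [{"name": "id"}, {"name": "amount"}]},
                    {},
                ],
            },
        }
    ]}
}
print(field_names_by_subject(sample))
```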

Filtering using the “where” argument

You can use the where argument to filter results based on some of an entity’s attributes. You can combine filters using the _and/_or operators.

Example: Fetch the field whose name is “field1”:

query {
  sr_field(where: {name: {_eq: "field1"}}) {
    name
    createTime
  }
}

Example: Fetch the field whose name is “field1” and schema ID is 1:

query {
  sr_field(where: {_and: [{name: {_eq: "field1"}}, {id: {_eq: 1}}]}) {
    name
    createTime
  }
}
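
Filters like these are often assembled programmatically. One possible approach, sketched here, is to describe the filter as a Python dictionary and render it as a GraphQL input literal. The EnumValue and to_graphql_literal names are hypothetical helpers, not part of any Confluent client. Strings are escaped with json.dumps, which is assumed to be close enough to GraphQL string escaping for typical values.

```python
import json


class EnumValue(str):
    """Marks a string that must be rendered unquoted, as a GraphQL enum value."""


def to_graphql_literal(value) -> str:
    """Render a Python value as a GraphQL input literal."""
    if isinstance(value, EnumValue):
        return str(value)  # enum values such as asc are not quoted
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, str):
        return json.dumps(value)  # quoted and escaped
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, list):
        return "[" + ", ".join(to_graphql_literal(v) for v in value) + "]"
    if isinstance(value, dict):
        inner = ", ".join(f"{k}: {to_graphql_literal(v)}" for k, v in value.items())
        return "{" + inner + "}"
    raise TypeError(f"unsupported value: {value!r}")


where = {"_and": [{"name": {"_eq": "field1"}}, {"id": {"_eq": 1}}]}
query = "query { sr_field(where: %s) { name createTime } }" % to_graphql_literal(where)
print(query)
```

Rendering the dictionary above reproduces the where argument of the preceding example.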

The following operators can be used in the where argument:

_eq
_gt
_lt
_gte
_lte

For string attributes the following operator can additionally be used:

_starts_with

For date attributes the following operators can additionally be used:

_between
_since

Example: Fetch a field created during a certain period:

query {
  sr_field(where: {createTime: {_between: {start: "2020-01-01T00:00:00", end: "2022-01-01T00:00:00"}}}) {
    name
    createTime
  }
}

Example: Fetch the fields created within the last seven days:

query {
  sr_field(where: {createTime: {_since: last_7_days}}) {
    name
    createTime
  }
}

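The _since operator takes a predefined duration value, such as last_7_days in the example above. Because the value is written unquoted, it is an enum value, so the full set of valid values can be found by introspecting the GraphQL schema.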

Sorting using the “order_by” argument

Results can be sorted by using the order_by argument.

Example: Sort the fields in ascending order of the name:

query {
  sr_field(order_by: {name: asc}) {
    name
    createTime
  }
}

The order_by argument can specify that the sort direction is asc (ascending) or desc (descending).

Pagination with the “limit” and “offset” arguments

Results can be paginated with the limit and offset arguments.

If limit is not set, it defaults to 100. The maximum limit is 10000.

Example: Fetch five (5) fields, starting with the sixth (6th) one:

query {
  sr_field(limit: 5, offset: 5) {
    name
    createTime
  }
}
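
To walk through a full result set, a client can keep increasing offset until a page comes back shorter than limit. The following sketch shows that loop. The run_query callable is a hypothetical function that sends a query string and returns the decoded JSON response; here it is replaced by an in-memory stand-in so the example runs without network access.

```python
import re
from typing import Callable, Iterator


def iter_fields(run_query: Callable[[str], dict], page_size: int = 100) -> Iterator[dict]:
    """Yield every sr_field entity, one page of page_size results at a time."""
    if not 1 <= page_size <= 10000:  # 10000 is the documented maximum limit
        raise ValueError("page_size must be between 1 and 10000")
    offset = 0
    while True:
        query = "query { sr_field(limit: %d, offset: %d) { name } }" % (page_size, offset)
        page = run_query(query)["data"]["sr_field"]
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            return
        offset += page_size


# In-memory stand-in for the API, for demonstration only.
_FAKE_FIELDS = [{"name": f"field{i}"} for i in range(7)]


def fake_run_query(query: str) -> dict:
    """Serve slices of _FAKE_FIELDS based on the limit and offset in the query."""
    match = re.search(r"limit: (\d+), offset: (\d+)", query)
    limit, offset = int(match.group(1)), int(match.group(2))
    return {"data": {"sr_field": _FAKE_FIELDS[offset:offset + limit]}}


names = [field["name"] for field in iter_fields(fake_run_query, page_size=3)]
print(names)
```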

Filtering by tag with the “tags” argument

Results can be filtered to entities that carry at least one of the specified tags.

Example: Fetch fields tagged with PII or SECRET:

query {
  sr_field(tags: ["PII", "SECRET"]) {
    name
    createTime
  }
}

Including deleted objects with the “deleted” argument

Normally only active (non-deleted) entities are returned. Deleted entities can additionally be returned by specifying the deleted argument as true.

Example: Fetch all fields, including deleted ones:

query {
  sr_field(deleted: true) {
    name
    createTime
    status
  }
}

GraphQL API usage limitations and best practices

Review the following guidelines and limitations to ensure you use the GraphQL API effectively.

Global sorting of search results

The Catalog GraphQL API doesn’t maintain global sorting for search results when you search for the cf_entity type. Instead, the API sorts Unified Stream Manager (USM) resources and Confluent Cloud resources as separate groups.

Example: Consider the following query, which requests cf_entity resources sorted by their creation time:

query {
  cf_entity(order_by: {createTime: asc}) {
    name
    createTime
  }
}

In the response, the API lists all USM resources first and then appends the Confluent Cloud resources. As a result, USM resources and Confluent Cloud resources are each sorted independently within the search results.
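
If a single global order is needed, one workaround is to re-sort the combined results on the client. The sketch below assumes that every entity in the response includes createTime and that all createTime values are mutually comparable (for example, all timestamps in the same format). The function name and sample values are hypothetical.

```python
def sort_globally(entities: list[dict], descending: bool = False) -> list[dict]:
    """Return all entities in one list ordered by createTime."""
    return sorted(entities, key=lambda entity: entity["createTime"], reverse=descending)


# Made-up results: two USM resources followed by two Confluent Cloud
# resources, each group already sorted on its own.
combined = [
    {"name": "usm-topic-a", "createTime": 20},
    {"name": "usm-topic-b", "createTime": 40},
    {"name": "cloud-topic-a", "createTime": 10},
    {"name": "cloud-topic-b", "createTime": 30},
]
print([entity["name"] for entity in sort_globally(combined)])
```

Client-side sorting only covers the results that were actually fetched, so a paginated search needs all of its pages before the order is truly global.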

API limits

Query limits

The GraphQL API provides two query limits:

Query complexity limit: The complexity limit is a limit on the total number of data fields in the query. The maximum query complexity is 200.
Query depth limit: The depth limit is a limit on the total depth of the query. The maximum query depth is 20.
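
A rough client-side check can flag queries that are likely to exceed the depth limit before they are sent. The sketch below counts the nesting of selection sets by tracking braces outside of argument lists and string literals. It is an approximation under stated assumptions: the server may count depth differently (for example, by excluding the outer query braces), and block strings and comments are not handled. The function name is hypothetical.

```python
def selection_depth(query: str) -> int:
    """Return the maximum nesting of selection-set braces in a query string."""
    depth = max_depth = parens = 0
    in_string = escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False  # this character was escaped; skip it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            parens += 1  # entering an argument list such as where: {...}
        elif ch == ")":
            parens -= 1
        elif parens == 0 and ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif parens == 0 and ch == "}":
            depth -= 1
    return max_depth


nested = """
query {
  sr_field {
    name
    record {
      name
      schema {
        name
        subject_versions {
          name
        }
      }
    }
  }
}
"""
print(selection_depth(nested))
```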

Time limits

The GraphQL API provides a maximum time limit of 30 seconds for any GraphQL query.

Rate limits

The GraphQL API provides a maximum rate limit of 25 requests per second.
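
Clients that issue many queries in a loop can stay under this limit by spacing requests out. The following sketch enforces a minimum interval of 1/25 of a second between calls. Even spacing is a conservative choice made here for simplicity; how the service measures the rate is not described on this page. The Throttle class is a hypothetical helper, and the clock and sleep parameters exist only so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Space out calls so that at most max_per_second start per second."""

    def __init__(self, max_per_second: float = 25.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
        return delay


# Demonstration with a frozen clock and a recording sleep function.
slept: list[float] = []
throttle = Throttle(25.0, clock=lambda: 100.0, sleep=slept.append)
first_delay = throttle.wait()   # first call never waits
second_delay = throttle.wait()  # second call waits one interval
```

In real use, calling throttle.wait() immediately before each request is enough.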

API reference

See the GraphQL API reference documentation for Stream Catalog here.