Stream Catalog GraphQL API Usage and Examples on Confluent Cloud
Stream Catalog uses GraphQL internally and also exposes the Stream Catalog GraphQL API for use in your deployments.
Overview
The following sections provide an overview of GraphQL and explain how it is used with the Stream Catalog.
What is it?
GraphQL is a query language for APIs. At its core, it enables declarative data fetching, so clients can specify exactly the data they need from an API. It is an API standard that provides an efficient, powerful, and flexible alternative to REST. GraphQL is an alternative rather than a replacement, and it normally coexists with REST.
Why is it important?
The Confluent Stream Catalog provides a centralized metadata repository for customers in cloud environments. The GraphQL API allows users to take advantage of the graph nature of the Stream Catalog, which is modeled as a graph of entities and relationships, and provides them with a more natural, efficient, and productive way of exploring the catalog.
When to use REST API and when to use GraphQL API
The GraphQL API supports only search, so the choice is between the REST
/search API and GraphQL search. The only capability that the REST /search API has
over GraphQL search is searching for business metadata attributes, which the
GraphQL API does not currently support.
Otherwise, GraphQL search is preferred because it can search across relationships.
This blog post explains the power of the catalog GraphQL API in more detail: How to Find, Share, and Organize Your Data Streams with Stream Catalog
Getting started
The Confluent Stream Catalog provides a centralized repository of schemas and other metadata entities within an environment, as well as the relationships between them. When querying for one or more related metadata entities, GraphQL can be used to return all requested metadata entities within a single response.
The Stream Catalog GraphQL API is a read-only API. It supports only queries; mutations and subscriptions are not supported.
GraphQL endpoint
The Stream Catalog GraphQL endpoint is https://<SR ENDPOINT>/catalog/graphql.
This endpoint supports both POST and GET requests per recommended practices for GraphQL implementations.
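As a minimal sketch of the two request styles, the following Python snippet builds a POST body and a GET URL for the same query using only the standard library. It assumes the common GraphQL-over-HTTP convention (a JSON body with a `query` key for POST, a URL-encoded `query` parameter for GET). The base URL is a placeholder, and authentication is omitted because the credentials and headers depend on your environment.

```python
import json
from urllib.parse import urlencode

# Placeholder: substitute your own Schema Registry endpoint.
SR_ENDPOINT = "https://sr.example.invalid"
GRAPHQL_URL = f"{SR_ENDPOINT}/catalog/graphql"

QUERY = "query { sr_field { name } }"


def build_post_body(query: str) -> bytes:
    """Encode a query as the JSON body of a POST request."""
    return json.dumps({"query": query}).encode("utf-8")


def build_get_url(query: str) -> str:
    """Encode a query as the `query` URL parameter of a GET request."""
    return f"{GRAPHQL_URL}?{urlencode({'query': query})}"


post_body = build_post_body(QUERY)
get_url = build_get_url(QUERY)
print(post_body.decode("utf-8"))
print(get_url)
```

Either value can then be passed to the HTTP client of your choice. POST bodies are typically sent with a `Content-Type: application/json` header.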
GraphQL schema
The GraphQL schema can be introspected using any number of GraphQL tools. The schema can also be seen here.
Entity queries
Fetch list of entities
You can fetch a single entity or multiple entities of the same type using a simple query.
Example: Fetch a list of fields:
query {
  sr_field {
    name
  }
}
Example: Fetch a list of Unified Stream Manager topic entities:
query {
  usm_kafka_topic {
    name
  }
}
Fetch nested entities using relationships
You can fetch a single entity and its related entities by specifying the desired relationships.
Example: Fetch a list of fields and the name of each field’s record, schema, and subject_version:
query {
  sr_field {
    name
    record {
      name
      schema {
        name
        subject_versions {
          name
        }
      }
    }
  }
}
Example: Fetch a list of subject_version entities and the name of each field in the corresponding schemas.
Tip
This example uses an inline fragment with a type condition of sr_record, since a schema can contain several types besides records.
query {
  sr_subject_version(where: {name: {_starts_with: "my_subject"}}) {
    name
    schema {
      id
      types {
        ... on sr_record {
          name
          fields {
            name
          }
        }
      }
    }
  }
}
Filtering using the “where” argument
You can use the where argument to filter results based on some of an entity’s attributes.
You can combine filters using the _and/_or operators.
Example: Fetch the field whose name is “field1”:
query {
  sr_field(where: {name: {_eq: "field1"}}) {
    name
    createTime
  }
}
Example: Fetch the field whose name is “field1” and schema ID is 1:
query {
  sr_field(where: {_and: [{name: {_eq: "field1"}}, {id: {_eq: 1}}]}) {
    name
    createTime
  }
}
The following operators can be used in the where argument:
- _eq
- _gt
- _lt
- _gte
- _lte
For string attributes the following operator can additionally be used:
- _starts_with
For date attributes the following operators can additionally be used:
- _between
- _since
Example: Fetch a field created during a certain period:
query {
  sr_field(where: {createTime: {_between: {start: "2020-01-01T00:00:00" end: "2022-01-01T00:00:00"}}}) {
    name
    createTime
  }
}
Example: Fetch a field created since a certain duration:
query {
  sr_field(where: {createTime: {_since: last_7_days}}) {
    name
    createTime
  }
}
Valid values for the since parameter:
- last_7_days
- last_30_days
- last_month
- this_month
- today
- yesterday
- this_year
- last_year
- this_quarter
- last_quarter
- last_3_months
- last_6_months
- last_12_months
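When filters are assembled programmatically, one option is to render a Python dictionary into GraphQL argument syntax. The helper below is a hypothetical convenience, not part of any Confluent client. It assumes that string values are quoted and that values wrapped in the hypothetical `Enum` class (such as `last_7_days`) are emitted bare, matching the examples in this section.

```python
import json


class Enum(str):
    """Marks a value that must be rendered without quotes (hypothetical helper)."""


def render(value) -> str:
    """Render a Python value as a GraphQL input literal."""
    if isinstance(value, Enum):
        return str(value)                 # bare keyword, e.g. last_7_days
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return json.dumps(value)          # adds quotes and escapes
    if isinstance(value, dict):
        inner = ", ".join(f"{k}: {render(v)}" for k, v in value.items())
        return "{" + inner + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(render(v) for v in value) + "]"
    raise TypeError(f"unsupported value: {value!r}")


# Combine a string filter and a date filter with _and.
where = {"_and": [
    {"name": {"_starts_with": "field"}},
    {"createTime": {"_since": Enum("last_7_days")}},
]}
query = f"query {{ sr_field(where: {render(where)}) {{ name createTime }} }}"
print(query)
```

Building the filter as data keeps quoting and escaping in one place, which helps when filter values come from user input.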
Sorting using the “order_by” argument
Results can be sorted by using the order_by argument.
Example: Sort the fields in ascending order by name:
query {
  sr_field(order_by: {name: asc}) {
    name
    createTime
  }
}
The order_by argument can specify that the sort direction is asc (ascending) or desc (descending).
Pagination with the “limit” and “offset” arguments
Results can be paginated with the limit and offset arguments.
If limit is not set, it defaults to 100. The maximum value for limit is 10000.
Example: Fetch five (5) fields, starting with the sixth (6th) one:
query {
  sr_field(limit: 5, offset: 5) {
    name
    createTime
  }
}
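To walk through a full result set, limit and offset can be advanced in a loop. The following Python sketch shows one way to do this. The `run_query` callable is hypothetical: it stands for whatever function you use to send the query text to the endpoint and return the list of `sr_field` entities from the response.

```python
from typing import Callable, Dict, Iterator, List

PAGE_QUERY = "query {{ sr_field(limit: {limit}, offset: {offset}) {{ name createTime }} }}"


def fetch_all(run_query: Callable[[str], List[Dict]], page_size: int = 100) -> Iterator[Dict]:
    """Yield every entity by requesting consecutive pages.

    `run_query` is a caller-supplied function (hypothetical) that sends the
    query text to the endpoint and returns the list of sr_field entities.
    """
    offset = 0
    while True:
        page = run_query(PAGE_QUERY.format(limit=page_size, offset=offset))
        yield from page
        if len(page) < page_size:  # short page: nothing left to fetch
            return
        offset += page_size
```

The loop stops at the first page that returns fewer entities than requested, so it issues one extra request only when the total is an exact multiple of the page size.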
Including deleted objects with the “deleted” argument
Normally only active (non-deleted) entities are returned. Deleted entities can
additionally be returned by specifying the deleted argument as true.
Example: Fetch all fields, including deleted ones:
query {
  sr_field(deleted: true) {
    name
    createTime
    status
  }
}
GraphQL API usage limitations and best practices
Review the following guidelines and limitations to ensure you use the GraphQL API effectively.
Global sorting of search results
The Catalog GraphQL API doesn’t maintain global sorting for search results when you search for the cf_entity type.
Instead, the API sorts USM and Confluent Cloud resources as separate groups.
Example: Consider the following query, which requests cf_entity resources sorted by their creation time:
query {
  cf_entity(order_by: {createTime: asc}) {
    name
    createTime
  }
}
In the response, the API lists all USM resources first, followed by the Confluent Cloud resources.
As a result, the USM resources and the Confluent Cloud resources are each sorted independently within the search results.
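If a single global order is needed, one possible workaround is to re-sort the combined results on the client. The Python sketch below illustrates the idea. The sample entities and their `createTime` values are made up, and the sketch assumes that all values of the sort attribute are mutually comparable (for example, all numbers or all timestamps in the same format).

```python
from typing import Dict, List


def sort_globally(entities: List[Dict], key: str = "createTime",
                  descending: bool = False) -> List[Dict]:
    """Return a new list sorted across both groups by the given attribute."""
    return sorted(entities, key=lambda e: e[key], reverse=descending)


# Made-up sample: two USM resources followed by two Confluent Cloud resources,
# each group already sorted on its own.
sample = [
    {"name": "usm-a", "createTime": 1},
    {"name": "usm-b", "createTime": 3},
    {"name": "cc-a", "createTime": 2},
    {"name": "cc-b", "createTime": 4},
]
print([e["name"] for e in sort_globally(sample)])
```

This only yields a true global order when the complete result set has been fetched, so it is best combined with pagination over all pages.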
API limits
Query limits
The GraphQL API enforces two query limits:
- Query complexity limit
The complexity limit is a limit on the total number of data fields in the query. The maximum query complexity is 200.
- Query depth limit
The depth limit is a limit on the total depth of the query. The maximum query depth is 20.
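A rough client-side check can flag queries that are likely to exceed these limits before they are sent. The Python sketch below is an approximation only: how the server counts fields and depth is not specified here, so the sketch assumes that every field name counts once and that the outermost braces count as depth 1. It ignores aliases, named fragments, and variables.

```python
import re

MAX_COMPLEXITY = 200  # limits quoted in the text above
MAX_DEPTH = 20


def estimate(query: str) -> tuple:
    """Return a rough (field_count, depth) estimate for a query string."""
    text = re.sub(r'"(?:\\.|[^"\\])*"', '""', query)  # blank out string literals
    text = re.sub(r"\([^)]*\)", "", text)              # drop argument lists
    text = re.sub(r"\.\.\.\s*on\s+\w+", "", text)      # drop inline-fragment type conditions
    start = text.find("{")
    body = text[start:] if start >= 0 else ""          # skip the leading `query` keyword
    depth = max_depth = 0
    for ch in body:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return len(re.findall(r"\w+", body)), max_depth


def within_limits(query: str) -> bool:
    """True if the estimate stays within both documented limits."""
    fields, depth = estimate(query)
    return fields <= MAX_COMPLEXITY and depth <= MAX_DEPTH
```

Because the counting rules are assumed, treat a passing check as a hint rather than a guarantee that the server will accept the query.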
Time limits
The GraphQL API enforces a time limit of 30 seconds on any GraphQL query.
Rate limits
The GraphQL API enforces a rate limit of 25 requests per second.
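Clients that issue many queries can pace themselves to stay under this rate. The following Python sketch spaces calls at least 1/25 of a second apart. The `Throttle` class is a hypothetical helper, it is not thread-safe, and it assumes the limit applies to a single client; the scope of the server-side limit is not stated here.

```python
import time


class Throttle:
    """Space calls at least 1/rate seconds apart (simple client-side pacing)."""

    def __init__(self, rate_per_second: float = 25.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next request may be sent."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call `wait()` immediately before each request. The clock and sleep functions are injectable so the pacing logic can be exercised without real delays.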
API reference
See the GraphQL API reference documentation for Stream Catalog here.