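The previous article covered the configuration specification for entities in the dimensional model; this article continues with the configuration specification for dimensions.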
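Dimensions represent the non-aggregatable columns in a data set: the attributes, features, or characteristics that describe or categorize the data. In the context of the dbt Semantic Layer, dimensions are part of a larger structure called a semantic model. They are created alongside other elements such as entities and measures, and add detail to the data. In SQL, dimensions typically appear in the group by clause of a query.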
Dimension definition and specification
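Every dimension requires a name and a type, and can optionally include an expr parameter. Dimension names must be unique within a semantic model. The dimension specification is as follows: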
dimensions:
  - name: Name of the group that will be visible to the user in downstream tools # Required
    type: Categorical or Time # Required
    label: Recommended adding a string that defines the display value in downstream tools. # Optional
    type_params: Specific type params such as if the time is primary or used as a partition # Required
    description: Same as always # Optional
    expr: The column name or expression. If not provided the default is the dimension name # Optional
The official documentation describes each parameter as follows:
Parameter | Description | Required | Type |
---|---|---|---|
name | Refers to the name of the group that will be visible to the user in downstream tools. It can also serve as an alias if the column name or SQL query reference is different and provided in the expr parameter. Dimension names should be unique within a semantic model, but they can be non-unique across different models as MetricFlow uses joins to identify the right dimension. | Required | String |
type | Specifies the type of group created in the semantic model. There are two types: Categorical (describes attributes or features like geography or sales region) and Time (time-based dimensions like timestamps or dates). | Required | String |
type_params | Specific type params such as if the time is primary or used as a partition. | Required | Dict |
description | A clear description of the dimension. | Optional | String |
expr | Defines the underlying column or SQL query for a dimension. If no expr is specified, MetricFlow will use the column with the same name as the group. You can use the column name itself to input a SQL expression. | Optional | String |
label | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as orders_total or "orders_total"). | Optional | String |
meta | Set metadata for a resource and organize resources. Accepts plain text, spaces, and quotes. | Optional | Dictionary |
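To make the specification concrete, here is a minimal sketch of one categorical and one time dimension. The dimension names, the columns quantity and ts, and the SQL expressions are hypothetical and chosen only for illustration. The time_granularity key under type_params follows the usage I recall from the dbt documentation, so check it against the dbt version you run.

```yaml
dimensions:
  # Categorical dimension computed from a SQL expression.
  # "quantity" is a hypothetical column in the underlying table.
  - name: is_bulk_transaction
    type: categorical
    label: Bulk transaction
    description: Whether the transaction contains more than 10 items.
    expr: case when quantity > 10 then true else false end

  # Time dimension; "ts" is a hypothetical timestamp column.
  - name: transaction_date
    type: time
    expr: date_trunc('day', ts)
    type_params:
      time_granularity: day
```

If expr is omitted, MetricFlow falls back to a column with the same name as the dimension, as the parameter descriptions state.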
Dimension naming conventions
The following example shows how dimensions are used in a semantic model.
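Dimensions are bound to the primary entity of the semantic model in which they are defined. For example, the dimension type is defined in a model whose primary entity is transaction. The dimension type is therefore scoped to the transaction entity, and to reference it you use the fully qualified dimension name transaction__type (the entity name and the dimension name joined by two underscores).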
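The scoping rule can be sketched as a small semantic model. The entity name transaction and the dimension name type come from the example above; the model name, the ref target, and the transaction_id column are hypothetical.

```yaml
semantic_models:
  - name: transactions                 # hypothetical model name
    model: ref('fact_transactions')    # hypothetical dbt model
    entities:
      - name: transaction              # primary entity that scopes the dimensions below
        type: primary
        expr: transaction_id           # hypothetical key column
    dimensions:
      - name: type                     # fully qualified name: transaction__type
        type: categorical
```

Because the qualified name is built from the entity name, another semantic model with a different primary entity can also define a dimension called type without a naming conflict.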
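MetricFlow requires every semantic model to have a primary entity, which guarantees unique dimension names. If a data source has no primary entity, define one with primary_entity. It does not have to map to a column in the table, and the name you assign does not affect query generation. We recommend keeping these "virtual primary entities" unique across your semantic model. An example of defining a primary entity for a data source that has no primary entity column: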
semantic_model<