Flink-Paimon Example

猫猫爱吃小鱼粮

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/m0_50186249/article/details/135367626

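This article walks through deploying Paimon with Flink: downloading and configuring the packages, starting a local cluster, creating a Paimon catalog and table, writing data, and running OLAP and streaming queries.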

1. Download the Flink release package and extract it

tar -xzf flink-*.tgz

2. Download the Paimon jar and copy it into Flink's lib directory

cp paimon-flink-*.jar <FLINK_HOME>/lib/

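3. Add the Hadoop dependency to lib. Paimon needs the Hadoop classes on the classpath; if the machine already has a Hadoop installation with HADOOP_CLASSPATH set, the pre-bundled jar below is not required.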

cp flink-shaded-hadoop-2-uber-*.jar <FLINK_HOME>/lib/
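
Before starting the cluster, it can help to confirm that the jars from steps 2 and 3 really are in lib. The following is a minimal sketch, not part of Flink or Paimon: check_lib_jars is a hypothetical helper name, and it only looks for the two file name patterns used in the cp commands above.

```shell
# Hypothetical helper (not shipped with Flink or Paimon): report whether the
# jars copied in steps 2 and 3 are present in <FLINK_HOME>/lib.
# Usage: check_lib_jars <FLINK_HOME>
check_lib_jars() {
    lib_dir="$1/lib"
    status=0
    for pattern in 'paimon-flink-*.jar' 'flink-shaded-hadoop-2-uber-*.jar'; do
        # Expand the glob into the positional parameters; when nothing
        # matches, the pattern stays literal and the -e test below fails.
        set -- "$lib_dir"/$pattern
        if [ -e "$1" ]; then
            echo "found: $(basename "$1")"
        else
            echo "missing: $pattern"
            status=1
        fi
    done
    return $status
}

# Example call, with FLINK_HOME set to the extracted Flink directory:
# check_lib_jars "$FLINK_HOME"
```

The function returns a non-zero status when either pattern has no match, so it can be used as a guard in a setup script.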

4. Edit flink-conf.yaml and start a local Flink cluster

# In <FLINK_HOME>/conf/flink-conf.yaml: allow several jobs to run at the same time
taskmanager.numberOfTaskSlots: 2

# Start the local cluster
<FLINK_HOME>/bin/start-cluster.sh
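
The configuration change can also be scripted instead of edited by hand. The sketch below assumes the flat "key: value" layout of flink-conf.yaml; set_task_slots is a hypothetical helper name, and newer Flink releases that use a different configuration file would need a different approach.

```shell
# Hypothetical helper: set (or append) taskmanager.numberOfTaskSlots in a
# flat "key: value" Flink configuration file.
# Usage: set_task_slots <path-to-flink-conf.yaml> <slots>
set_task_slots() {
    conf_file="$1"
    slots="$2"
    if grep -q '^taskmanager\.numberOfTaskSlots:' "$conf_file"; then
        # Key already present: rewrite its value, keeping a .bak backup copy
        sed -i.bak "s/^taskmanager\.numberOfTaskSlots:.*/taskmanager.numberOfTaskSlots: ${slots}/" "$conf_file"
    else
        # Key absent: append it at the end of the file
        printf 'taskmanager.numberOfTaskSlots: %s\n' "$slots" >> "$conf_file"
    fi
}

# Example call, with FLINK_HOME set to the extracted Flink directory:
# set_task_slots "$FLINK_HOME/conf/flink-conf.yaml" 2
```

Running it twice leaves a single entry in the file, so it is safe to call from a repeatable setup script.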

5. Start the Flink SQL client

<FLINK_HOME>/bin/sql-client.sh

6. Create a Paimon catalog and table

CREATE CATALOG my_catalog WITH (
    'type'='paimon',
    'warehouse'='file:/tmp/paimon'
);

USE CATALOG my_catalog;

-- create a word count table
CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
);

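Alternatively, use FlinkGenericCatalog, which is backed by a Hive metastore. In this mode a Paimon table must be created with the 'connector' option: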

CREATE CATALOG my_catalog WITH (
    'type'='paimon-generic',
    'hive-conf-dir'='...',
    'hadoop-conf-dir'='...'
);

USE CATALOG my_catalog;

-- create a word count table
CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
) WITH (
    'connector'='paimon'
);

7. Write data

-- create a word data generator table
CREATE TEMPORARY TABLE word_table (
    word STRING
) WITH (
    'connector' = 'datagen',
    'fields.word.length' = '1'
);

-- paimon requires checkpoint interval in streaming mode
SET 'execution.checkpointing.interval' = '10 s';

-- write streaming data to dynamic table
INSERT INTO word_count SELECT word, COUNT(*) FROM word_table GROUP BY word;
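
The checkpoint interval matters because, in streaming mode, Paimon commits written data when a checkpoint completes, and commits show up as snapshot files in the table directory. As a rough way to watch this from the shell, the sketch below prints the highest snapshot id found on disk. It assumes the filesystem catalog layout <warehouse>/<database>.db/<table>/snapshot/snapshot-<id> and the default database name "default"; latest_snapshot_id is a hypothetical helper name.

```shell
# Hypothetical helper: print the highest snapshot id found in a Paimon table
# directory, or 0 if no snapshot file exists yet.
# Assumes the layout <table_dir>/snapshot/snapshot-<id>.
# Usage: latest_snapshot_id <table_dir>
latest_snapshot_id() {
    snapshot_dir="$1/snapshot"
    latest=0
    for f in "$snapshot_dir"/snapshot-*; do
        # Skip the literal pattern left behind when nothing matches
        [ -e "$f" ] || continue
        id="${f##*/snapshot-}"
        # Ignore anything whose suffix is not a plain number
        case "$id" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$id" -gt "$latest" ]; then
            latest="$id"
        fi
    done
    echo "$latest"
}

# Example call for the table created in step 6 (path is an assumption):
# latest_snapshot_id /tmp/paimon/default.db/word_count
```

Calling it a few times while the INSERT job is running should show the id growing roughly once per checkpoint interval, provided new data arrived in between.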

8. OLAP query

-- use tableau result mode
SET 'sql-client.execution.result-mode' = 'tableau';

-- switch to batch mode
RESET 'execution.checkpointing.interval';
SET 'execution.runtime-mode' = 'batch';

-- olap query the table
SELECT * FROM word_count;

9. Streaming query

-- switch to streaming mode
SET 'execution.runtime-mode' = 'streaming';

-- track the changes of table and calculate the count interval statistics
SELECT `interval`, COUNT(*) AS interval_cnt FROM
    (SELECT cnt / 10000 AS `interval` FROM word_count) GROUP BY `interval`;
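
To make the grouping concrete: the query maps each word's count to a bucket number (cnt / 10000) and then counts how many words fall into each bucket. The sketch below reproduces that arithmetic offline with awk on a few made-up counts; it assumes the division on BIGINT truncates to an integer, and interval_histogram is a hypothetical helper name, not a Flink or Paimon command.

```shell
# Hypothetical helper: read one count per line on stdin and print
# "<interval> <number of counts in that interval>", sorted by interval.
# Mirrors cnt / 10000 followed by GROUP BY, assuming integer division.
interval_histogram() {
    awk '{ bucket = int($1 / 10000); n[bucket]++ }
         END { for (b in n) print b, n[b] }' | sort -n
}

# Made-up sample counts: two fall below 10000, one in 10000-19999, one in 20000-29999
printf '%s\n' 120 9999 10000 25000 | interval_histogram
# prints:
# 0 2
# 1 1
# 2 1
```

In the streaming query the same result keeps changing as counts grow and words move from one interval to the next.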

10. Exit

-- exit sql-client
EXIT;

# Stop the local cluster
<FLINK_HOME>/bin/stop-cluster.sh