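Source Code Walkthrough: Changes to the Flink Connector for Writing to Kafka

Flink 1.14 and later deprecate FlinkKafkaProducer and recommend KafkaSink instead. The new API achieves exactly-once semantics through the Sink interface. The key pieces are the Committer and GlobalCommitter, which take part in the two-phase commit protocol and ensure data consistency.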


Author: 阳龙生

Flink 1.13 and earlier

In Flink 1.13 and earlier, you could write to Kafka with this class:

org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer

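This class extends TwoPhaseCommitSinkFunction, an abstract class that extends Flink's RichSinkFunction and implements the checkpoint interfaces CheckpointedFunction and CheckpointListener. This was the usual approach: a two-phase commit, with the familiar methods for beginning, pre-committing, committing, and aborting a transaction.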

public abstract class TwoPhaseCommitSinkFunction<IN, TXN, CONTEXT> extends RichSinkFunction<IN>
        implements CheckpointedFunction, CheckpointListener {
    protected abstract TXN beginTransaction() throws Exception;
    protected abstract void preCommit(TXN transaction) throws Exception;
    protected abstract void commit(TXN transaction);
    protected abstract void abort(TXN transaction);
}
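
To make the lifecycle concrete, here is a minimal, self-contained sketch of how the four methods line up with checkpoints. It is a toy model, not Flink code: TwoPhaseCommitSketch, Txn, committedRecords, and failAndRestart are hypothetical names, and the external system is just an in-memory list. The names snapshotState and notifyCheckpointComplete mirror the callbacks of CheckpointedFunction and CheckpointListener, but their signatures are simplified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of the begin / preCommit / commit / abort lifecycle; not Flink code. */
final class TwoPhaseCommitSketch {

    /** Hypothetical transaction handle: buffers records until it is committed. */
    static final class Txn {
        final List<String> buffer = new ArrayList<>();
        boolean preCommitted;
    }

    /** Stands in for the external system; only committed records show up here. */
    final List<String> committedRecords = new ArrayList<>();

    /** Pre-committed transactions, keyed by the checkpoint they belong to. */
    private final TreeMap<Long, Txn> pending = new TreeMap<>();

    private Txn current = beginTransaction();

    Txn beginTransaction() { return new Txn(); }

    void preCommit(Txn txn) { txn.preCommitted = true; }          // phase 1

    void commit(Txn txn) { committedRecords.addAll(txn.buffer); } // phase 2

    void abort(Txn txn) { txn.buffer.clear(); }

    /** Every incoming record is written into the currently open transaction. */
    void invoke(String record) { current.buffer.add(record); }

    /** Checkpoint barrier: pre-commit the open transaction and start a new one. */
    void snapshotState(long checkpointId) {
        preCommit(current);
        pending.put(checkpointId, current);
        current = beginTransaction();
    }

    /** Checkpoint completed: commit every transaction up to and including it. */
    void notifyCheckpointComplete(long checkpointId) {
        NavigableMap<Long, Txn> done = pending.headMap(checkpointId, true);
        for (Txn txn : done.values()) {
            commit(txn);
        }
        done.clear(); // clearing the view also removes the entries from pending
    }

    /**
     * Simulated failure: the open transaction is aborted. Pre-committed
     * transactions stay in pending; a real sink would restore them from state.
     */
    void failAndRestart() {
        abort(current);
        current = beginTransaction();
    }
}
```

In this sketch, records become visible only after notifyCheckpointComplete runs for the checkpoint that pre-committed them; anything written after the last barrier is discarded on failure.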

Flink 1.14

In Flink 1.14, this class is deprecated:

/**
 * @deprecated Please use {@link org.apache.flink.connector.kafka.sink.KafkaSink}.
 */
@Deprecated
@PublicEvolving
public class FlinkKafkaProducer<IN>
        extends TwoPhaseCommitSinkFunction<
                IN,
                FlinkKafkaProducer.KafkaTransactionState,
                FlinkKafkaProducer.KafkaTransactionContext> {

The Javadoc points us to this class instead: org.apache.flink.connector.kafka.sink.KafkaSink.

public class KafkaSink<IN> implements Sink<IN, KafkaCommittable, KafkaWriterState, Void> {

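However, this class does not extend TwoPhaseCommitSinkFunction. It creates a KafkaWriter, but KafkaWriter does not extend TwoPhaseCommitSinkFunction either. So how does Flink 1.14 implement distributed transactions? Let's read the source to find out. Understanding the official KafkaSink also helps when writing custom sinks for other databases that need similar consistency guarantees.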

API changes:

Old API (figure omitted):

New API (figure omitted):

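As we can see, the new KafkaSink implements interface Sink<InputT, CommT, WriterStateT, GlobalCommT>, and this interface is what makes exactly-once semantics possible.

Note the important state type KafkaWriterState. The Sink interface also has two important methods: createCommitter and createGlobalCommitter.

The Javadoc of these methods again mentions the two-phase commit protocol (2-phase-commit):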


    /**
     * Creates a {@link Committer} which is part of a 2-phase-commit protocol. The {@link
     * SinkWriter} creates committables through {@link SinkWriter#prepareCommit(boolean)} in the
     * first phase. The committables are then passed to this committer and persisted with {@link
     * Committer#commit(List)}. If a committer is returned, the sink must also return a {@link
     * #getCommittableSerializer()}.
     *
     * @return A committer for the 2-phase-commit protocol.
     * @throws IOException for any failure during creation.
     */
    Optional<Committer<CommT>> createCommitter() throws IOException;

    /**
     * Creates a {@link GlobalCommitter} which is part of a 2-phase-commit protocol. The {@link
     * SinkWriter} creates committables through {@link SinkWriter#prepareCommit(boolean)} in the
     * first phase. The committables are then passed to the Committer and persisted with {@link
     * Committer#commit(List)}. The committables are also passed to this {@link GlobalCommitter} of
     * which only a single instance exists. If a global committer is returned, the sink must also
     * return a {@link #getCommittableSerializer()} and {@link #getGlobalCommittableSerializer()}.
     *
     * @return A global committer for the 2-phase-commit protocol.
     * @throws IOException for any failure during creation.
     */
    Optional<GlobalCommitter<CommT, GlobalCommT>> createGlobalCommitter() throws IOException;
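
The flow described in these comments can be sketched without any Flink dependency. What follows is a simplified model under stated assumptions: MiniSink, MiniWriter, MiniCommitter, and ListSink are hypothetical stand-ins for Sink, SinkWriter, and Committer, with the writer state, serializers, and global committer left out. The writer produces committables in the first phase, and the committer makes them durable in the second.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Simplified model of the writer -> committable -> committer flow; not Flink code. */
final class SinkFlowSketch {

    /** Hypothetical stand-in for SinkWriter: phase 1 produces committables. */
    interface MiniWriter<IN, COMM> {
        void write(IN element);

        List<COMM> prepareCommit(boolean flush);
    }

    /** Hypothetical stand-in for Committer: phase 2; returns committables to retry. */
    interface MiniCommitter<COMM> {
        List<COMM> commit(List<COMM> committables);
    }

    /** Hypothetical stand-in for Sink, reduced to the two factory methods. */
    interface MiniSink<IN, COMM> {
        MiniWriter<IN, COMM> createWriter();

        Optional<MiniCommitter<COMM>> createCommitter();
    }

    /** Example sink whose committable is simply the batch of staged records. */
    static final class ListSink implements MiniSink<String, List<String>> {
        /** Records become visible here only after the committer has run. */
        final List<String> visible = new ArrayList<>();

        @Override
        public MiniWriter<String, List<String>> createWriter() {
            return new MiniWriter<String, List<String>>() {
                private List<String> staged = new ArrayList<>();

                @Override
                public void write(String element) {
                    staged.add(element);
                }

                @Override
                public List<List<String>> prepareCommit(boolean flush) {
                    // Hand over the staged batch and start a fresh one.
                    List<String> batch = staged;
                    staged = new ArrayList<>();
                    return Collections.singletonList(batch);
                }
            };
        }

        @Override
        public Optional<MiniCommitter<List<String>>> createCommitter() {
            MiniCommitter<List<String>> committer = committables -> {
                for (List<String> batch : committables) {
                    visible.addAll(batch);
                }
                return Collections.emptyList(); // nothing needs a retry
            };
            return Optional.of(committer);
        }
    }
}
```

The point of the split is that the committable is plain data: it can be checkpointed between the two phases and handed to a committer later, even after a restart.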

The implementation in KafkaSink:

    @Override
    public Optional<Committer<KafkaCommittable>> createCommitter() throws IOException {
        return Optional.of(new KafkaCommitter(kafkaProducerConfig));
    }

This brings us to the KafkaCommitter class:


/**
 * Committer implementation for {@link KafkaSink}
 *
 * <p>The committer is responsible to finalize the Kafka transactions by committing them.
 */
class KafkaCommitter implements Committer<KafkaCommittable>, Closeable {



/**
 * This class holds the necessary information to construct a new {@link FlinkKafkaInternalProducer}
 * to commit transactions in {@link KafkaCommitter}.
 */
class KafkaCommittable {

As we can see, transactions are handled mainly by these two classes, including recovery from state and committing the transaction.
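
As a rough illustration of why a small committable is enough to finish a transaction after recovery, here is a toy sketch. It assumes a committable carries a transactional id, a producer id, and an epoch; Committable, ToyBroker, and ToyCommitter are hypothetical names, and the in-memory broker only imitates the idea of resuming and committing a transaction by id. It is not the actual KafkaCommitter logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of committing a transaction from a committable; not Flink or Kafka code. */
final class KafkaCommitSketch {

    /** Hypothetical committable: just enough data to identify a transaction. */
    static final class Committable {
        final String transactionalId;
        final long producerId; // carried along; the toy broker does not check it
        final short epoch;     // carried along; the toy broker does not check it

        Committable(String transactionalId, long producerId, short epoch) {
            this.transactionalId = transactionalId;
            this.producerId = producerId;
            this.epoch = epoch;
        }
    }

    /** In-memory stand-in for a broker that tracks open transactions by id. */
    static final class ToyBroker {
        private final Map<String, List<String>> open = new HashMap<>();
        final List<String> committedLog = new ArrayList<>();

        void send(String transactionalId, String record) {
            open.computeIfAbsent(transactionalId, id -> new ArrayList<>()).add(record);
        }

        /** Returns true if the transaction was open and has now been committed. */
        boolean commit(String transactionalId) {
            List<String> records = open.remove(transactionalId);
            if (records == null) {
                return false; // unknown or already committed: treated as a no-op
            }
            committedLog.addAll(records);
            return true;
        }
    }

    /** Commits whatever committables it is handed, including ones replayed after a restart. */
    static final class ToyCommitter {
        private final ToyBroker broker;

        ToyCommitter(ToyBroker broker) {
            this.broker = broker;
        }

        /** Returns how many transactions were actually committed by this call. */
        int commit(List<Committable> committables) {
            int committed = 0;
            for (Committable c : committables) {
                if (broker.commit(c.transactionalId)) {
                    committed++;
                }
            }
            return committed;
        }
    }
}
```

Because the commit is a no-op for a transaction that is no longer open, replaying the same committable after restoring from a checkpoint does not duplicate records in this model.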
