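After a series of Transformation operations, a Sink operation must be called at the end to form a complete DataFlow topology. Only once a Sink has been called are the final results produced. This data can be written to files, sent to a specified network port, a message broker, or an external file system, or printed to the console.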
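Common sinks in Flink batch processing
- print: print to the console
- writeAsText: write output as text
- writeAsCsv: write output as CSV
- writeUsingOutputFormat: write output in a specified format
- writeToSocket: write output to a network port
- Custom connectors (addSink)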
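1. print
Print is the simplest Sink and is typically used for experiments and testing. To print the contents of a DataStream, call the print method on it directly. The method also has an overload that accepts a String as the Sink's identifier; when there are several print Sinks, the identifier shows which Sink each output line came from.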
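To make the role of the identifier concrete, here is a minimal sketch in plain Java (no Flink dependency) of how a printed line could be assembled. It assumes the built-in print sink formats output as `identifier:subtask> value`, shows the subtask number 1-based, and omits the subtask number when the parallelism is 1; verify this against your Flink version. The class `PrintPrefixSketch` and its `format` method are hypothetical names used only for illustration.

```java
// Hypothetical helper, not part of Flink: it only mimics the prefix format
// that the built-in print sink is assumed to use.
class PrintPrefixSketch {

    // Builds one output line from the sink identifier, the 0-based subtask
    // index, the sink parallelism, and the record value.
    static String format(String sinkIdentifier, int subtaskIndex, int parallelism, String value) {
        StringBuilder line = new StringBuilder(sinkIdentifier == null ? "" : sinkIdentifier);
        if (parallelism > 1) {
            if (line.length() > 0) {
                line.append(':');
            }
            line.append(subtaskIndex + 1); // assumption: subtask numbers are shown 1-based
        }
        if (line.length() > 0) {
            line.append("> ");
        }
        return line.append(value).toString();
    }

    public static void main(String[] args) {
        System.out.println(format("", 5, 8, "hello world"));      // 6> hello world
        System.out.println(format("sinkA", 0, 8, "hello world")); // sinkA:1> hello world
        System.out.println(format("sinkA", 0, 1, "hello world")); // sinkA> hello world
        System.out.println(format("", 0, 1, "hello world"));      // hello world
    }
}
```

With two print Sinks named, say, `sinkA` and `sinkB`, every line carries its own prefix, so interleaved console output can still be traced back to the Sink that produced it.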
The following examples demonstrate the built-in print and a custom print implementation.
package com.bigdata.day03;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class SinkPrintDemo {

    public static void main(String[] args) throws Exception {
        // 1. env: prepare the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);

        // 2. source: read lines of text from a socket
        DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8888);

        // Built-in print; sample output:
        // 6> hello world
        //dataStreamSource.print();

        // 3. sink: a hand-written Sink that imitates print
        dataStreamSource.addSink(new MySink());

        // 4. execute: submit the job
        env.execute();
    }

    static class MySink extends RichSinkFunction<String> {
        @Override
        public void invoke(String value, Context context) throws Exception {
            // Get the subtask index; add 1 to imitate the numbering that print shows
            int partitionId = getRuntimeContext().getIndexOfThisSubtask() + 1;
            String msg = partitionId + "> " + value;
            System.out.println(msg);
        }
    }
}
package com.bigdata.day03;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class Demo01 {

    // A custom print Sink with an optional identifier prefix
    static class MyPrint extends RichSinkFunction<String> {
        private String msg;

        public MyPrint() {
        }

        public MyPrint(String msg) {
            this.msg = msg;
        }

        @Override
        public void invoke(String value, Context context) throws Exception {
            // 0-based subtask index (SinkPrintDemo above adds 1 to imitate print)
            int partition = getRuntimeContext().getIndexOfThisSubtask();
            if (msg == null) {
                System.out.println(partition + "> " + value);
            } else {
                System.out.println(msg + ">>>:" + partition + "> " + value);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // 1. env: prepare the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);

        // 2. source: load the data
        DataStream<String> data = env.fromElements("hello", "world", "baotianman", "laoyan");