This post continues the series by walking through how to install and configure Flume and Sqoop in the virtual machine.
Ⅰ. Related components (with download link)
The Hadoop ecosystem includes many other components, such as Spark, HBase, and Hive. For reasons of length they are not covered here; the download link is given below, and their installation guides will appear in separate posts. (This post uses the Flume and Sqoop packages from the link; upload them to the root user's home directory, /root.)
Cluster component download link
Password: zccy
Ⅱ. Installing Flume
1. Extract the installation package
Extract the uploaded Flume package and rename the resulting directory (run the commands from /root, where the package was placed):
tar -xvf /root/apache-flume-1.8.0-bin.tar.gz
mv apache-flume-1.8.0-bin ./flume
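As a quick optional check that the package was extracted and renamed correctly (the exact listing can vary slightly between Flume releases), list the new directory:
ls /root/flume
The output should include bin, conf, and lib, among other entries.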
2. Set environment variables
vi .bash_profile
Add the following lines:
export FLUME_HOME=/root/flume
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
Reload the profile so the changes take effect:
source .bash_profile
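With the new PATH in effect, the flume-ng launcher should be available from any directory; printing the version is a quick sanity check (the exact output depends on your Flume and Java versions):
flume-ng version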
3. Configure flume-env.sh
Copy the template file:
cp /root/flume/conf/flume-env.sh.template /root/flume/conf/flume-env.sh
vi /root/flume/conf/flume-env.sh
Add the following lines (set JAVA_HOME to your own JDK path):
JAVA_HOME=/root/jdk/jdk1.8.0_144/
JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote"
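As an optional check that JAVA_HOME points at a usable JDK (the path below matches this tutorial's setup; substitute your own), run:
/root/jdk/jdk1.8.0_144/bin/java -version
For reference, the JAVA_OPTS line gives the agent a 100 MB initial and 200 MB maximum heap and enables JMX remote monitoring; increase the heap sizes if your agent will carry heavier traffic.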
4. Modify the flume-conf configuration file
Copy the template file:
cp /root/flume/conf/flume-conf.properties.template /root/flume/conf/flume-conf.properties
vi /root/flume/conf/flume-conf.properties
Edit it as follows. This defines a single agent named a1 with a netcat source r1 listening on localhost:44444, a channel c1, and a logger sink k1:
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'a1'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# For each one of the sources, the type is defined
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Bind the source to the channel it will write to.
a1.sources.r1.channels = c1
# Each sink's type must be defined
a1.sinks.k1.type = logger
#Specify the channel the sink should use