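First, switch to the hadoop user.
Download Mahout from the official website.
Then copy the downloaded archive to the hadoop user's home directory (/home/hadoop).
Run the following commands: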
cd /home/hadoop/
sudo su
cp /home/hadoop/apache-mahout-distribution-0.13.0.tar.gz /opt
cd /opt
tar -zxvf apache-mahout-distribution-0.13.0.tar.gz
sudo mv apache-mahout-distribution-0.13.0 mahout
sudo gedit /etc/profile
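The unpack-and-rename steps above can also be wrapped in a small function so they are repeatable. This is only a sketch: the name install_mahout, its arguments, and the default directory name "mahout" are my own illustrative choices, not something that ships with Mahout.

```shell
# Sketch: unpack a Mahout tarball into a parent directory and give the
# extracted folder a short, version-free name.
# install_mahout and its arguments are hypothetical, for illustration only.
install_mahout() {
    local tarball="$1"          # path to the downloaded .tar.gz
    local dest="$2"             # parent directory, e.g. /opt
    local name="${3:-mahout}"   # final directory name under dest
    local top

    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    if [ -e "$dest/$name" ]; then
        echo "refusing to overwrite existing $dest/$name" >&2
        return 1
    fi

    # The first entry in the archive listing names the top-level directory.
    top="$(tar -tzf "$tarball" | head -n 1 | sed 's|^\./||' | cut -d/ -f1)"
    if [ -z "$top" ]; then
        echo "could not determine top-level directory in $tarball" >&2
        return 1
    fi

    tar -xzf "$tarball" -C "$dest" || return 1

    # Rename only when the extracted name differs from the wanted one.
    if [ "$top" != "$name" ]; then
        mv "$dest/$top" "$dest/$name"
    fi
}
```

For the layout used in this post the call would be install_mahout /home/hadoop/apache-mahout-distribution-0.13.0.tar.gz /opt, run as a user that can write to /opt.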
Append the following to the end of /etc/profile:
# set mahout environment
export MAHOUT_HOME=/opt/mahout
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
export PATH=$MAHOUT_HOME/conf:$MAHOUT_HOME/bin:$PATH
# set hadoop environment
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# sbin holds start-dfs.sh and start-yarn.sh
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_HOME_WARN_SUPPRESS=not_null
Then run source /etc/profile to apply the changes.
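Before moving on, it can help to confirm that the exported variables actually took effect. The following is a minimal sketch of such a check; check_mahout_env is a hypothetical helper name of my own, and it only looks at variables that appear in the profile snippet above.

```shell
# Sketch: verify that the variables exported in /etc/profile are set,
# that the directories they name exist, and that mahout is on PATH.
# check_mahout_env is a hypothetical helper, not part of Mahout or Hadoop.
check_mahout_env() {
    local status=0
    local var value

    for var in MAHOUT_HOME MAHOUT_CONF_DIR HADOOP_HOME; do
        # Look up the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$var points to a missing directory: $value"
            status=1
        fi
    done

    # The mahout launcher script must be reachable through PATH.
    if ! command -v mahout >/dev/null 2>&1; then
        echo "mahout is not on PATH"
        status=1
    fi

    return "$status"
}
```

Calling check_mahout_env in a fresh terminal prints one line per problem and returns a non-zero status if anything is missing, so it can be chained as check_mahout_env && mahout.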
Testing the installation
Open a new terminal.
Start the Hadoop services:
start-dfs.sh && start-yarn.sh
You will be prompted for your password several times.
Next, run:
jps
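The jps listing is easy to misread by eye, so here is a rough sketch that reads jps output and prints any expected daemon that is missing. The function name missing_daemons is hypothetical, and the daemon list is an assumption for a single-node setup started with start-dfs.sh and start-yarn.sh; adjust it for your own cluster.

```shell
# Sketch: read `jps` output on stdin and print every expected Hadoop
# daemon that does not appear in it.
# missing_daemons is a hypothetical helper; the daemon list assumes a
# single-node setup started with start-dfs.sh and start-yarn.sh.
missing_daemons() {
    awk '
        # jps prints "<pid> <class name>"; remember each class name.
        { running[$2] = 1 }
        END {
            n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
            # Exact name comparison, so SecondaryNameNode is not
            # mistaken for NameNode.
            for (i = 1; i <= n; i++)
                if (!(want[i] in running))
                    print want[i]
        }
    '
}
```

Run it as jps | missing_daemons. Empty output means all five daemons are up; otherwise each missing daemon is printed on its own line.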
Open a new terminal and run:
mahout
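You should see a long listing like the one below (the paths in it depend on where Hadoop and Mahout are installed). If the command does not work, reboot and try again.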
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.13.0-job.jar
An example program must be given as the first argument.
Valid program names are:
arff.vector: : Generate Vectors from an ARFF file or directory
baumwelch: : Baum-Welch algorithm for unsupervised HMM training
canopy: : Canopy clustering
cat: : Print a file or resource as the logistic regression models would see it
cleansvd: : Cleanup and verification of SVD output
clusterdump: : Dump cluster output to text
clusterpp: : Groups Clustering Output In Clusters
cmdump: : Dump confusion matrix in HTML or text formats
cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
describe: : Describe the fields and target variable in a data set
evaluateFactorization: : compute RMSE and MAE of a rating matrix factorization against probes
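If you see this output, congratulations: Mahout is installed successfully.
I have not managed to get the steps below to work, so treat them as unverified.
Download the Mahout test data: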
wget http://archive.ics.uci.edu/ml/databases/synthetic_control
Upload the test data to HDFS:
hadoop fs -put /home/alex/pcshare/XData/synthetic_control.data /user/root/testdata
Run the k-means clustering example that ships with Mahout:
mahout -core org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
View the clustering results:
hadoop fs -ls /user/root/output