```
[root@master hadoop]# hadoop jar /opt/program/hadoop-3.1.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar words.txt hdfs:/hadoop:8020/input hdfs://hadoop1:8020/output/wc
Unknown program 'words.txt' chosen. Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
```
Date: 2025-06-22 07:27:25
### How to Run the Hadoop MapReduce WordCount Program Correctly
#### 1. Error Analysis
The error `Unknown program 'words.txt'` means Hadoop interpreted `words.txt` as a program name. The actual cause is **wrong argument order**: the first argument after the jar must be the name of an example program (here `wordcount`), not an input file. The correct syntax is:
```bash
hadoop jar <examples-jar-path> wordcount <input-path> <output-path>
```
**Common mistake**:
```bash
hadoop jar hadoop-mapreduce-examples-3.1.3.jar words.txt /input /output # Wrong: the input file name occupies the program-name slot
```
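For comparison, a corrected form of the same command (the `/input` and `/output` paths here are placeholders, not taken from the original session):

```bash
hadoop jar hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output
```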
---
#### 2. Step-by-Step Instructions
##### (1) Prepare the input file
- **Create a test file locally**:
```bash
mkdir ~/wordcount_test
echo "hello hadoop hello mapreduce" > ~/wordcount_test/words.txt
```
##### (2) Upload the file to HDFS
- **Create the HDFS input directory** (do not pre-create the output directory; Hadoop creates it when the job runs):
```bash
hdfs dfs -mkdir -p /user/$USER/wordcount/input
```
- **Upload the local file to HDFS**:
```bash
hdfs dfs -put ~/wordcount_test/words.txt /user/$USER/wordcount/input/
```
- **Verify the upload**:
```bash
hdfs dfs -ls /user/$USER/wordcount/input # should list words.txt
```
##### (3) Run the WordCount program
- **Run the command** (note the argument order):
```bash
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar \
wordcount \
/user/$USER/wordcount/input \
/user/$USER/wordcount/output/result
```
*Note: the output directory must not already exist; Hadoop creates it automatically.*
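If you need to re-run the job, remove the previous output first; a minimal sketch, assuming the same paths as above:

```bash
# Remove a stale output directory so the job can be resubmitted
# (-f suppresses the error if the path does not exist)
hdfs dfs -rm -r -f /user/$USER/wordcount/output/result
```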
##### (4) Check the results
```bash
hdfs dfs -cat /user/$USER/wordcount/output/result/*
```
Expected output:
```
hadoop 1
hello 2
mapreduce 1
```
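As a quick sanity check on small inputs, the same counts can be reproduced locally with standard Unix tools (an equivalent pipeline, not MapReduce itself):

```shell
# Split on spaces, group duplicates, print "word<TAB>count" like WordCount's output
echo "hello hadoop hello mapreduce" | tr ' ' '\n' | sort | uniq -c | awk '{print $2"\t"$1}'
```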
---
#### 3. Troubleshooting
| Symptom | Cause | Fix |
|------------------------------|-------------------------------|------------------------------|
| `No such file or directory` | Input path missing or insufficient permissions | Verify the HDFS path exists with `hdfs dfs -ls` |
| `Output directory already exists` | Output directory left over from a previous run | Delete it: `hdfs dfs -rm -r /output/path` |
| `ClassNotFoundException` | Wrong jar file path | Use the absolute path: `$HADOOP_HOME/share/.../examples.jar` |
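The checks in the table can be scripted before submitting; a hedged sketch assuming the directory layout used above (`hdfs dfs -test` returns 0 when the path exists):

```bash
# Pre-flight checks before submitting the job; paths follow the layout assumed above
JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar"
IN="/user/$USER/wordcount/input"
OUT="/user/$USER/wordcount/output/result"

[ -f "$JAR" ] || { echo "jar not found: $JAR"; exit 1; }              # guards ClassNotFoundException
hdfs dfs -test -d "$IN" || { echo "input missing: $IN"; exit 1; }     # guards "No such file or directory"
hdfs dfs -test -e "$OUT" && { echo "output exists: $OUT"; exit 1; }   # guards "Output directory already exists"
echo "preconditions OK"
```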
---
#### Related Questions
1. How can I view the logs of a Hadoop MapReduce job?
2. How does WordCount handle multiple input files?
3. What extra configuration is needed to run WordCount on a Hadoop cluster?