Map端的shuffle过程: 1. Map任务将输出的键值对按照键进行分区,每个分区对应一个reduce任务。 2. 对于每个分区,Map任务会将其输出的键值对按照键进行排序,并将排序后的结果写入磁盘。 3. Map任务会将每个分区的元数据信息(包括分区编号、文件名、起始位置和长度等)发送给JobTracker。 Reduce端的shuffle过程: 1. Reduce任务会向JobTracker请求获取Map任务输出的元数据信息。 2. JobTracker会将Map任务输出的元数据信息发送给Reduce任务。 3. Reduce任务根据元数据信息,从Map任务的输出文件中读取对应分区的数据,并将其按照键进行排序。 4. Reduce任务将排序后的结果写入磁盘,作为最终的输出结果。
时间: 2023-06-03 09:06:46 浏览: 112
我理解你的问题是MapReduce计算框架中的shuffle过程是怎样的,是吗?
Map端的shuffle过程包括三个步骤:首先,Map任务将输出的键值对按照键进行分区,每个分区对应一个reduce任务;其次,对于每个分区,Map任务会将其输出的键值对按照键进行排序,并将排序后的结果写入磁盘;最后,Map任务会将每个分区的元数据信息(包括分区编号、文件名、起始位置和长度等)发送给JobTracker。
Reduce端的shuffle过程包括四个步骤:首先,Reduce任务会向JobTracker请求获取Map任务输出的元数据信息;其次,JobTracker会将Map任务输出的元数据信息发送给Reduce任务;然后,Reduce任务根据元数据信息,从Map任务的输出文件中读取对应分区的数据,并将其按照键进行排序;最后,Reduce任务将排序后的结果写入磁盘,作为最终的输出结果。
希望能够解答你的问题,如果还有其他问题,欢迎继续提问。
相关问题
1. 词频统计 对文本文件“world.txt”内容: 第一行hello Spark 第二行hello world 第三行Spark is fast 第四行world is big 编写spark程序,统计每个单词出现次数,用Map和Reduce思路。MapReduce 工作流程主要包括 Map(映射)和Reduce(规约)两个阶段。 (1)当启动一个 MapReduce 任务时,Map 端将会读取 HDFS(Hadoop Distributed File System,Hadoop 分布式文件系统)上的数据,将数据映射成所需要的键值对并输出至 Reduce 端。 (2)Reduce 端接收 Map 端键值对类型的中间数据,并根据不同键进行分组,对每一组键相同的数据进行处理,得到新的键值对并输出至 HDFS。 第一行用from pyspark.sql import SparkSession导入,越简短越好
<think>我们使用Spark来实现词频统计(WordCount),这是MapReduce的经典案例。在Spark中,我们可以使用RDD(弹性分布式数据集)的转换操作来实现Map和Reduce步骤。步骤:1.创建SparkSession2.读取文本文件,创建RDD3.Map阶段:将每行文本拆分成单词,并将每个单词映射为(word,1)4.Reduce阶段:按单词分组并求和5.输出结果6.停止SparkSession注意:在Spark中,我们通常使用`flatMap`进行拆分,然后使用`map`将每个单词转换为元组,接着使用`reduceByKey`进行聚合。参考引用[2]中的步骤,但使用PySpark实现。代码如下:</think>以下是一个使用 Spark MapReduce 思路实现的词频统计程序,基于 PySpark:
```python
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("WordCount").getOrCreate()
# 读取文本文件创建RDD
text_rdd = spark.sparkContext.textFile("world.txt")
# MapReduce步骤
word_counts = text_rdd.flatMap(lambda line: line.split(" ")) \ # Map: 拆分单词
.map(lambda word: (word, 1)) \ # Map: 生成键值对
.reduceByKey(lambda a, b: a + b) # Reduce: 聚合计数
# 输出结果
word_counts.collect()
spark.stop()
```
### 程序说明:
1. **Map阶段**:
- `flatMap`:将每行文本拆分为单词流($word_1, word_2, ..., word_n$)
- `map`:为每个单词生成键值对($(word, 1)$)
2. **Reduce阶段**:
- `reduceByKey`:对相同单词的计数值求和($sum = \sum_{i=1}^{n} 1$)
3. **分布式处理**:
- 自动将数据分区到集群节点并行处理
- 类似 Hadoop MapReduce 但内存计算更快[^2][^3]
### 执行示例:
输入文件 `world.txt` 内容:
```
hello world
hello spark
```
输出结果:
```
[('hello', 2), ('world', 1), ('spark', 1)]
```
---
### 相关问题:
1. Spark 的 `flatMap` 和 `map` 操作有何本质区别?
2. 如何优化上述程序以忽略单词大小写和标点符号?
3. Spark 的 `reduceByKey` 与 Hadoop MapReduce 的 Shuffle 阶段有何异同?[^3]
4. 当处理超大规模文本时,如何避免 `collect()` 导致的内存溢出?
5. Spark 词频统计相比传统 MapReduce 有哪些性能优势?[^2]
2025-07-02 11:15:25,551 INFO - task run command: sudo -u hadoop -E bash /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672/32581_56672.command 2025-07-02 11:15:25,552 INFO - process start, process id is: 1190 2025-07-02 11:15:26,553 INFO - -> /usr/lib/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh: line 23: export: `zookeeper.quorum=': not a valid identifier /usr/lib/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh: line 23: export: `dominos-usdp-fun01:2181,dominos-usdp-fun02:2181,dominos-usdp-fun03:2181': not a valid identifier 2025-07-02 11:15:31,554 INFO - -> SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory] 2025-07-02 11:15:31,226 INFO [main] conf.HiveConf (HiveConf.java:findConfigFile(187)) - Found configuration file file:/etc/hive/conf/hive-site.xml 2025-07-02 11:15:32,554 INFO - -> 2025-07-02 11:15:32,428 main ERROR Cannot access RandomAccessFile java.io.FileNotFoundException: /data/log/hive/hive.log (Permission denied) java.io.FileNotFoundException: /data/log/hive/hive.log (Permission denied) at java.io.RandomAccessFile.open0(Native Method) at java.io.RandomAccessFile.open(RandomAccessFile.java:316) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory.createManager(RollingRandomAccessFileManager.java:232) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory.createManager(RollingRandomAccessFileManager.java:204) at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:114) at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:100) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.getRollingRandomAccessFileManager(RollingRandomAccessFileManager.java:107) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:132) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:53) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:122) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,430 main ERROR Could not create plugin of type class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender for element RollingRandomAccessFile: java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory@5ef6ae06] unable to create manager for [/data/log/hive/hive.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$FactoryData@55dfebeb] java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory@5ef6ae06] unable to create manager for [/data/log/hive/hive.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$FactoryData@55dfebeb] at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:116) at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:100) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.getRollingRandomAccessFileManager(RollingRandomAccessFileManager.java:107) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:132) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:53) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:122) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,431 main ERROR Unable to invoke factory method in class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender for element RollingRandomAccessFile: java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.findFactoryMethod(PluginBuilder.java:236) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:134) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,432 main ERROR Null object returned for RollingRandomAccessFile in Appenders. 2025-07-02 11:15:32,432 main ERROR Unable to locate appender "DRFA" for logger config "root" Hive Session ID = 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:32,528 INFO [main] SessionState (SessionState.java:printInfo(1227)) - Hive Session ID = 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,555 INFO - -> Logging initialized using configuration in file:/etc/hive/conf/hive-log4j2.properties Async: true 2025-07-02 11:15:32,577 INFO [main] SessionState (SessionState.java:printInfo(1227)) - Logging initialized using configuration in file:/etc/hive/conf/hive-log4j2.properties Async: true 2025-07-02 11:15:34,556 INFO - -> 2025-07-02 11:15:33,630 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,652 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created local directory: /tmp/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,659 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941/_tmp_space.db 2025-07-02 11:15:33,691 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(277)) - User of session id 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 is hadoop 2025-07-02 11:15:33,714 INFO [main] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hadoop-mr-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:34,365 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426134293 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:34,384 INFO [main] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hive-sync-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:35,557 INFO - -> 2025-07-02 11:15:34,776 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426134737 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:34,851 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(288)) - Created new resources: null 2025-07-02 11:15:34,854 INFO [main] tez.DagUtils (DagUtils.java:getHiveJarDirectory(1058)) - Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hadoop/.hiveJars 2025-07-02 11:15:35,179 INFO [main] tez.TezSessionState (TezSessionState.java:getSha(854)) - Computed sha: 3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f for file: file:/usr/lib/hive/lib/hive-exec-3.1.3.jar of length: 40.36MB in 321 ms 2025-07-02 11:15:35,191 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1715146950410 for hdfs://dominos-usdp-v3-fun/user/hadoop/.hiveJars/hive-exec-3.1.3-3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f.jar 2025-07-02 11:15:35,240 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.mb, mr initial value=100, tez(original):tez.runtime.io.sort.mb=null, tez(final):tez.runtime.io.sort.mb=100 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.read.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.read.timeout=null, tez(final):tez.runtime.shuffle.read.timeout=180000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.minimum-allowed-tasks, mr initial value=10, tez(original):tez.am.minimum.allowed.speculative.tasks=null, tez(final):tez.am.minimum.allowed.speculative.tasks=10 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead.bytes, mr initial value=4194304, tez(original):tez.runtime.ifile.readahead.bytes=null, tez(final):tez.runtime.ifile.readahead.bytes=4194304 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.shuffle.ssl.enabled, mr initial value=false, tez(original):tez.runtime.shuffle.ssl.enable=null, tez(final):tez.runtime.shuffle.ssl.enable=false 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.sort.spill.percent, mr initial value=0.80, tez(original):tez.runtime.sort.spill.percent=null, tez(final):tez.runtime.sort.spill.percent=0.80 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-running-tasks, mr initial value=0.1, tez(original):tez.am.proportion.running.tasks.speculatable=null, tez(final):tez.am.proportion.running.tasks.speculatable=0.1 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-total-tasks, mr initial value=0.01, tez(original):tez.am.proportion.total.tasks.speculatable=null, tez(final):tez.am.proportion.total.tasks.speculatable=0.01 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead, mr initial value=true, tez(original):tez.runtime.ifile.readahead=null, tez(final):tez.runtime.ifile.readahead=true 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.merge.percent, mr initial value=0.66, tez(original):tez.runtime.shuffle.merge.percent=null, tez(final):tez.runtime.shuffle.merge.percent=0.66 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.parallelcopies, mr initial value=50, tez(original):tez.runtime.shuffle.parallel.copies=null, tez(final):tez.runtime.shuffle.parallel.copies=50 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-speculate, mr initial value=15000, tez(original):tez.am.soonest.retry.after.speculate=null, tez(final):tez.am.soonest.retry.after.speculate=15000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.reduce.slowstart.completedmaps, mr initial value=0.95, tez(original):tez.shuffle-vertex-manager.min-src-fraction=null, tez(final):tez.shuffle-vertex-manager.min-src-fraction=0.95 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.memory.limit.percent, mr initial value=0.25, tez(original):tez.runtime.shuffle.memory.limit.percent=null, tez(final):tez.runtime.shuffle.memory.limit.percent=0.25 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-no-speculate, mr initial value=1000, tez(original):tez.am.soonest.retry.after.no.speculate=null, tez(final):tez.am.soonest.retry.after.no.speculate=1000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.factor, mr initial value=100, tez(original):tez.runtime.io.sort.factor=null, tez(final):tez.runtime.io.sort.factor=100 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress, mr initial value=false, tez(original):tez.runtime.compress=null, tez(final):tez.runtime.compress=false 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.connect.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.connect.timeout=null, tez(final):tez.runtime.shuffle.connect.timeout=180000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.input.buffer.percent, mr initial value=0.0, tez(original):tez.runtime.task.input.post-merge.buffer.percent=null, tez(final):tez.runtime.task.input.post-merge.buffer.percent=0.0 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress.codec, mr initial value=org.apache.hadoop.io.compress.DefaultCodec, tez(original):tez.runtime.compress.codec=null, tez(final):tez.runtime.compress.codec=org.apache.hadoop.io.compress.DefaultCodec 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.merge.progress.records, mr initial value=10000, tez(original):tez.runtime.merge.progress.records=null, tez(final):tez.runtime.merge.progress.records=10000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):map.sort.class, mr initial value=org.apache.hadoop.util.QuickSort, tez(original):tez.runtime.internal.sorter.class=null, tez(final):tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.input.buffer.percent, mr initial value=0.70, tez(original):tez.runtime.shuffle.fetch.buffer.percent=null, tez(final):tez.runtime.shuffle.fetch.buffer.percent=0.70 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.counters.max, mr initial value=120, tez(original):tez.counters.max=null, tez(final):tez.counters.max=120 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.hdfs-servers, mr initial value=hdfs://dominos-usdp-v3-fun, tez(original):tez.job.fs-servers=null, tez(final):tez.job.fs-servers=hdfs://dominos-usdp-v3-fun 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.queuename, mr initial value=default, tez(original):tez.queue.name=default, tez(final):tez.queue.name=default 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.maxtaskfailures.per.tracker, mr initial value=3, tez(original):tez.am.maxtaskfailures.per.node=null, tez(final):tez.am.maxtaskfailures.per.node=3 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.timeout, mr initial value=600000, tez(original):tez.task.timeout-ms=null, tez(final):tez.task.timeout-ms=600000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):yarn.app.mapreduce.am.job.task.listener.thread-count, mr initial value=30, tez(original):tez.am.task.listener.thread-count=null, tez(final):tez.am.task.listener.thread-count=30 2025-07-02 11:15:35,261 INFO [main] sqlstd.SQLStdHiveAccessController (SQLStdHiveAccessController.java:<init>(96)) - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=63fc22ae-87a3-4d13-b59e-6ea5a99a9941, clientType=HIVECLI] 2025-07-02 11:15:35,263 WARN [main] session.SessionState (SessionState.java:setAuthorizerV2Config(950)) - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 2025-07-02 11:15:35,328 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun01:9083 2025-07-02 11:15:35,350 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:35,358 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:35,358 INFO [main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:36,558 INFO - -> 2025-07-02 11:15:35,864 INFO [main] counters.Limits (Limits.java:init(61)) - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=1200 2025-07-02 11:15:35,864 INFO [main] counters.Limits (Limits.java:init(61)) - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=120 2025-07-02 11:15:35,864 INFO [main] client.TezClient (TezClient.java:<init>(210)) - Tez Client Version: [ component=tez-api, version=0.10.2, revision=22f46fe39a7cf99b24275304e99867b9135caba2, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2023-02-08T02:24:56Z, buildUser=jenkins, buildJavaVersion=1.8.0_362 ] 2025-07-02 11:15:35,864 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(363)) - Opening new Tez Session (id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941, scratch dir: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941) 2025-07-02 11:15:35,884 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:35,884 INFO [main] session.SessionState (SessionState.java:updateThreadName(441)) - Updating thread name to 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main 2025-07-02 11:15:35,954 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:isCompatibleWith(346)) - Mestastore configuration metastore.filter.hook changed from org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook to org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl 2025-07-02 11:15:35,958 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 0 2025-07-02 11:15:36,152 INFO [Tez session start thread] impl.TimelineReaderClientImpl (TimelineReaderClientImpl.java:serviceInit(97)) - Initialized TimelineReader URI=http://dc3-dominos-usdp-fun02:8198/ws/v2/timeline/, clusterId=dominos-usdp-v3-fun 2025-07-02 11:15:36,342 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun01:9083 2025-07-02 11:15:36,344 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:36,345 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:36,345 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 Hive Session ID = 4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,366 INFO [pool-7-thread-1] SessionState (SessionState.java:printInfo(1227)) - Hive Session ID = 4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,385 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:36,386 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,412 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created local directory: /tmp/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,420 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3/_tmp_space.db 2025-07-02 11:15:36,420 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(277)) - User of session id 4caadf81-0f27-469e-8de0-87e177d910e3 is hadoop 2025-07-02 11:15:36,441 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hadoop-mr-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:36,455 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(554)) - Compiling command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail DROP IF EXISTS PARTITION(DT='') 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:37,561 INFO - -> 2025-07-02 11:15:36,669 INFO [Tez session start thread] client.AHSProxy (AHSProxy.java:createAHSProxy(43)) - Connecting to Application History server at dc3-dominos-usdp-fun01/10.30.10.60:10200 2025-07-02 11:15:36,686 INFO [Tez session start thread] client.TezClient (TezClient.java:start(388)) - Session mode. Starting session. 2025-07-02 11:15:36,727 INFO [Tez session start thread] client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm-dc3-dominos-usdp-fun01 2025-07-02 11:15:36,809 INFO [Tez session start thread] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(180)) - Using tez.lib.uris value from configuration: hdfs:////dominos-usdp-v3-fun/tez/tez.tar.gz 2025-07-02 11:15:36,809 INFO [Tez session start thread] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(182)) - Using tez.lib.uris.classpath value from configuration: null 2025-07-02 11:15:36,880 INFO [Tez session start thread] client.TezClient (TezCommonUtils.java:createTezSystemStagingPath(123)) - Tez system stage directory hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941/.tez/application_1740624029612_5078 doesn't exist and is created 2025-07-02 11:15:36,913 INFO [Tez session start thread] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2845)) - resource-types.xml not found 2025-07-02 11:15:36,914 INFO [Tez session start thread] resource.ResourceUtils (ResourceUtils.java:addResourcesFileToConf(476)) - Unable to find 'resource-types.xml'. 2025-07-02 11:15:36,948 INFO [Tez session start thread] Configuration.deprecation (Configuration.java:logDeprecation(1441)) - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled 2025-07-02 11:15:37,000 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426136955 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:37,005 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hive-sync-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:38,562 INFO - -> 2025-07-02 11:15:37,601 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:isCompatibleWith(346)) - Mestastore configuration metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook 2025-07-02 11:15:37,602 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 0 2025-07-02 11:15:37,603 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:37,610 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun02:9083 2025-07-02 11:15:37,615 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:37,620 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:37,621 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:37,676 INFO [Tez session start thread] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(338)) - Submitted application application_1740624029612_5078 2025-07-02 11:15:37,685 INFO [Tez session start thread] client.TezClient (TezClient.java:start(404)) - The url to track the Tez Session: http://dc3-dominos-usdp-fun01:8088/proxy/application_1740624029612_5078/ 2025-07-02 11:15:38,143 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(666)) - Semantic Analysis Completed (retrial = false) 2025-07-02 11:15:38,145 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:getSchema(374)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null) 2025-07-02 11:15:38,149 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(781)) - Completed compiling command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503); Time taken: 1.723 seconds 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReExecDriver (ReExecDriver.java:run(156)) - Execution #1 of query 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2255)) - Executing command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail DROP IF EXISTS PARTITION(DT='') 2025-07-02 11:15:38,153 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:setupAtsExecutor(115)) - Creating ATS executor queue with capacity 64 2025-07-02 11:15:38,177 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(130)) - Timeline service address: dc3-dominos-usdp-fun01:8188 2025-07-02 11:15:38,295 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:launchTask(2662)) - Starting task [Stage-0:DDL] in serial mode 2025-07-02 11:15:38,414 INFO [ATS Logger 0] hooks.ATSHook (ATSHook.java:createTimelineDomain(155)) - ATS domain created:hive_63fc22ae-87a3-4d13-b59e-6ea5a99a9941(hadoop,hadoop) 2025-07-02 11:15:38,528 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2531)) - Completed executing command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503); Time taken: 0.378 seconds OK 2025-07-02 11:15:38,528 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (SessionState.java:printInfo(1227)) - OK 2025-07-02 11:15:38,529 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager Time taken: 2.104 seconds 2025-07-02 11:15:38,529 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] CliDriver (SessionState.java:printInfo(1227)) - Time taken: 2.104 seconds 2025-07-02 11:15:38,530 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,530 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main 2025-07-02 11:15:38,530 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,530 INFO [main] session.SessionState (SessionState.java:updateThreadName(441)) - Updating thread name to 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main 2025-07-02 11:15:38,533 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(554)) - Compiling command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail ADD IF NOT EXISTS PARTITION(DT='') 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,559 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:39,286 INFO - process has exited. execute path:/tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672, processId:1190 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1 2025-07-02 11:15:39,287 INFO - Send task execute result to master, the current task status: TaskExecutionStatus{code=6, desc='failure'} 2025-07-02 11:15:39,287 INFO - Remove the current task execute context from worker cache 2025-07-02 11:15:39,287 INFO - The current execute mode isn't develop mode, will clear the task execute file: /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672 2025-07-02 11:15:39,288 INFO - Success clear the task execute file: /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672 2025-07-02 11:15:39,562 INFO - -> 2025-07-02 11:15:38,630 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426138580 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:38,630 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(288)) - Created new resources: null 2025-07-02 11:15:38,644 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:getHiveJarDirectory(1058)) - Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hadoop/.hiveJars 2025-07-02 11:15:38,666 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1715146950410 for hdfs://dominos-usdp-v3-fun/user/hadoop/.hiveJars/hive-exec-3.1.3-3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f.jar 2025-07-02 11:15:38,697 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(666)) - Semantic Analysis Completed (retrial = false) 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:getSchema(374)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null) 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(781)) - Completed compiling command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c); Time taken: 0.165 seconds 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReExecDriver (ReExecDriver.java:run(156)) - Execution #1 of query 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2255)) - Executing command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail ADD IF NOT EXISTS PARTITION(DT='') 2025-07-02 11:15:38,700 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:launchTask(2662)) - Starting task [Stage-0:DDL] in serial mode 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.mb, mr initial value=100, tez(original):tez.runtime.io.sort.mb=null, tez(final):tez.runtime.io.sort.mb=100 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.read.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.read.timeout=null, tez(final):tez.runtime.shuffle.read.timeout=180000 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.minimum-allowed-tasks, mr initial value=10, tez(original):tez.am.minimum.allowed.speculative.tasks=null, tez(final):tez.am.minimum.allowed.speculative.tasks=10 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead.bytes, mr initial value=4194304, tez(original):tez.runtime.ifile.readahead.bytes=null, tez(final):tez.runtime.ifile.readahead.bytes=4194304 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.shuffle.ssl.enabled, mr initial value=false, tez(original):tez.runtime.shuffle.ssl.enable=null, tez(final):tez.runtime.shuffle.ssl.enable=false 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.sort.spill.percent, mr initial value=0.80, tez(original):tez.runtime.sort.spill.percent=null, tez(final):tez.runtime.sort.spill.percent=0.80 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-running-tasks, mr initial value=0.1, tez(original):tez.am.proportion.running.tasks.speculatable=null, tez(final):tez.am.proportion.running.tasks.speculatable=0.1 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-total-tasks, mr initial value=0.01, tez(original):tez.am.proportion.total.tasks.speculatable=null, tez(final):tez.am.proportion.total.tasks.speculatable=0.01 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead, mr initial value=true, tez(original):tez.runtime.ifile.readahead=null, tez(final):tez.runtime.ifile.readahead=true 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.merge.percent, mr initial value=0.66, tez(original):tez.runtime.shuffle.merge.percent=null, tez(final):tez.runtime.shuffle.merge.percent=0.66 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.parallelcopies, mr initial value=50, tez(original):tez.runtime.shuffle.parallel.copies=null, tez(final):tez.runtime.shuffle.parallel.copies=50 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-speculate, mr initial value=15000, tez(original):tez.am.soonest.retry.after.speculate=null, tez(final):tez.am.soonest.retry.after.speculate=15000 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.reduce.slowstart.completedmaps, mr initial value=0.95, tez(original):tez.shuffle-vertex-manager.min-src-fraction=null, tez(final):tez.shuffle-vertex-manager.min-src-fraction=0.95 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.memory.limit.percent, mr initial value=0.25, tez(original):tez.runtime.shuffle.memory.limit.percent=null, tez(final):tez.runtime.shuffle.memory.limit.percent=0.25 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-no-speculate, mr initial value=1000, tez(original):tez.am.soonest.retry.after.no.speculate=null, tez(final):tez.am.soonest.retry.after.no.speculate=1000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.factor, mr initial value=100, tez(original):tez.runtime.io.sort.factor=null, tez(final):tez.runtime.io.sort.factor=100 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress, mr initial value=false, tez(original):tez.runtime.compress=null, tez(final):tez.runtime.compress=false 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.connect.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.connect.timeout=null, tez(final):tez.runtime.shuffle.connect.timeout=180000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.input.buffer.percent, mr initial value=0.0, tez(original):tez.runtime.task.input.post-merge.buffer.percent=null, tez(final):tez.runtime.task.input.post-merge.buffer.percent=0.0 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress.codec, mr initial value=org.apache.hadoop.io.compress.DefaultCodec, tez(original):tez.runtime.compress.codec=null, tez(final):tez.runtime.compress.codec=org.apache.hadoop.io.compress.DefaultCodec 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.merge.progress.records, mr initial value=10000, tez(original):tez.runtime.merge.progress.records=null, tez(final):tez.runtime.merge.progress.records=10000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):map.sort.class, mr initial value=org.apache.hadoop.util.QuickSort, tez(original):tez.runtime.internal.sorter.class=null, tez(final):tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.input.buffer.percent, mr initial value=0.70, tez(original):tez.runtime.shuffle.fetch.buffer.percent=null, tez(final):tez.runtime.shuffle.fetch.buffer.percent=0.70 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.counters.max, mr initial value=120, tez(original):tez.counters.max=null, tez(final):tez.counters.max=120 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.hdfs-servers, mr initial value=hdfs://dominos-usdp-v3-fun, tez(original):tez.job.fs-servers=null, tez(final):tez.job.fs-servers=hdfs://dominos-usdp-v3-fun 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.queuename, mr initial value=default, tez(original):tez.queue.name=default, tez(final):tez.queue.name=default 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.maxtaskfailures.per.tracker, mr initial value=3, tez(original):tez.am.maxtaskfailures.per.node=null, tez(final):tez.am.maxtaskfailures.per.node=3 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.timeout, mr initial value=600000, tez(original):tez.task.timeout-ms=null, tez(final):tez.task.timeout-ms=600000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):yarn.app.mapreduce.am.job.task.listener.thread-count, mr initial value=30, tez(original):tez.am.task.listener.thread-count=null, tez(final):tez.am.task.listener.thread-count=30 2025-07-02 11:15:38,731 INFO [pool-7-thread-1] sqlstd.SQLStdHiveAccessController (SQLStdHiveAccessController.java:<init>(96)) - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=4caadf81-0f27-469e-8de0-87e177d910e3, clientType=HIVECLI] 2025-07-02 11:15:38,731 WARN [pool-7-thread-1] session.SessionState (SessionState.java:setAuthorizerV2Config(950)) - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 2025-07-02 11:15:38,735 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun02:9083 2025-07-02 11:15:38,739 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 2 2025-07-02 11:15:38,745 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:38,745 INFO [pool-7-thread-1] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:38,750 INFO [pool-7-thread-1] client.TezClient (TezClient.java:<init>(210)) - Tez Client Version: [ component=tez-api, version=0.10.2, revision=22f46fe39a7cf99b24275304e99867b9135caba2, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2023-02-08T02:24:56Z, buildUser=jenkins, buildJavaVersion=1.8.0_362 ] 2025-07-02 11:15:38,750 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(363)) - Opening new Tez Session (id: 4caadf81-0f27-469e-8de0-87e177d910e3, scratch dir: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3) 2025-07-02 11:15:38,773 INFO [pool-7-thread-1] impl.TimelineReaderClientImpl (TimelineReaderClientImpl.java:serviceInit(97)) - Initialized TimelineReader URI=http://dc3-dominos-usdp-fun02:8198/ws/v2/timeline/, clusterId=dominos-usdp-v3-fun 2025-07-02 11:15:38,779 ERROR [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] exec.DDLTask (DDLTask.java:failed(927)) - Failed org.apache.hadoop.hive.ql.metadata.HiveException: partition spec is invalid; field dt does not exist or is empty at org.apache.hadoop.hive.ql.metadata.Partition.createMetaPartitionObject(Partition.java:129) at org.apache.hadoop.hive.ql.metadata.Hive.convertAddSpecToMetaPartition(Hive.java:2525) at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2466) at org.apache.hadoop.hive.ql.exec.DDLTask.addPartitions(DDLTask.java:1320) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:466) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:38,790 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. partition spec is invalid; field dt does not exist or is empty 2025-07-02 11:15:38,791 ERROR [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (SessionState.java:printError(1250)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. partition spec is invalid; field dt does not exist or is empty 2025-07-02 11:15:38,792 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2531)) - Completed executing command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c); Time taken: 0.094 seconds 2025-07-02 11:15:38,792 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,793 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,793 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main 2025-07-02 11:15:38,793 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,799 INFO [main] tez.TezSessionPoolManager (TezSessionPoolManager.java:closeIfNotDefault(351)) - Closing tez session if not default: sessionId=63fc22ae-87a3-4d13-b59e-6ea5a99a9941, queueName=null, user=hadoop, doAs=false, isOpen=false, isDefault=false 2025-07-02 11:15:38,800 INFO [Tez session start thread] client.TezClient (TezClient.java:stop(731)) - Shutting down Tez Session, sessionName=HIVE-63fc22ae-87a3-4d13-b59e-6ea5a99a9941, applicationId=application_1740624029612_5078 2025-07-02 11:15:38,815 INFO [Tez session start thread] client.TezClient (TezClient.java:stop(777)) - Could not connect to AM, killing session via YARN, sessionName=HIVE-63fc22ae-87a3-4d13-b59e-6ea5a99a9941, applicationId=application_1740624029612_5078 2025-07-02 11:15:38,824 INFO [main] tez.TezSessionState (TezSessionState.java:cleanupDagResources(721)) - Attemting to clean up resources for 63fc22ae-87a3-4d13-b59e-6ea5a99a9941: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources; 0 additional files, 2 localized resources 2025-07-02 11:15:38,839 INFO [pool-7-thread-1] client.AHSProxy (AHSProxy.java:createAHSProxy(43)) - Connecting to Application History server at dc3-dominos-usdp-fun01/10.30.10.60:10200 2025-07-02 11:15:38,840 INFO [pool-7-thread-1] client.TezClient (TezClient.java:start(388)) - Session mode. Starting session. 2025-07-02 11:15:38,840 INFO [pool-7-thread-1] client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm-dc3-dominos-usdp-fun01 2025-07-02 11:15:38,851 INFO [pool-7-thread-1] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(180)) - Using tez.lib.uris value from configuration: hdfs:////dominos-usdp-v3-fun/tez/tez.tar.gz 2025-07-02 11:15:38,851 INFO [pool-7-thread-1] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(182)) - Using tez.lib.uris.classpath value from configuration: null 2025-07-02 11:15:38,855 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 on fs with scheme hdfs 2025-07-02 11:15:38,856 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /tmp/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 on fs with scheme file 2025-07-02 11:15:38,857 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 1 2025-07-02 11:15:39,565 INFO - FINALIZE_SESSION
<think>我们正在处理一个Hive表分区操作的问题。用户在执行ALTER TABLE时出现错误:"partition specis invalidfield dtdoes notexist oris empty"。根据错误信息,问题可能出在分区字段'dt'上。可能的原因包括:1.表结构中不存在名为'dt'的分区字段。2.在指定分区时,分区字段名写错了(比如大小写不匹配,或者拼写错误)。3.分区值可能为空或者格式不正确。我们需要检查表结构,确认分区字段是否存在,以及分区指定的语法是否正确。根据引用[1]中的示例,添加分区的正确语法是:alter tabletest_table addif notexists partition(pt_d='20190101');注意,这里的分区字段是'pt_d',而不是'dt'。用户的问题中使用的分区字段是'dt',所以我们需要确认用户表的分区字段名。另外,引用[2]中插入数据时指定分区的方式是:partition(dt='2018-08-15'),这里使用的是'dt'。所以用户可能在操作不同的表,或者表结构不一致。因此,建议用户执行以下步骤:1.检查表结构,确认分区字段名称。可以使用命令:`DESCRIBEFORMATTEDtable_name;`,查看分区字段。2.确保在ALTER TABLE语句中使用的分区字段名称与表定义中的完全一致(包括大小写,因为Hive元数据存储的分区字段名是区分大小写的)。如果用户确认分区字段存在,那么可能是分区值的问题。确保分区值不为空,并且格式正确(例如,日期格式应该是字符串,且与表定义中的格式一致)。另外,错误信息中提到了“field dtdoes notexist”,这通常意味着分区字段不存在。所以重点检查分区字段名称。如果用户确实想添加一个名为'dt'的分区字段,但表结构中并没有这个字段,那么需要先修改表结构添加分区字段。但是,Hive表的分区字段是在创建表时定义的,不能通过ALTER TABLE来添加新的分区字段(只能添加新的分区,即已有分区字段的值)。所以,如果表没有'dt'这个分区字段,那么用户需要重新创建表或者使用已有的分区字段。综上所述,解决方案如下:-确认表的分区字段名称:使用`DESCRIBEFORMATTEDtable_name;`查看。-如果分区字段不是'dt',则使用正确的分区字段名称。-如果表没有分区字段,那么需要重新定义表结构(或者用户可能误操作)。下面是一个示例,假设表的分区字段是'pt_d',那么正确的删除分区和添加分区的语句应该是:ALTER TABLEtable_nameDROPIFEXISTS PARTITION (pt_d='value');ALTERTABLE table_name ADDIF NOTEXISTS PARTITION (pt_d='value');注意:删除分区和添加分区的语句中,分区字段名称必须一致。如果用户确实需要修改分区字段,那么这涉及到表结构的修改,通常需要重建表。但是,Hive不支持直接修改分区字段(即不能添加或删除分区字段,只能修改分区字段的值)。所以,如果用户想要改变分区字段,只能创建新表。根据引用[3],对于分区表,ALTERTABLE修改列信息时,如果使用CASCADE关键字,会修改所有分区的元数据,但不会改变实际数据。但这里的问题不是修改列,而是分区字段本身,所以不适用。因此,针对用户当前的错误,最可能的原因是分区字段名写错了或者不存在。我们生成相关问题以帮助用户进一步排查:
§§相关问题§§1.请确认您的表的分区字段名称是什么?可以使用`DESCRIBE FORMATTED table_name;`查看。
2.您执行的具体ALTER TABLE语句是什么?请提供完整的语句以便分析。3.您是否在创建表时定义了分区字段'dt'?如果没有,您需要重新创建表或使用已有的分区字段。
4.您是否尝试过使用正确的分区字段名称(例如,如果实际分区字段是'pt_d',则使用pt_d)?</think>### Hive分区操作错误解决方案
根据错误信息`partition spec is invalid field dt does not exist or is empty`,该问题通常由以下原因引起:
#### 常见原因及解决方案:
1. **分区字段名称错误**
- 📌 检查表结构中是否存在名为`dt`的分区字段:
```sql
DESCRIBE FORMATTED your_table_name;
```
- 确认实际分区字段名(如`pt_d`),确保ALTER语句使用正确字段名[^1]
- ✅ 正确示例:`ALTER TABLE test_table DROP PARTITION (pt_d='20230101')`
2. **分区值格式错误**
- 确保分区值格式与表定义一致(如字符串需加引号)
- ❌ 错误示例:`... PARTITION (dt=20230101)`
- ✅ 正确示例:`... PARTITION (dt='20230101')`
3. **表未定义分区字段**
- 若表创建时未指定分区字段:
```sql
CREATE TABLE non_partitioned_table (...) -- 缺少PARTITIONED BY子句
```
- 解决方案:重建表或添加分区字段(需迁移数据)
4. **元数据不一致**
- 执行元数据修复:
```sql
MSCK REPAIR TABLE your_table_name;
```
- 或手动刷新元数据:
```sql
ALTER TABLE your_table_name RECOVER PARTITIONS;
```
5. **语法错误**
- 检查特殊字符转义:
```sql
ALTER TABLE test_table DROP PARTITION (dt='date=2023-01-01')
```
- 确保分区值不包含非法字符(如空格、分号等)
#### 操作示例:
```sql
-- 1. 确认分区字段名
DESCRIBE FORMATTED sales_data;
-- Output: Partition Information: pt_d string
-- 2. 正确执行分区操作
ALTER TABLE sales_data DROP PARTITION (pt_d='20230101');
```
阅读全文
相关推荐

















