Map端的shuffle过程： 1. Map任务将输出的键值对按照键进行分区，每个分区对应一个reduce任务。 2. 对于每个分区，Map任务会将其输出的键值对按照键进行排序，并将排序后的结果写入磁盘。 3. Map任务会将每个分区的元数据信息（包括分区编号、文件名、起始位置和长度等）发送给JobTracker。 Reduce端的shuffle过程： 1. Reduce任务会向JobTracker请求获取Map任务输出的元数据信息。 2. JobTracker会将Map任务输出的元数据信息发送给Reduce任务。 3. Reduce任务根据元数据信息，从Map任务的输出文件中读取对应分区的数据，并将其按照键进行排序。 4. Reduce任务将排序后的结果写入磁盘，作为最终的输出结果。

时间: 2023-06-03 09:06:46 浏览: 112

我理解你的问题是MapReduce计算框架中的shuffle过程是怎样的，是吗？ Map端的shuffle过程包括三个步骤：首先，Map任务将输出的键值对按照键进行分区，每个分区对应一个reduce任务；其次，对于每个分区，Map任务会将其输出的键值对按照键进行排序，并将排序后的结果写入磁盘；最后，Map任务会将每个分区的元数据信息（包括分区编号、文件名、起始位置和长度等）发送给JobTracker。 Reduce端的shuffle过程包括四个步骤：首先，Reduce任务会向JobTracker请求获取Map任务输出的元数据信息；其次，JobTracker会将Map任务输出的元数据信息发送给Reduce任务；然后，Reduce任务根据元数据信息，从Map任务的输出文件中读取对应分区的数据，并将其按照键进行排序；最后，Reduce任务将排序后的结果写入磁盘，作为最终的输出结果。希望能够解答你的问题，如果还有其他问题，欢迎继续提问。

1. 词频统计对文本文件“world.txt”内容: 第一行hello Spark 第二行hello world 第三行Spark is fast 第四行world is big 编写spark程序，统计每个单词出现次数，用Map和Reduce思路。MapReduce 工作流程主要包括 Map（映射）和Reduce（规约）两个阶段。（1）当启动一个 MapReduce 任务时，Map 端将会读取 HDFS（Hadoop Distributed File System，Hadoop 分布式文件系统）上的数据，将数据映射成所需要的键值对并输出至 Reduce 端。（2）Reduce 端接收 Map 端键值对类型的中间数据，并根据不同键进行分组，对每一组键相同的数据进行处理，得到新的键值对并输出至 HDFS。第一行用from pyspark.sql import SparkSession导入，越简短越好

<think>我们使用Spark来实现词频统计（WordCount），这是MapReduce的经典案例。在Spark中，我们可以使用RDD（弹性分布式数据集）的转换操作来实现Map和Reduce步骤。步骤：1.创建SparkSession2.读取文本文件，创建RDD3.Map阶段：将每行文本拆分成单词，并将每个单词映射为(word,1)4.Reduce阶段：按单词分组并求和5.输出结果6.停止SparkSession注意：在Spark中，我们通常使用`flatMap`进行拆分，然后使用`map`将每个单词转换为元组，接着使用`reduceByKey`进行聚合。参考引用[2]中的步骤，但使用PySpark实现。代码如下：</think>以下是一个使用 Spark MapReduce 思路实现的词频统计程序，基于 PySpark： ```python from pyspark.sql import SparkSession spark = SparkSession.builder.appName("WordCount").getOrCreate() # 读取文本文件创建RDD text_rdd = spark.sparkContext.textFile("world.txt") # MapReduce步骤 word_counts = text_rdd.flatMap(lambda line: line.split(" ")) \ # Map: 拆分单词 .map(lambda word: (word, 1)) \ # Map: 生成键值对 .reduceByKey(lambda a, b: a + b) # Reduce: 聚合计数 # 输出结果 word_counts.collect() spark.stop() ``` ### 程序说明： 1. **Map阶段**： - `flatMap`：将每行文本拆分为单词流（$word_1, word_2, ..., word_n$） - `map`：为每个单词生成键值对（$(word, 1)$） 2. **Reduce阶段**： - `reduceByKey`：对相同单词的计数值求和（$sum = \sum_{i=1}^{n} 1$） 3. **分布式处理**： - 自动将数据分区到集群节点并行处理 - 类似 Hadoop MapReduce 但内存计算更快[^2][^3] ### 执行示例：输入文件 `world.txt` 内容： ``` hello world hello spark ``` 输出结果： ``` [('hello', 2), ('world', 1), ('spark', 1)] ``` --- ### 相关问题： 1. Spark 的 `flatMap` 和 `map` 操作有何本质区别？ 2. 如何优化上述程序以忽略单词大小写和标点符号？ 3. Spark 的 `reduceByKey` 与 Hadoop MapReduce 的 Shuffle 阶段有何异同？[^3] 4. 当处理超大规模文本时，如何避免 `collect()` 导致的内存溢出？ 5. Spark 词频统计相比传统 MapReduce 有哪些性能优势？[^2]

2025-07-02 11:15:25,551 INFO - task run command: sudo -u hadoop -E bash /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672/32581_56672.command 2025-07-02 11:15:25,552 INFO - process start, process id is: 1190 2025-07-02 11:15:26,553 INFO - -> /usr/lib/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh: line 23: export: `zookeeper.quorum=': not a valid identifier /usr/lib/dolphinscheduler/worker-server/conf/dolphinscheduler_env.sh: line 23: export: `dominos-usdp-fun01:2181,dominos-usdp-fun02:2181,dominos-usdp-fun03:2181': not a valid identifier 2025-07-02 11:15:31,554 INFO - -> SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory] 2025-07-02 11:15:31,226 INFO [main] conf.HiveConf (HiveConf.java:findConfigFile(187)) - Found configuration file file:/etc/hive/conf/hive-site.xml 2025-07-02 11:15:32,554 INFO - -> 2025-07-02 11:15:32,428 main ERROR Cannot access RandomAccessFile java.io.FileNotFoundException: /data/log/hive/hive.log (Permission denied) java.io.FileNotFoundException: /data/log/hive/hive.log (Permission denied) at java.io.RandomAccessFile.open0(Native Method) at java.io.RandomAccessFile.open(RandomAccessFile.java:316) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:124) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory.createManager(RollingRandomAccessFileManager.java:232) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory.createManager(RollingRandomAccessFileManager.java:204) at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:114) at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:100) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.getRollingRandomAccessFileManager(RollingRandomAccessFileManager.java:107) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:132) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:53) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:122) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,430 main ERROR Could not create plugin of type class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender for element RollingRandomAccessFile: java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory@5ef6ae06] unable to create manager for [/data/log/hive/hive.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$FactoryData@55dfebeb] java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$RollingRandomAccessFileManagerFactory@5ef6ae06] unable to create manager for [/data/log/hive/hive.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager$FactoryData@55dfebeb] at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:116) at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:100) at org.apache.logging.log4j.core.appender.rolling.RollingRandomAccessFileManager.getRollingRandomAccessFileManager(RollingRandomAccessFileManager.java:107) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:132) at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender$Builder.build(RollingRandomAccessFileAppender.java:53) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:122) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,431 main ERROR Unable to invoke factory method in class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender for element RollingRandomAccessFile: java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender java.lang.IllegalStateException: No factory method found for class org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.findFactoryMethod(PluginBuilder.java:236) at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:134) at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:1120) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1045) at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:1037) at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:651) at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:247) at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:293) at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:626) at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:302) at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:87) at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:242) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:159) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:131) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:101) at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:210) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:173) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:106) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:98) at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:32,432 main ERROR Null object returned for RollingRandomAccessFile in Appenders. 2025-07-02 11:15:32,432 main ERROR Unable to locate appender "DRFA" for logger config "root" Hive Session ID = 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:32,528 INFO [main] SessionState (SessionState.java:printInfo(1227)) - Hive Session ID = 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,555 INFO - -> Logging initialized using configuration in file:/etc/hive/conf/hive-log4j2.properties Async: true 2025-07-02 11:15:32,577 INFO [main] SessionState (SessionState.java:printInfo(1227)) - Logging initialized using configuration in file:/etc/hive/conf/hive-log4j2.properties Async: true 2025-07-02 11:15:34,556 INFO - -> 2025-07-02 11:15:33,630 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,652 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created local directory: /tmp/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:33,659 INFO [main] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941/_tmp_space.db 2025-07-02 11:15:33,691 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(277)) - User of session id 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 is hadoop 2025-07-02 11:15:33,714 INFO [main] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hadoop-mr-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:34,365 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426134293 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:34,384 INFO [main] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hive-sync-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:35,557 INFO - -> 2025-07-02 11:15:34,776 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426134737 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:34,851 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(288)) - Created new resources: null 2025-07-02 11:15:34,854 INFO [main] tez.DagUtils (DagUtils.java:getHiveJarDirectory(1058)) - Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hadoop/.hiveJars 2025-07-02 11:15:35,179 INFO [main] tez.TezSessionState (TezSessionState.java:getSha(854)) - Computed sha: 3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f for file: file:/usr/lib/hive/lib/hive-exec-3.1.3.jar of length: 40.36MB in 321 ms 2025-07-02 11:15:35,191 INFO [main] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1715146950410 for hdfs://dominos-usdp-v3-fun/user/hadoop/.hiveJars/hive-exec-3.1.3-3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f.jar 2025-07-02 11:15:35,240 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.mb, mr initial value=100, tez(original):tez.runtime.io.sort.mb=null, tez(final):tez.runtime.io.sort.mb=100 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.read.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.read.timeout=null, tez(final):tez.runtime.shuffle.read.timeout=180000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.minimum-allowed-tasks, mr initial value=10, tez(original):tez.am.minimum.allowed.speculative.tasks=null, tez(final):tez.am.minimum.allowed.speculative.tasks=10 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead.bytes, mr initial value=4194304, tez(original):tez.runtime.ifile.readahead.bytes=null, tez(final):tez.runtime.ifile.readahead.bytes=4194304 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.shuffle.ssl.enabled, mr initial value=false, tez(original):tez.runtime.shuffle.ssl.enable=null, tez(final):tez.runtime.shuffle.ssl.enable=false 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.sort.spill.percent, mr initial value=0.80, tez(original):tez.runtime.sort.spill.percent=null, tez(final):tez.runtime.sort.spill.percent=0.80 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-running-tasks, mr initial value=0.1, tez(original):tez.am.proportion.running.tasks.speculatable=null, tez(final):tez.am.proportion.running.tasks.speculatable=0.1 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-total-tasks, mr initial value=0.01, tez(original):tez.am.proportion.total.tasks.speculatable=null, tez(final):tez.am.proportion.total.tasks.speculatable=0.01 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead, mr initial value=true, tez(original):tez.runtime.ifile.readahead=null, tez(final):tez.runtime.ifile.readahead=true 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.merge.percent, mr initial value=0.66, tez(original):tez.runtime.shuffle.merge.percent=null, tez(final):tez.runtime.shuffle.merge.percent=0.66 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.parallelcopies, mr initial value=50, tez(original):tez.runtime.shuffle.parallel.copies=null, tez(final):tez.runtime.shuffle.parallel.copies=50 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-speculate, mr initial value=15000, tez(original):tez.am.soonest.retry.after.speculate=null, tez(final):tez.am.soonest.retry.after.speculate=15000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.reduce.slowstart.completedmaps, mr initial value=0.95, tez(original):tez.shuffle-vertex-manager.min-src-fraction=null, tez(final):tez.shuffle-vertex-manager.min-src-fraction=0.95 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.memory.limit.percent, mr initial value=0.25, tez(original):tez.runtime.shuffle.memory.limit.percent=null, tez(final):tez.runtime.shuffle.memory.limit.percent=0.25 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-no-speculate, mr initial value=1000, tez(original):tez.am.soonest.retry.after.no.speculate=null, tez(final):tez.am.soonest.retry.after.no.speculate=1000 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.factor, mr initial value=100, tez(original):tez.runtime.io.sort.factor=null, tez(final):tez.runtime.io.sort.factor=100 2025-07-02 11:15:35,241 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress, mr initial value=false, tez(original):tez.runtime.compress=null, tez(final):tez.runtime.compress=false 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.connect.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.connect.timeout=null, tez(final):tez.runtime.shuffle.connect.timeout=180000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.input.buffer.percent, mr initial value=0.0, tez(original):tez.runtime.task.input.post-merge.buffer.percent=null, tez(final):tez.runtime.task.input.post-merge.buffer.percent=0.0 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress.codec, mr initial value=org.apache.hadoop.io.compress.DefaultCodec, tez(original):tez.runtime.compress.codec=null, tez(final):tez.runtime.compress.codec=org.apache.hadoop.io.compress.DefaultCodec 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.merge.progress.records, mr initial value=10000, tez(original):tez.runtime.merge.progress.records=null, tez(final):tez.runtime.merge.progress.records=10000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):map.sort.class, mr initial value=org.apache.hadoop.util.QuickSort, tez(original):tez.runtime.internal.sorter.class=null, tez(final):tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.input.buffer.percent, mr initial value=0.70, tez(original):tez.runtime.shuffle.fetch.buffer.percent=null, tez(final):tez.runtime.shuffle.fetch.buffer.percent=0.70 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.counters.max, mr initial value=120, tez(original):tez.counters.max=null, tez(final):tez.counters.max=120 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.hdfs-servers, mr initial value=hdfs://dominos-usdp-v3-fun, tez(original):tez.job.fs-servers=null, tez(final):tez.job.fs-servers=hdfs://dominos-usdp-v3-fun 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.queuename, mr initial value=default, tez(original):tez.queue.name=default, tez(final):tez.queue.name=default 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.maxtaskfailures.per.tracker, mr initial value=3, tez(original):tez.am.maxtaskfailures.per.node=null, tez(final):tez.am.maxtaskfailures.per.node=3 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.timeout, mr initial value=600000, tez(original):tez.task.timeout-ms=null, tez(final):tez.task.timeout-ms=600000 2025-07-02 11:15:35,242 INFO [main] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):yarn.app.mapreduce.am.job.task.listener.thread-count, mr initial value=30, tez(original):tez.am.task.listener.thread-count=null, tez(final):tez.am.task.listener.thread-count=30 2025-07-02 11:15:35,261 INFO [main] sqlstd.SQLStdHiveAccessController (SQLStdHiveAccessController.java:<init>(96)) - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=63fc22ae-87a3-4d13-b59e-6ea5a99a9941, clientType=HIVECLI] 2025-07-02 11:15:35,263 WARN [main] session.SessionState (SessionState.java:setAuthorizerV2Config(950)) - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 2025-07-02 11:15:35,328 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun01:9083 2025-07-02 11:15:35,350 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:35,358 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:35,358 INFO [main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:36,558 INFO - -> 2025-07-02 11:15:35,864 INFO [main] counters.Limits (Limits.java:init(61)) - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=1200 2025-07-02 11:15:35,864 INFO [main] counters.Limits (Limits.java:init(61)) - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=120 2025-07-02 11:15:35,864 INFO [main] client.TezClient (TezClient.java:<init>(210)) - Tez Client Version: [ component=tez-api, version=0.10.2, revision=22f46fe39a7cf99b24275304e99867b9135caba2, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2023-02-08T02:24:56Z, buildUser=jenkins, buildJavaVersion=1.8.0_362 ] 2025-07-02 11:15:35,864 INFO [main] tez.TezSessionState (TezSessionState.java:openInternal(363)) - Opening new Tez Session (id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941, scratch dir: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941) 2025-07-02 11:15:35,884 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:35,884 INFO [main] session.SessionState (SessionState.java:updateThreadName(441)) - Updating thread name to 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main 2025-07-02 11:15:35,954 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:isCompatibleWith(346)) - Mestastore configuration metastore.filter.hook changed from org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook to org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl 2025-07-02 11:15:35,958 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 0 2025-07-02 11:15:36,152 INFO [Tez session start thread] impl.TimelineReaderClientImpl (TimelineReaderClientImpl.java:serviceInit(97)) - Initialized TimelineReader URI=http://dc3-dominos-usdp-fun02:8198/ws/v2/timeline/, clusterId=dominos-usdp-v3-fun 2025-07-02 11:15:36,342 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun01:9083 2025-07-02 11:15:36,344 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:36,345 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:36,345 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 Hive Session ID = 4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,366 INFO [pool-7-thread-1] SessionState (SessionState.java:printInfo(1227)) - Hive Session ID = 4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,385 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:36,386 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,412 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created local directory: /tmp/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3 2025-07-02 11:15:36,420 INFO [pool-7-thread-1] session.SessionState (SessionState.java:createPath(790)) - Created HDFS directory: /tmp/hive/hadoop/4caadf81-0f27-469e-8de0-87e177d910e3/_tmp_space.db 2025-07-02 11:15:36,420 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(277)) - User of session id 4caadf81-0f27-469e-8de0-87e177d910e3 is hadoop 2025-07-02 11:15:36,441 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hadoop-mr-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:36,455 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(554)) - Compiling command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail DROP IF EXISTS PARTITION(DT='') 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:36,484 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:37,561 INFO - -> 2025-07-02 11:15:36,669 INFO [Tez session start thread] client.AHSProxy (AHSProxy.java:createAHSProxy(43)) - Connecting to Application History server at dc3-dominos-usdp-fun01/10.30.10.60:10200 2025-07-02 11:15:36,686 INFO [Tez session start thread] client.TezClient (TezClient.java:start(388)) - Session mode. Starting session. 2025-07-02 11:15:36,727 INFO [Tez session start thread] client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm-dc3-dominos-usdp-fun01 2025-07-02 11:15:36,809 INFO [Tez session start thread] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(180)) - Using tez.lib.uris value from configuration: hdfs:////dominos-usdp-v3-fun/tez/tez.tar.gz 2025-07-02 11:15:36,809 INFO [Tez session start thread] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(182)) - Using tez.lib.uris.classpath value from configuration: null 2025-07-02 11:15:36,880 INFO [Tez session start thread] client.TezClient (TezCommonUtils.java:createTezSystemStagingPath(123)) - Tez system stage directory hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941/.tez/application_1740624029612_5078 doesn't exist and is created 2025-07-02 11:15:36,913 INFO [Tez session start thread] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2845)) - resource-types.xml not found 2025-07-02 11:15:36,914 INFO [Tez session start thread] resource.ResourceUtils (ResourceUtils.java:addResourcesFileToConf(476)) - Unable to find 'resource-types.xml'. 2025-07-02 11:15:36,948 INFO [Tez session start thread] Configuration.deprecation (Configuration.java:logDeprecation(1441)) - yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled 2025-07-02 11:15:37,000 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426136955 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hadoop-mr-bundle-0.13.0.jar 2025-07-02 11:15:37,005 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:localizeResource(1159)) - Localizing resource because it does not exist: file:/usr/lib/hive/auxlib/hudi-hive-sync-bundle-0.13.0.jar to dest: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:38,562 INFO - -> 2025-07-02 11:15:37,601 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:isCompatibleWith(346)) - Mestastore configuration metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook 2025-07-02 11:15:37,602 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 0 2025-07-02 11:15:37,603 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:37,610 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun02:9083 2025-07-02 11:15:37,615 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 1 2025-07-02 11:15:37,620 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:37,621 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:37,676 INFO [Tez session start thread] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(338)) - Submitted application application_1740624029612_5078 2025-07-02 11:15:37,685 INFO [Tez session start thread] client.TezClient (TezClient.java:start(404)) - The url to track the Tez Session: http://dc3-dominos-usdp-fun01:8088/proxy/application_1740624029612_5078/ 2025-07-02 11:15:38,143 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(666)) - Semantic Analysis Completed (retrial = false) 2025-07-02 11:15:38,145 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:getSchema(374)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null) 2025-07-02 11:15:38,149 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(781)) - Completed compiling command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503); Time taken: 1.723 seconds 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReExecDriver (ReExecDriver.java:run(156)) - Execution #1 of query 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,150 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2255)) - Executing command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail DROP IF EXISTS PARTITION(DT='') 2025-07-02 11:15:38,153 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:setupAtsExecutor(115)) - Creating ATS executor queue with capacity 64 2025-07-02 11:15:38,177 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(130)) - Timeline service address: dc3-dominos-usdp-fun01:8188 2025-07-02 11:15:38,295 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:launchTask(2662)) - Starting task [Stage-0:DDL] in serial mode 2025-07-02 11:15:38,414 INFO [ATS Logger 0] hooks.ATSHook (ATSHook.java:createTimelineDomain(155)) - ATS domain created:hive_63fc22ae-87a3-4d13-b59e-6ea5a99a9941(hadoop,hadoop) 2025-07-02 11:15:38,528 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2531)) - Completed executing command(queryId=hadoop_20250702111536_a8fe6b15-57e0-4288-895c-6d4f8fd58503); Time taken: 0.378 seconds OK 2025-07-02 11:15:38,528 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (SessionState.java:printInfo(1227)) - OK 2025-07-02 11:15:38,529 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager Time taken: 2.104 seconds 2025-07-02 11:15:38,529 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] CliDriver (SessionState.java:printInfo(1227)) - Time taken: 2.104 seconds 2025-07-02 11:15:38,530 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,530 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main 2025-07-02 11:15:38,530 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,530 INFO [main] session.SessionState (SessionState.java:updateThreadName(441)) - Updating thread name to 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main 2025-07-02 11:15:38,533 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(554)) - Compiling command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail ADD IF NOT EXISTS PARTITION(DT='') 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,551 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] hooks.ATSHook (ATSHook.java:<init>(146)) - Created ATS Hook 2025-07-02 11:15:38,559 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:39,286 INFO - process has exited. execute path:/tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672, processId:1190 ,exitStatusCode:1 ,processWaitForStatus:true ,processExitValue:1 2025-07-02 11:15:39,287 INFO - Send task execute result to master, the current task status: TaskExecutionStatus{code=6, desc='failure'} 2025-07-02 11:15:39,287 INFO - Remove the current task execute context from worker cache 2025-07-02 11:15:39,287 INFO - The current execute mode isn't develop mode, will clear the task execute file: /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672 2025-07-02 11:15:39,288 INFO - Success clear the task execute file: /tmp/dolphinscheduler/exec/process/hadoop/16836554651104/18167664743392_10/32581/56672 2025-07-02 11:15:39,562 INFO - -> 2025-07-02 11:15:38,630 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1751426138580 for hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3-resources/hudi-hive-sync-bundle-0.13.0.jar 2025-07-02 11:15:38,630 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(288)) - Created new resources: null 2025-07-02 11:15:38,644 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:getHiveJarDirectory(1058)) - Jar dir is null / directory doesn't exist. Choosing HIVE_INSTALL_DIR - /user/hadoop/.hiveJars 2025-07-02 11:15:38,666 INFO [pool-7-thread-1] tez.DagUtils (DagUtils.java:createLocalResource(842)) - Resource modification time: 1715146950410 for hdfs://dominos-usdp-v3-fun/user/hadoop/.hiveJars/hive-exec-3.1.3-3420a6126cfea97266fe35b708da5d5f95a5b158cad390dc4124081a39cf906f.jar 2025-07-02 11:15:38,697 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(666)) - Semantic Analysis Completed (retrial = false) 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:getSchema(374)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null) 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:compile(781)) - Completed compiling command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c); Time taken: 0.165 seconds 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReExecDriver (ReExecDriver.java:run(156)) - Execution #1 of query 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,698 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2255)) - Executing command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c): ALTER TABLE ddp_dmo_dwd.DWD_OrdCusSrvDetail ADD IF NOT EXISTS PARTITION(DT='') 2025-07-02 11:15:38,700 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:launchTask(2662)) - Starting task [Stage-0:DDL] in serial mode 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.mb, mr initial value=100, tez(original):tez.runtime.io.sort.mb=null, tez(final):tez.runtime.io.sort.mb=100 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.read.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.read.timeout=null, tez(final):tez.runtime.shuffle.read.timeout=180000 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.minimum-allowed-tasks, mr initial value=10, tez(original):tez.am.minimum.allowed.speculative.tasks=null, tez(final):tez.am.minimum.allowed.speculative.tasks=10 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead.bytes, mr initial value=4194304, tez(original):tez.runtime.ifile.readahead.bytes=null, tez(final):tez.runtime.ifile.readahead.bytes=4194304 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.shuffle.ssl.enabled, mr initial value=false, tez(original):tez.runtime.shuffle.ssl.enable=null, tez(final):tez.runtime.shuffle.ssl.enable=false 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.sort.spill.percent, mr initial value=0.80, tez(original):tez.runtime.sort.spill.percent=null, tez(final):tez.runtime.sort.spill.percent=0.80 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-running-tasks, mr initial value=0.1, tez(original):tez.am.proportion.running.tasks.speculatable=null, tez(final):tez.am.proportion.running.tasks.speculatable=0.1 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.speculative-cap-total-tasks, mr initial value=0.01, tez(original):tez.am.proportion.total.tasks.speculatable=null, tez(final):tez.am.proportion.total.tasks.speculatable=0.01 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.ifile.readahead, mr initial value=true, tez(original):tez.runtime.ifile.readahead=null, tez(final):tez.runtime.ifile.readahead=true 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.merge.percent, mr initial value=0.66, tez(original):tez.runtime.shuffle.merge.percent=null, tez(final):tez.runtime.shuffle.merge.percent=0.66 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.parallelcopies, mr initial value=50, tez(original):tez.runtime.shuffle.parallel.copies=null, tez(final):tez.runtime.shuffle.parallel.copies=50 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-speculate, mr initial value=15000, tez(original):tez.am.soonest.retry.after.speculate=null, tez(final):tez.am.soonest.retry.after.speculate=15000 2025-07-02 11:15:38,726 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.reduce.slowstart.completedmaps, mr initial value=0.95, tez(original):tez.shuffle-vertex-manager.min-src-fraction=null, tez(final):tez.shuffle-vertex-manager.min-src-fraction=0.95 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.memory.limit.percent, mr initial value=0.25, tez(original):tez.runtime.shuffle.memory.limit.percent=null, tez(final):tez.runtime.shuffle.memory.limit.percent=0.25 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.speculative.retry-after-no-speculate, mr initial value=1000, tez(original):tez.am.soonest.retry.after.no.speculate=null, tez(final):tez.am.soonest.retry.after.no.speculate=1000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.io.sort.factor, mr initial value=100, tez(original):tez.runtime.io.sort.factor=null, tez(final):tez.runtime.io.sort.factor=100 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress, mr initial value=false, tez(original):tez.runtime.compress=null, tez(final):tez.runtime.compress=false 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.connect.timeout, mr initial value=180000, tez(original):tez.runtime.shuffle.connect.timeout=null, tez(final):tez.runtime.shuffle.connect.timeout=180000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.input.buffer.percent, mr initial value=0.0, tez(original):tez.runtime.task.input.post-merge.buffer.percent=null, tez(final):tez.runtime.task.input.post-merge.buffer.percent=0.0 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.map.output.compress.codec, mr initial value=org.apache.hadoop.io.compress.DefaultCodec, tez(original):tez.runtime.compress.codec=null, tez(final):tez.runtime.compress.codec=org.apache.hadoop.io.compress.DefaultCodec 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.merge.progress.records, mr initial value=10000, tez(original):tez.runtime.merge.progress.records=null, tez(final):tez.runtime.merge.progress.records=10000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):map.sort.class, mr initial value=org.apache.hadoop.util.QuickSort, tez(original):tez.runtime.internal.sorter.class=null, tez(final):tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.reduce.shuffle.input.buffer.percent, mr initial value=0.70, tez(original):tez.runtime.shuffle.fetch.buffer.percent=null, tez(final):tez.runtime.shuffle.fetch.buffer.percent=0.70 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.counters.max, mr initial value=120, tez(original):tez.counters.max=null, tez(final):tez.counters.max=120 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.hdfs-servers, mr initial value=hdfs://dominos-usdp-v3-fun, tez(original):tez.job.fs-servers=null, tez(final):tez.job.fs-servers=hdfs://dominos-usdp-v3-fun 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.queuename, mr initial value=default, tez(original):tez.queue.name=default, tez(final):tez.queue.name=default 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.job.maxtaskfailures.per.tracker, mr initial value=3, tez(original):tez.am.maxtaskfailures.per.node=null, tez(final):tez.am.maxtaskfailures.per.node=3 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):mapreduce.task.timeout, mr initial value=600000, tez(original):tez.task.timeout-ms=null, tez(final):tez.task.timeout-ms=600000 2025-07-02 11:15:38,727 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:setupTezParamsBasedOnMR(562)) - Config: mr(unset):yarn.app.mapreduce.am.job.task.listener.thread-count, mr initial value=30, tez(original):tez.am.task.listener.thread-count=null, tez(final):tez.am.task.listener.thread-count=30 2025-07-02 11:15:38,731 INFO [pool-7-thread-1] sqlstd.SQLStdHiveAccessController (SQLStdHiveAccessController.java:<init>(96)) - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=4caadf81-0f27-469e-8de0-87e177d910e3, clientType=HIVECLI] 2025-07-02 11:15:38,731 WARN [pool-7-thread-1] session.SessionState (SessionState.java:setAuthorizerV2Config(950)) - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory. 2025-07-02 11:15:38,735 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(441)) - Trying to connect to metastore with URI thrift://dc3-dominos-usdp-fun02:9083 2025-07-02 11:15:38,739 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(517)) - Opened a connection to metastore, current connections: 2 2025-07-02 11:15:38,745 INFO [pool-7-thread-1] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:open(570)) - Connected to metastore. 2025-07-02 11:15:38,745 INFO [pool-7-thread-1] metastore.RetryingMetaStoreClient (RetryingMetaStoreClient.java:<init>(97)) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hadoop (auth:SIMPLE) retries=1 delay=1 lifetime=0 2025-07-02 11:15:38,750 INFO [pool-7-thread-1] client.TezClient (TezClient.java:<init>(210)) - Tez Client Version: [ component=tez-api, version=0.10.2, revision=22f46fe39a7cf99b24275304e99867b9135caba2, SCM-URL=scm:git:https://gitbox.apache.org/repos/asf/tez.git, buildTime=2023-02-08T02:24:56Z, buildUser=jenkins, buildJavaVersion=1.8.0_362 ] 2025-07-02 11:15:38,750 INFO [pool-7-thread-1] tez.TezSessionState (TezSessionState.java:openInternal(363)) - Opening new Tez Session (id: 4caadf81-0f27-469e-8de0-87e177d910e3, scratch dir: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/4caadf81-0f27-469e-8de0-87e177d910e3) 2025-07-02 11:15:38,773 INFO [pool-7-thread-1] impl.TimelineReaderClientImpl (TimelineReaderClientImpl.java:serviceInit(97)) - Initialized TimelineReader URI=http://dc3-dominos-usdp-fun02:8198/ws/v2/timeline/, clusterId=dominos-usdp-v3-fun 2025-07-02 11:15:38,779 ERROR [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] exec.DDLTask (DDLTask.java:failed(927)) - Failed org.apache.hadoop.hive.ql.metadata.HiveException: partition spec is invalid; field dt does not exist or is empty at org.apache.hadoop.hive.ql.metadata.Partition.createMetaPartitionObject(Partition.java:129) at org.apache.hadoop.hive.ql.metadata.Hive.convertAddSpecToMetaPartition(Hive.java:2525) at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2466) at org.apache.hadoop.hive.ql.exec.DDLTask.addPartitions(DDLTask.java:1320) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:466) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 2025-07-02 11:15:38,790 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] reexec.ReOptimizePlugin (ReOptimizePlugin.java:run(70)) - ReOptimization: retryPossible: false FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. partition spec is invalid; field dt does not exist or is empty 2025-07-02 11:15:38,791 ERROR [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (SessionState.java:printError(1250)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. partition spec is invalid; field dt does not exist or is empty 2025-07-02 11:15:38,792 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:execute(2531)) - Completed executing command(queryId=hadoop_20250702111538_f08ba63c-7b09-48c8-86fe-f29aa249329c); Time taken: 0.094 seconds 2025-07-02 11:15:38,792 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] ql.Driver (Driver.java:checkConcurrency(285)) - Concurrency mode is disabled, not creating a lock manager 2025-07-02 11:15:38,793 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,793 INFO [63fc22ae-87a3-4d13-b59e-6ea5a99a9941 main] session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main 2025-07-02 11:15:38,793 INFO [main] conf.HiveConf (HiveConf.java:getLogIdVar(5037)) - Using the default value passed in for log id: 63fc22ae-87a3-4d13-b59e-6ea5a99a9941 2025-07-02 11:15:38,799 INFO [main] tez.TezSessionPoolManager (TezSessionPoolManager.java:closeIfNotDefault(351)) - Closing tez session if not default: sessionId=63fc22ae-87a3-4d13-b59e-6ea5a99a9941, queueName=null, user=hadoop, doAs=false, isOpen=false, isDefault=false 2025-07-02 11:15:38,800 INFO [Tez session start thread] client.TezClient (TezClient.java:stop(731)) - Shutting down Tez Session, sessionName=HIVE-63fc22ae-87a3-4d13-b59e-6ea5a99a9941, applicationId=application_1740624029612_5078 2025-07-02 11:15:38,815 INFO [Tez session start thread] client.TezClient (TezClient.java:stop(777)) - Could not connect to AM, killing session via YARN, sessionName=HIVE-63fc22ae-87a3-4d13-b59e-6ea5a99a9941, applicationId=application_1740624029612_5078 2025-07-02 11:15:38,824 INFO [main] tez.TezSessionState (TezSessionState.java:cleanupDagResources(721)) - Attemting to clean up resources for 63fc22ae-87a3-4d13-b59e-6ea5a99a9941: hdfs://dominos-usdp-v3-fun/tmp/hive/hadoop/_tez_session_dir/63fc22ae-87a3-4d13-b59e-6ea5a99a9941-resources; 0 additional files, 2 localized resources 2025-07-02 11:15:38,839 INFO [pool-7-thread-1] client.AHSProxy (AHSProxy.java:createAHSProxy(43)) - Connecting to Application History server at dc3-dominos-usdp-fun01/10.30.10.60:10200 2025-07-02 11:15:38,840 INFO [pool-7-thread-1] client.TezClient (TezClient.java:start(388)) - Session mode. Starting session. 2025-07-02 11:15:38,840 INFO [pool-7-thread-1] client.ConfiguredRMFailoverProxyProvider (ConfiguredRMFailoverProxyProvider.java:performFailover(100)) - Failing over to rm-dc3-dominos-usdp-fun01 2025-07-02 11:15:38,851 INFO [pool-7-thread-1] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(180)) - Using tez.lib.uris value from configuration: hdfs:////dominos-usdp-v3-fun/tez/tez.tar.gz 2025-07-02 11:15:38,851 INFO [pool-7-thread-1] client.TezClientUtils (TezClientUtils.java:setupTezJarsLocalResources(182)) - Using tez.lib.uris.classpath value from configuration: null 2025-07-02 11:15:38,855 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /tmp/hive/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 on fs with scheme hdfs 2025-07-02 11:15:38,856 INFO [main] session.SessionState (SessionState.java:dropPathAndUnregisterDeleteOnExit(885)) - Deleted directory: /tmp/hadoop/63fc22ae-87a3-4d13-b59e-6ea5a99a9941 on fs with scheme file 2025-07-02 11:15:38,857 INFO [main] metastore.HiveMetaStoreClient (HiveMetaStoreClient.java:close(600)) - Closed a connection to metastore, current connections: 1 2025-07-02 11:15:39,565 INFO - FINALIZE_SESSION

<think>我们正在处理一个Hive表分区操作的问题。用户在执行ALTER TABLE时出现错误："partition specis invalidfield dtdoes notexist oris empty"。根据错误信息，问题可能出在分区字段'dt'上。可能的原因包括：1.表结构中不存在名为'dt'的分区字段。2.在指定分区时，分区字段名写错了（比如大小写不匹配，或者拼写错误）。3.分区值可能为空或者格式不正确。我们需要检查表结构，确认分区字段是否存在，以及分区指定的语法是否正确。根据引用[1]中的示例，添加分区的正确语法是：alter tabletest_table addif notexists partition(pt_d='20190101');注意，这里的分区字段是'pt_d'，而不是'dt'。用户的问题中使用的分区字段是'dt'，所以我们需要确认用户表的分区字段名。另外，引用[2]中插入数据时指定分区的方式是：partition(dt='2018-08-15')，这里使用的是'dt'。所以用户可能在操作不同的表，或者表结构不一致。因此，建议用户执行以下步骤：1.检查表结构，确认分区字段名称。可以使用命令：`DESCRIBEFORMATTEDtable_name;`，查看分区字段。2.确保在ALTER TABLE语句中使用的分区字段名称与表定义中的完全一致（包括大小写，因为Hive元数据存储的分区字段名是区分大小写的）。如果用户确认分区字段存在，那么可能是分区值的问题。确保分区值不为空，并且格式正确（例如，日期格式应该是字符串，且与表定义中的格式一致）。另外，错误信息中提到了“field dtdoes notexist”，这通常意味着分区字段不存在。所以重点检查分区字段名称。如果用户确实想添加一个名为'dt'的分区字段，但表结构中并没有这个字段，那么需要先修改表结构添加分区字段。但是，Hive表的分区字段是在创建表时定义的，不能通过ALTER TABLE来添加新的分区字段（只能添加新的分区，即已有分区字段的值）。所以，如果表没有'dt'这个分区字段，那么用户需要重新创建表或者使用已有的分区字段。综上所述，解决方案如下：-确认表的分区字段名称：使用`DESCRIBEFORMATTEDtable_name;`查看。-如果分区字段不是'dt'，则使用正确的分区字段名称。-如果表没有分区字段，那么需要重新定义表结构（或者用户可能误操作）。下面是一个示例，假设表的分区字段是'pt_d'，那么正确的删除分区和添加分区的语句应该是：ALTER TABLEtable_nameDROPIFEXISTS PARTITION (pt_d='value');ALTERTABLE table_name ADDIF NOTEXISTS PARTITION (pt_d='value');注意：删除分区和添加分区的语句中，分区字段名称必须一致。如果用户确实需要修改分区字段，那么这涉及到表结构的修改，通常需要重建表。但是，Hive不支持直接修改分区字段（即不能添加或删除分区字段，只能修改分区字段的值）。所以，如果用户想要改变分区字段，只能创建新表。根据引用[3]，对于分区表，ALTERTABLE修改列信息时，如果使用CASCADE关键字，会修改所有分区的元数据，但不会改变实际数据。但这里的问题不是修改列，而是分区字段本身，所以不适用。因此，针对用户当前的错误，最可能的原因是分区字段名写错了或者不存在。我们生成相关问题以帮助用户进一步排查： §§相关问题§§1.请确认您的表的分区字段名称是什么？可以使用`DESCRIBE FORMATTED table_name;`查看。 2.您执行的具体ALTER TABLE语句是什么？请提供完整的语句以便分析。3.您是否在创建表时定义了分区字段'dt'？如果没有，您需要重新创建表或使用已有的分区字段。 4.您是否尝试过使用正确的分区字段名称（例如，如果实际分区字段是'pt_d'，则使用pt_d）？</think>### Hive分区操作错误解决方案根据错误信息`partition spec is invalid field dt does not exist or is empty`，该问题通常由以下原因引起： #### 常见原因及解决方案： 1. **分区字段名称错误** - 📌 检查表结构中是否存在名为`dt`的分区字段： ```sql DESCRIBE FORMATTED your_table_name; ``` - 确认实际分区字段名（如`pt_d`），确保ALTER语句使用正确字段名[^1] - ✅ 正确示例：`ALTER TABLE test_table DROP PARTITION (pt_d='20230101')` 2. **分区值格式错误** - 确保分区值格式与表定义一致（如字符串需加引号） - ❌ 错误示例：`... PARTITION (dt=20230101)` - ✅ 正确示例：`... PARTITION (dt='20230101')` 3. **表未定义分区字段** - 若表创建时未指定分区字段： ```sql CREATE TABLE non_partitioned_table (...) -- 缺少PARTITIONED BY子句 ``` - 解决方案：重建表或添加分区字段（需迁移数据） 4. **元数据不一致** - 执行元数据修复： ```sql MSCK REPAIR TABLE your_table_name; ``` - 或手动刷新元数据： ```sql ALTER TABLE your_table_name RECOVER PARTITIONS; ``` 5. **语法错误** - 检查特殊字符转义： ```sql ALTER TABLE test_table DROP PARTITION (dt='date=2023-01-01') ``` - 确保分区值不包含非法字符（如空格、分号等） #### 操作示例： ```sql -- 1. 确认分区字段名 DESCRIBE FORMATTED sales_data; -- Output: Partition Information: pt_d string -- 2. 正确执行分区操作 ALTER TABLE sales_data DROP PARTITION (pt_d='20230101'); ```

阅读全文

相关推荐

Map-Reduce-Iris-Flower:这些Map Reduce程序的目标是从著名的鸢尾花数据集中计算出萼片长度，萼片宽度，花瓣长度和花瓣宽度的最大值，最小值和平均值。

MapReduce执行流程和Shuffle过程

MapReduce详解Shuffle过程

大数据面试题及答案：HDFS文件读写步骤、Map端shuffle过程

【减少延迟】：MapReduce Shuffle优化，Map端和Reduce端的性能加速

MapReduce：键值对分配对分区影响的深度理解

MapReduce Shuffle过程全解析：数据从Map到Reduce的高效传输术

使用mr程序统计每年入职的人数。 最终结果要求如下： 1. 格式如下： 年份：1980 人数:xxx 年份：1981 人数:xxx ....... 2. 两个分区： 0分区存储 入职年份<1982年的 1分区存储 入职年份>=1982年的

spark shuflle tong-sort shuffle i.支持序列化重定向(对象不需要反序列化即可被操作进行排序)：kyroSerializer ii.reduce端输出分区数< 2 24次方 -1 对吗

分别描述map端和reduce端的shuffle过程

map shuffle reduce

map端的shuffle过程

简述map和reduce的shuffle过程

spark map shuffle reduce

分别描述map端和Reduce端的shuffle过程（需要包括“溢写“”归并“”领取“）

map和reduce和shuffle

hive on spark 执行任务时 Caused by: java.io.IOException: Unable to rename hdfs:

Android支付宝H5支付转Native支付实现方案

毕设&课设：基于 Spring Cloud 微服务架构下广告系统设计与实现.zip

大家在看

RL78/F14 RLin slave例程

vb6组件指南(Vb高级精华)

中国铁路网shp数据，细致，精确

Tibco 手册

filter LTC1068 模块AD设计 Altium设计 硬件原理图+PCB文件.rar

最新推荐

Ext4压缩与解压工具：从解包到重新打包全过程

【数据转换的基石】：技术分析，Excel到Oracle建表语句的自动化

前端vue2 使用高德地图api

易语言源码：希冀程序保护专家深入解析

【数据迁移流程优化】：一步到位的Excel到Oracle建表语句自动化转换

二维视觉缺陷检测技术背景

Orca(msi编辑工具) 4.5.6 中文版发布：微软官方数据库编辑器

【数据迁移与整合的高效方法】：Excel到Oracle建表语句生成器的深度解析

solidworks怎么画立方体螺纹线

COCO数据集加载与Mask R-CNN训练指南

使用mr程序统计每年入职的人数。最终结果要求如下： 1. 格式如下：年份：1980 人数:xxx 年份：1981 人数:xxx ....... 2. 两个分区： 0分区存储入职年份<1982年的 1分区存储入职年份>=1982年的

filter LTC1068 模块AD设计 Altium设计硬件原理图+PCB文件.rar