活动介绍

怎么定时执行sqoop任务 sqoop: command not found

时间: 2025-05-21 08:35:52 浏览: 17
### 定时执行 Sqoop 任务并解决 `sqoop: command not found` 的方法 当尝试通过 `cron` 或者 Shell 脚本来定时运行 Sqoop 任务时,可能会遇到诸如 `sqoop: command not found` 这样的错误。这通常是由于环境变量未正确设置或者路径不匹配所导致的。 #### 环境变量配置的重要性 Cron 使用的是有限的默认环境变量集合[^1],因此即使在终端中可以正常运行 Sqoop 命令,在 Cron 中可能仍然会报错。为了确保 Cron 可以成功调用 Sqoop,需要显式指定其可执行文件的位置或将其路径加入到脚本中的 `$PATH` 环境变量里。 #### 编写 Shell 脚本 可以通过编写一个简单的 Shell 脚本来封装 Sqoop 任务,并在此过程中明确定义所需的环境变量: ```bash #!/bin/bash # 设置必要的环境变量 export HADOOP_HOME=/path/to/your/hadoop-installation-directory export SQOOP_HOME=/path/to/your/sqoop-installation-directory export PATH=$SQOOP_HOME/bin:$HADOOP_HOME/bin:$PATH # 执行 Sqoop 导入任务 $SQOOP_HOME/bin/sqoop import \ --connect jdbc:mysql://localhost/testdb \ --username root \ --password password \ --table employees \ --target-dir /user/data/sqoop_imports/employees ``` 此脚本设置了两个主要的环境变量:`HADOOP_HOME` 和 `SQOOP_HOME`,并将它们对应的 bin 目录添加到了系统的 `$PATH` 中。这样能够保证无论是在交互式的 Bash 终端还是非交互式的 Cron 环境下都能找到 Sqoop 和其他依赖工具的二进制文件。 #### 配置 Crontab 来调度任务 编辑用户的 crontab 文件来安排定期的任务执行: ```bash crontab -e ``` 然后增加一行用于每天凌晨两点触发上面提到的那个脚本(假设该脚本保存为 `/home/user/scripts/run_sqoop.sh`) : ```text 0 2 * * * /home/user/scripts/run_sqoop.sh >> /home/user/logs/sqoop.log 2>&1 ``` 这里重定向了标准输出和错误流至日志文件以便于后续排查任何潜在的问题。 #### 检查权限与所有权 确认脚本具有足够的执行权限并且由合适的用户拥有。如果必要的话更改这些属性: ```bash chmod +x /home/user/scripts/run_sqoop.sh chown user:usergroup /home/user/scripts/run_sqoop.sh ``` 以上步骤应该能有效防止因缺少适当环境而导致的 “command not found” 错误发生。 ### 注意事项 - 如果使用的是特定版本的 Hadoop/Sqoop,则需调整相应的安装路径。 - 对于敏感数据如数据库密码考虑采用更安全的方式存储而不是硬编码在脚本内部。
阅读全文

相关推荐

[root@node ~]# mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 8 Server version: 8.0.42 MySQL Community Server - GPL Copyright (c) 2000, 2025, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE DATABASE weblog_db; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CREATE DATABASE weblog_db' at line 1 mysql> CREATE DATABASE weblog_db; ERROR 1007 (HY000): Can't create database 'weblog_db'; database exists mysql> USE weblog_db; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> DROP DATABASE IF EXISTS weblog_db; Query OK, 2 rows affected (0.02 sec) mysql> CREATE DATABASE weblog_db; Query OK, 1 row affected (0.01 sec) mysql> USE weblog_db; Database changed mysql> CREATE TABLE page_visits ( -> page VARCHAR(255) , -> visits BIGINT -> ); ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'TABLE page_visits ( page VARCHAR(255) , visits BIGINT )' at line 1 mysql> CREATE TABLE page_visits ( -> page VARCHAR(255), -> visits BIGINT -> ); Query OK, 0 rows affected (0.02 sec) mysql> SHOW TABLES; +---------------------+ | Tables_in_weblog_db | +---------------------+ | page_visits | +---------------------+ 1 row in set (0.00 sec) mysql> ^C mysql> q -> quit -> exit -> ^C mysql> ^C mysql> ^C mysql> ^DBye [root@node ~]# hive SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Hive Session ID = 7bb79582-cc2b-49b6-abc7-020dcdc46542 Logging initialized using configuration in jar:file:/home/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = 15d9da52-e18e-40b2-a80f-e76eda81df4c hive> DESCRIBE FORMATTED page_visits; OK # col_name data_type comment page string visits bigint # Detailed Table Information Database: default OwnerType: USER Owner: root CreateTime: Tue Jul 08 01:43:42 CST 2025 LastAccessTime: UNKNOWN Retention: 0 Location: hdfs://node:9000/hive/warehouse/page_visits Table Type: MANAGED_TABLE Table Parameters: COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"} bucketing_version 2 numFiles 1 numRows 4 rawDataSize 56 totalSize 60 transient_lastDdlTime 1751910222 # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: serialization.format 1 Time taken: 0.785 seconds, Fetched: 32 row(s) hive> [root@node ~]# [root@node ~]# sqoop export \ > --connect jdbc:mysql://localhost/weblog_db \ > --username root \ > --password Aa@123456 \ > --table page_visits \ > --export-dir hdfs://node:9000/hive/warehouse/page_visits \ > --input-fields-terminated-by '\001' \ > --num-mappers 1 Warning: /home/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /home/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 2025-07-08 15:28:12,550 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 2025-07-08 15:28:12,587 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 2025-07-08 15:28:12,704 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 2025-07-08 15:28:12,708 INFO tool.CodeGenTool: Beginning code generation Loading class com.mysql.jdbc.Driver'. This is deprecated. The new driver class is com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary. 2025-07-08 15:28:13,225 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM page_visits AS t LIMIT 1 2025-07-08 15:28:13,266 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM page_visits AS t LIMIT 1 2025-07-08 15:28:13,280 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop3.3 Note: /tmp/sqoop-root/compile/363869e21c2078b9742685122c43a3cc/page_visits.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 2025-07-08 15:28:16,377 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/363869e21c2078b9742685122c43a3cc/page_visits.jar 2025-07-08 15:28:16,391 INFO mapreduce.ExportJobBase: Beginning export of page_visits 2025-07-08 15:28:16,391 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2025-07-08 15:28:16,484 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 2025-07-08 15:28:17,339 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 2025-07-08 15:28:17,342 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 2025-07-08 15:28:17,343 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 2025-07-08 15:28:17,555 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at node/192.168.196.122:8032 2025-07-08 15:28:17,782 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1751959003014_0001 2025-07-08 15:28:26,026 INFO input.FileInputFormat: Total input files to process : 1 2025-07-08 15:28:26,029 INFO input.FileInputFormat: Total input files to process : 1 2025-07-08 15:28:26,495 INFO mapreduce.JobSubmitter: number of splits:1 2025-07-08 15:28:26,528 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 2025-07-08 15:28:26,619 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1751959003014_0001 2025-07-08 15:28:26,620 INFO mapreduce.JobSubmitter: Executing with tokens: [] 2025-07-08 15:28:26,805 INFO conf.Configuration: resource-types.xml not found 2025-07-08 15:28:26,805 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'. 2025-07-08 15:28:27,226 INFO impl.YarnClientImpl: Submitted application application_1751959003014_0001 2025-07-08 15:28:27,264 INFO mapreduce.Job: The url to track the job: http://node:8088/proxy/application_1751959003014_0001/ 2025-07-08 15:28:27,264 INFO mapreduce.Job: Running job: job_1751959003014_0001 2025-07-08 15:28:34,334 INFO mapreduce.Job: Job job_1751959003014_0001 running in uber mode : false 2025-07-08 15:28:34,335 INFO mapreduce.Job: map 0% reduce 0% 2025-07-08 15:28:38,374 INFO mapreduce.Job: map 100% reduce 0% 2025-07-08 15:28:38,381 INFO mapreduce.Job: Job job_1751959003014_0001 failed with state FAILED due to: Task failed task_1751959003014_0001_m_000000 Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0 2025-07-08 15:28:38,448 INFO mapreduce.Job: Counters: 8 Job Counters Failed map tasks=1 Launched map tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=2061 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=2061 Total vcore-milliseconds taken by all map tasks=2061 Total megabyte-milliseconds taken by all map tasks=2110464 2025-07-08 15:28:38,456 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead 2025-07-08 15:28:38,457 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 21.1033 seconds (0 bytes/sec) 2025-07-08 15:28:38,462 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 2025-07-08 15:28:38,462 INFO mapreduce.ExportJobBase: Exported 0 records. 2025-07-08 15:28:38,462 ERROR mapreduce.ExportJobBase: Export job failed! 2025-07-08 15:28:38,463 ERROR tool.ExportTool: Error during export: Export job failed! at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:445) at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99) at org.apache.sqoop.Sqoop.run(Sqoop.java:147) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:81) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243) at org.apache.sqoop.Sqoop.main(Sqoop.java:252) [root@node ~]# sqoop export \ > --connect jdbc:mysql://localhost/weblog_db \ > --username root \ > --password Aa@123456 \ > --table page_visits \ > --export-dir hdfs://node:9000/hive/warehouse/page_visits \ > --input-fields-terminated-by ',' \ > --num-mappers 1 Warning: /home/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /home/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 2025-07-08 15:30:31,174 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 2025-07-08 15:30:31,218 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 2025-07-08 15:30:31,333 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 2025-07-08 15:30:31,336 INFO tool.CodeGenTool: Beginning code generation Loading class com.mysql.jdbc.Driver'. This is deprecated. The new driver class is com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary. 2025-07-08 15:30:31,771 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM page_visits AS t LIMIT 1 2025-07-08 15:30:31,814 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM page_visits AS t LIMIT 1 2025-07-08 15:30:31,821 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop3.3 Note: /tmp/sqoop-root/compile/ab00e36d1f5084a0f7d522b4e9a975e5/page_visits.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 2025-07-08 15:30:33,116 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/ab00e36d1f5084a0f7d522b4e9a975e5/page_visits.jar 2025-07-08 15:30:33,129 INFO mapreduce.ExportJobBase: Beginning export of page_visits 2025-07-08 15:30:33,129 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2025-07-08 15:30:33,212 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 2025-07-08 15:30:33,877 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 2025-07-08 15:30:33,880 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 2025-07-08 15:30:33,880 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 2025-07-08 15:30:34,097 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at node/192.168.196.122:8032 2025-07-08 15:30:34,310 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1751959003014_0002 2025-07-08 15:30:39,127 INFO input.FileInputFormat: Total input files to process : 1 2025-07-08 15:30:39,131 INFO input.FileInputFormat: Total input files to process : 1 2025-07-08 15:30:39,995 INFO mapreduce.JobSubmitter: number of splits:1 2025-07-08 15:30:40,022 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 2025-07-08 15:30:40,532 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1751959003014_0002 2025-07-08 15:30:40,532 INFO mapreduce.JobSubmitter: Executing with tokens: [] 2025-07-08 15:30:40,689 INFO conf.Configuration: resource-types.xml not found 2025-07-08 15:30:40,689 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'. 2025-07-08 15:30:40,746 INFO impl.YarnClientImpl: Submitted application application_1751959003014_0002 2025-07-08 15:30:40,783 INFO mapreduce.Job: The url to track the job: http://node:8088/proxy/application_1751959003014_0002/ 2025-07-08 15:30:40,784 INFO mapreduce.Job: Running job: job_1751959003014_0002 2025-07-08 15:30:46,847 INFO mapreduce.Job: Job job_1751959003014_0002 running in uber mode : false 2025-07-08 15:30:46,848 INFO mapreduce.Job: map 0% reduce 0% 2025-07-08 15:30:50,893 INFO mapreduce.Job: map 100% reduce 0% 2025-07-08 15:30:51,905 INFO mapreduce.Job: Job job_1751959003014_0002 failed with state FAILED due to: Task failed task_1751959003014_0002_m_000000 Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0 2025-07-08 15:30:51,973 INFO mapreduce.Job: Counters: 8 Job Counters Failed map tasks=1 Launched map tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=2058 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=2058 Total vcore-milliseconds taken by all map tasks=2058 Total megabyte-milliseconds taken by all map tasks=2107392 2025-07-08 15:30:51,979 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead 2025-07-08 15:30:51,980 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 18.0828 seconds (0 bytes/sec) 2025-07-08 15:30:51,983 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 2025-07-08 15:30:51,983 INFO mapreduce.ExportJobBase: Exported 0 records. 2025-07-08 15:30:51,983 ERROR mapreduce.ExportJobBase: Export job failed! 2025-07-08 15:30:51,984 ERROR tool.ExportTool: Error during export: Export job failed! at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:445) at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:931) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99) at org.apache.sqoop.Sqoop.run(Sqoop.java:147) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:81) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243) at org.apache.sqoop.Sqoop.main(Sqoop.java:252) [root@node ~]# 6.2 Sqoop导出数据 6.2.1从Hive将数据导出到MySQL 6.2.2sqoop导出格式 6.2.3导出page_visits表 6.2.4导出到ip_visits表 6.3验证导出数据 6.3.1登录MySQL 6.3.2执行查询

sqoop import \ > --connect jdbc:mysql://hadoop01:3306/test \ > --username 'root' \ > --password '123456' \ > --query "select id,hno from emp_conn where /$CONDITIONS" \ > --targer-dir '/user/hive/warehouse/tx.db/emp_conn' \ > -m 1 Warning: /opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /export/server/zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/jars/log4j-slf4j-impl-2.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 25/03/18 21:01:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-cdh6.2.1 25/03/18 21:01:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 25/03/18 21:01:50 ERROR tool.BaseSqoopTool: Error parsing arguments for import: 25/03/18 21:01:50 ERROR tool.BaseSqoopTool: Unrecognized argument: --targer-dir 25/03/18 21:01:50 ERROR tool.BaseSqoopTool: Unrecognized argument: /user/hive/warehouse/tx.db/emp_conn 25/03/18 21:01:50 ERROR tool.BaseSqoopTool: Unrecognized argument: -m 25/03/18 21:01:50 ERROR tool.BaseSqoopTool: Unrecognized argument: 1 Try --help for usage instructions.

sqoop import --connect jdbc:mysql://zhaosai:3306/mydb --username root --password jqe6b6 --table news --target-dir /user/news --fields-terminated-by “;” --hive-import --hive-table news -m 1出现错误Warning: /opt/programs/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail. Please set $HBASE_HOME to the root of your HBase installation. Warning: /opt/programs/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /opt/programs/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /opt/programs/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. 23/06/10 16:18:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 23/06/10 16:18:23 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 23/06/10 16:18:23 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 23/06/10 16:18:23 INFO tool.CodeGenTool: Beginning code generation Sat Jun 10 16:18:23 CST 2023 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification. 23/06/10 16:18:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM news AS t LIMIT 1 23/06/10 16:18:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM news AS t LIMIT 1 23/06/10 16:18:24 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/programs/hadoop-2.7.6 注: /tmp/sqoop-root/compile/84ba419f00fa83cb5d16dba722729d01/news.java使用或覆盖了已过时的 API。 注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。 23/06/10 16:18:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/84ba419f00fa83cb5d16dba722729d01/news.jar 23/06/10 16:18:25 WARN manager.MySQLManager: It looks like you are importing from mysql. 23/06/10 16:18:25 WARN manager.MySQLManager: This transfer can be faster! Use the --direct 23/06/10 16:18:25 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path. 23/06/10 16:18:25 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql) 23/06/10 16:18:25 ERROR tool.ImportTool: Import failed: No primary key could be found for table news. Please specify one with --split-by or perform a sequential import with '-m 1'.

2023-06-06 18:10:33,041 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 2023-06-06 18:10:33,075 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 2023-06-06 18:10:33,218 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 2023-06-06 18:10:33,218 INFO tool.CodeGenTool: Beginning code generation Loading class com.mysql.jdbc.Driver'. This is deprecated. The new driver class is com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary. 2023-06-06 18:10:33,782 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_log AS t LIMIT 1 2023-06-06 18:10:33,825 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM user_log AS t LIMIT 1 2023-06-06 18:10:33,834 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/module/hadoop-3.1.4 注: /tmp/sqoop-root/compile/5f4cfb16d119de74d33f1a0d776d5ae0/user_log.java使用或覆盖了已过时的 API。 注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。 2023-06-06 18:10:35,111 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/5f4cfb16d119de74d33f1a0d776d5ae0/user_log.jar 2023-06-06 18:10:35,125 WARN manager.MySQLManager: It looks like you are importing from mysql. 2023-06-06 18:10:35,126 WARN manager.MySQLManager: This transfer can be faster! Use the --direct 2023-06-06 18:10:35,126 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path. 2023-06-06 18:10:35,126 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql) 2023-06-06 18:10:35,130 ERROR tool.ImportTool: Import failed: No primary key could be found for table user_log. Please specify one with --split-by or perform a sequential import with '-m 1'.

[root@node ~]# start-dfs.sh Starting namenodes on [node] Last login: 二 7月 8 16:00:18 CST 2025 from 192.168.1.92 on pts/0 Starting datanodes Last login: 二 7月 8 16:00:38 CST 2025 on pts/0 Starting secondary namenodes [node] Last login: 二 7月 8 16:00:41 CST 2025 on pts/0 SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. [root@node ~]# start-yarn.sh Starting resourcemanager Last login: 二 7月 8 16:00:45 CST 2025 on pts/0 Starting nodemanagers Last login: 二 7月 8 16:00:51 CST 2025 on pts/0 [root@node ~]# mapred --daemon start historyserver [root@node ~]# jps 3541 ResourceManager 4007 Jps 2984 NameNode 3944 JobHistoryServer 3274 SecondaryNameNode [root@node ~]# mkdir -p /weblog [root@node ~]# cat > /weblog/access.log << EOF > 192.168.1.1,2023-06-01 10:30:22,/index.html > 192.168.1.2,2023-06-01 10:31:15,/product.html > 192.168.1.1,2023-06-01 10:32:45,/cart.html > 192.168.1.3,2023-06-01 11:45:30,/checkout.html > 192.168.1.4,2023-06-01 12:10:05,/index.html > 192.168.1.2,2023-06-01 14:20:18,/product.htm > EOF [root@node ~]# ls /weblog access.log [root@node ~]# hdfs dfs -mkdir -p /weblog/raw SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. [root@node ~]# hdfs dfs -put /weblog/access.log /weblog/raw/ SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. [root@node ~]# hdfs dfs -ls /weblog/raw SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Found 1 items -rw-r--r-- 3 root supergroup 269 2025-07-08 16:03 /weblog/raw/access.log [root@node ~]# cd /weblog [root@node weblog]# mkdir weblog-mapreduce [root@node weblog]# cd weblog-mapreduce [root@node weblog-mapreduce]# touch CleanMapper.java [root@node weblog-mapreduce]# vim CleanMapper.java import java.io.IOException; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.*; public class CleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> { public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String line = value.toString(); String[] fields = line.split(","); if(fields.length == 3) { String ip = fields[0]; String time = fields[1]; String page = fields[2]; if(ip.matches("\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}")) { String outputLine = ip + "," + time + "," + page; context.write(new Text(outputLine), NullWritable.get()); } } } } [root@node weblog-mapreduce]# touch CleanReducer.java [root@node weblog-mapreduce]# vim CleanReducer.java import java.io.IOException; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.*; public class CleanReducer extends Reducer<Text, NullWritable, Text, NullWritable> { public void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException { context.write(key, NullWritable.get()); } } [root@node weblog-mapreduce]# touch LogCleanDriver.java [root@node weblog-mapreduce]# vim LogCleanDriver.java import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.Path; import org.apache.hadoop.io.*; import org.apache.hadoop.mapreduce.*; import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; public class LogCleanDriver { public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); Job job = Job.getInstance(conf, "Web Log Cleaner"); job.setJarByClass(LogCleanDriver.class); job.setMapperClass(CleanMapper.class); job.setReducerClass(CleanReducer.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(NullWritable.class); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); System.exit(job.waitForCompletion(true) ? 0 : 1); } } [root@node weblog-mapreduce]# ls /weblog/weblog-mapreduce CleanMapper.java CleanReducer.java LogCleanDriver.java [root@node weblog-mapreduce]# javac -classpath $(hadoop classpath) -d . *.java [root@node weblog-mapreduce]# ls /weblog/weblog-mapreduce CleanMapper.class CleanReducer.class LogCleanDriver.class CleanMapper.java CleanReducer.java LogCleanDriver.java [root@node weblog-mapreduce]# jar cf logclean.jar *.class [root@node weblog-mapreduce]# ls /weblog/weblog-mapreduce CleanMapper.class CleanReducer.class LogCleanDriver.class logclean.jar CleanMapper.java CleanReducer.java LogCleanDriver.java [root@node weblog-mapreduce]# hdfs dfs -ls /weblog/raw SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Found 1 items -rw-r--r-- 3 root supergroup 269 2025-07-08 16:03 /weblog/raw/access.log [root@node weblog-mapreduce]# hdfs dfs -ls /weblog/output SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. ls: /weblog/output': No such file or directory [root@node weblog-mapreduce]# hadoop jar logclean.jar LogCleanDriver /weblog/raw /weblog/output SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. [root@node weblog-mapreduce]# [root@node weblog-mapreduce]# mapred job -status job_1751961655287_0001 SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Job: job_1751961655287_0001 Job File: hdfs://node:9000/tmp/hadoop-yarn/staging/history/done/2025/07/08/000000/job_1751961655287_0001_conf.xml Job Tracking URL : http://node:19888/jobhistory/job/job_1751961655287_0001 Uber job : false Number of maps: 1 Number of reduces: 1 map() completion: 1.0 reduce() completion: 1.0 Job state: SUCCEEDED retired: false reason for failure: Counters: 54 File System Counters FILE: Number of bytes read=287 FILE: Number of bytes written=552699 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=372 HDFS: Number of bytes written=269 HDFS: Number of read operations=8 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 HDFS: Number of bytes read erasure-coded=0 Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=1848 Total time spent by all reduces in occupied slots (ms)=2016 Total time spent by all map tasks (ms)=1848 Total time spent by all reduce tasks (ms)=2016 Total vcore-milliseconds taken by all map tasks=1848 Total vcore-milliseconds taken by all reduce tasks=2016 Total megabyte-milliseconds taken by all map tasks=1892352 Total megabyte-milliseconds taken by all reduce tasks=2064384 Map-Reduce Framework Map input records=6 Map output records=6 Map output bytes=269 Map output materialized bytes=287 Input split bytes=103 Combine input records=0 Combine output records=0 Reduce input groups=6 Reduce shuffle bytes=287 Reduce input records=6 Reduce output records=6 Spilled Records=12 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=95 CPU time spent (ms)=1050 Physical memory (bytes) snapshot=500764672 Virtual memory (bytes) snapshot=5614292992 Total committed heap usage (bytes)=379584512 Peak Map Physical memory (bytes)=293011456 Peak Map Virtual memory (bytes)=2803433472 Peak Reduce Physical memory (bytes)=207753216 Peak Reduce Virtual memory (bytes)=2810859520 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=269 File Output Format Counters Bytes Written=269 [root@node weblog-mapreduce]# hdfs dfs -ls /weblog/output SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Found 2 items -rw-r--r-- 3 root supergroup 0 2025-07-08 16:34 /weblog/output/_SUCCESS -rw-r--r-- 3 root supergroup 269 2025-07-08 16:34 /weblog/output/part-r-00000 [root@node weblog-mapreduce]# hdfs dfs -cat /weblog/output/part-r-00000 | head -5 SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 192.168.1.1,2023-06-01 10:30:22,/index.html 192.168.1.1,2023-06-01 10:32:45,/cart.html 192.168.1.2,2023-06-01 10:31:15,/product.html 192.168.1.2,2023-06-01 14:20:18,/product.htm 192.168.1.3,2023-06-01 11:45:30,/checkout.html [root@node weblog-mapreduce]# hive SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Hive Session ID = 5199f37c-a381-428a-be1b-0a2afaab8583 Logging initialized using configuration in jar:file:/home/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = f38c99b3-ff7c-4f61-ae07-6b21d86d7160 hive> CREATE EXTERNAL TABLE weblog ( > ip STRING, > access_time TIMESTAMP, > page STRING > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > LOCATION '/weblog/output'; OK Time taken: 1.274 seconds hive> select * from weblog; OK 192.168.1.1 2023-06-01 10:30:22 /index.html 192.168.1.1 2023-06-01 10:32:45 /cart.html 192.168.1.2 2023-06-01 10:31:15 /product.html 192.168.1.2 2023-06-01 14:20:18 /product.htm 192.168.1.3 2023-06-01 11:45:30 /checkout.html 192.168.1.4 2023-06-01 12:10:05 /index.html Time taken: 1.947 seconds, Fetched: 6 row(s) hive> select * from weblog limit 5; OK 192.168.1.1 2023-06-01 10:30:22 /index.html 192.168.1.1 2023-06-01 10:32:45 /cart.html 192.168.1.2 2023-06-01 10:31:15 /product.html 192.168.1.2 2023-06-01 14:20:18 /product.htm 192.168.1.3 2023-06-01 11:45:30 /checkout.html Time taken: 0.148 seconds, Fetched: 5 row(s) hive> hive> CREATE TABLE page_visits AS > SELECT > page, > COUNT(*) AS visits > FROM weblog > GROUP BY page > ORDER BY visits DESC; Query ID = root_20250708183002_ec44d1b4-af24-403c-bb67-380dfb6961c3 Total jobs = 2 Launching Job 1 out of 2 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1751961655287_0002, Tracking URL = http://node:8088/proxy/application_1751961655287_0002/ Kill Command = /home/hadoop/hadoop3.3/bin/mapred job -kill job_1751961655287_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2025-07-08 18:30:12,692 Stage-1 map = 0%, reduce = 0% 2025-07-08 18:30:16,978 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.8 sec 2025-07-08 18:30:23,184 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.66 sec MapReduce Total cumulative CPU time: 3 seconds 660 msec Ended Job = job_1751961655287_0002 Launching Job 2 out of 2 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1751961655287_0003, Tracking URL = http://node:8088/proxy/application_1751961655287_0003/ Kill Command = /home/hadoop/hadoop3.3/bin/mapred job -kill job_1751961655287_0003 Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1 2025-07-08 18:30:35,969 Stage-2 map = 0%, reduce = 0% 2025-07-08 18:30:41,155 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.23 sec 2025-07-08 18:30:46,313 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.95 sec MapReduce Total cumulative CPU time: 2 seconds 950 msec Ended Job = job_1751961655287_0003 Moving data to directory hdfs://node:9000/hive/warehouse/page_visits MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 3.66 sec HDFS Read: 12379 HDFS Write: 251 SUCCESS Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.95 sec HDFS Read: 7308 HDFS Write: 150 SUCCESS Total MapReduce CPU Time Spent: 6 seconds 610 msec OK Time taken: 46.853 seconds hive> hive> describe page_visits; OK page string visits bigint Time taken: 0.214 seconds, Fetched: 2 row(s) hive> CREATE TABLE ip_visits AS > SELECT > ip, > COUNT(*) AS visits > FROM weblog > GROUP BY ip > ORDER BY visits DESC; Query ID = root_20250708183554_da402d08-af34-46f9-a33a-3f66ddd1a580 Total jobs = 2 Launching Job 1 out of 2 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1751961655287_0004, Tracking URL = http://node:8088/proxy/application_1751961655287_0004/ Kill Command = /home/hadoop/hadoop3.3/bin/mapred job -kill job_1751961655287_0004 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2025-07-08 18:36:04,037 Stage-1 map = 0%, reduce = 0% 2025-07-08 18:36:09,250 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.57 sec 2025-07-08 18:36:14,393 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.3 sec MapReduce Total cumulative CPU time: 3 seconds 300 msec Ended Job = job_1751961655287_0004 Launching Job 2 out of 2 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1751961655287_0005, Tracking URL = http://node:8088/proxy/application_1751961655287_0005/ Kill Command = /home/hadoop/hadoop3.3/bin/mapred job -kill job_1751961655287_0005 Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1 2025-07-08 18:36:27,073 Stage-2 map = 0%, reduce = 0% 2025-07-08 18:36:31,215 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.25 sec 2025-07-08 18:36:36,853 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 3.27 sec MapReduce Total cumulative CPU time: 3 seconds 270 msec Ended Job = job_1751961655287_0005 Moving data to directory hdfs://node:9000/hive/warehouse/ip_visits MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 3.3 sec HDFS Read: 12445 HDFS Write: 216 SUCCESS Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 3.27 sec HDFS Read: 7261 HDFS Write: 129 SUCCESS Total MapReduce CPU Time Spent: 6 seconds 570 msec OK Time taken: 44.523 seconds hive> [root@node weblog-mapreduce]# hive> [root@node weblog-mapreduce]# describe ip_visite; bash: describe: command not found... [root@node weblog-mapreduce]# hive SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Hive Session ID = 57dafc2a-afe2-41a4-8159-00f8d44b5add Logging initialized using configuration in jar:file:/home/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive Session ID = f866eae4-4cb4-4403-b7a2-7a52701c5a74 Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. hive> describe ip_visite; FAILED: SemanticException [Error 10001]: Table not found ip_visite hive> describe ip_visits; OK ip string visits bigint Time taken: 0.464 seconds, Fetched: 2 row(s) hive> SELECT * FROM page_visits; OK /index.html 2 /product.html 1 /product.htm 1 /checkout.html 1 /cart.html 1 Time taken: 2.095 seconds, Fetched: 5 row(s) hive> SELECT * FROM ip_visits; OK 192.168.1.2 2 192.168.1.1 2 192.168.1.4 1 192.168.1.3 1 Time taken: 0.176 seconds, Fetched: 4 row(s) hive> hive> [root@node weblog-mapreduce]# [root@node weblog-mapreduce]# mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 48 Server version: 8.0.42 MySQL Community Server - GPL Copyright (c) 2000, 2025, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE DATABASE IF NOT EXISTS weblog_db; Query OK, 1 row affected (0.06 sec) mysql> USE weblog_db; Database changed mysql> CREATE TABLE IF NOT EXISTS page_visits ( -> page VARCHAR(255), -> visits BIGINT -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected, 1 warning (0.05 sec) mysql> SHOW TABLES; +---------------------+ | Tables_in_weblog_db | +---------------------+ | page_visits | +---------------------+ 1 row in set (0.00 sec) mysql> DESCRIBE page_visits; +--------+--------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+--------------+------+-----+---------+-------+ | page | varchar(255) | YES | | NULL | | | visits | bigint | YES | | NULL | | +--------+--------------+------+-----+---------+-------+ 2 rows in set (0.00 sec) mysql> CREATE TABLE IF NOT EXISTS ip_visits ( -> ip VARCHAR(15), -> visits BIGINT -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected, 1 warning (0.02 sec) mysql> SHOW TABLES; +---------------------+ | Tables_in_weblog_db | +---------------------+ | ip_visits | | page_visits | +---------------------+ 2 rows in set (0.01 sec) mysql> DESC ip_visits; +--------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +--------+-------------+------+-----+---------+-------+ | ip | varchar(15) | YES | | NULL | | | visits | bigint | YES | | NULL | | +--------+-------------+------+-----+---------+-------+ 2 rows in set (0.00 sec) mysql> ^C mysql> [root@node weblog-mapreduce]# hive SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Hive Session ID = f34e6971-71ae-4aa5-aa22-895061f33bdf Logging initialized using configuration in jar:file:/home/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Hive Session ID = f7a06e76-e117-4fbb-9ee8-09fdfd002104 hive> DESCRIBE FORMATTED page_visits; OK # col_name data_type comment page string visits bigint # Detailed Table Information Database: default OwnerType: USER Owner: root CreateTime: Tue Jul 08 18:30:47 CST 2025 LastAccessTime: UNKNOWN Retention: 0 Location: hdfs://node:9000/hive/warehouse/page_visits Table Type: MANAGED_TABLE Table Parameters: COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"} bucketing_version 2 numFiles 1 numRows 5 rawDataSize 70 totalSize 75 transient_lastDdlTime 1751970648 # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: serialization.format 1 Time taken: 1.043 seconds, Fetched: 32 row(s) hive> 到这里就不会了 6.2.2sqoop导出格式 6.2.3导出page_visits表 6.2.4导出到ip_visits表 6.3验证导出数据 6.3.1登录MySQL 6.3.2执行查询

最新推荐

recommend-type

Comsol声子晶体能带计算:六角与三角晶格原胞选取及布里渊区高对称点选择 - 声子晶体 v1.0

内容概要:本文详细探讨了利用Comsol进行声子晶体能带计算过程中,六角晶格和三角晶格原胞选取的不同方法及其对简约布里渊区高对称点选择的影响。文中不仅介绍了两种晶格类型的基矢量定义方式,还强调了正确设置周期性边界条件(特别是相位补偿)的重要性,以避免计算误差如鬼带现象。同时,提供了具体的MATLAB代码片段用于演示关键步骤,并分享了一些实践经验,例如如何通过观察能带图中的狄拉克锥特征来验证路径设置的准确性。 适合人群:从事材料科学、物理学研究的专业人士,尤其是那些正在使用或计划使用Comsol软件进行声子晶体模拟的研究人员。 使用场景及目标:帮助研究人员更好地理解和掌握在Comsol环境中针对不同类型晶格进行精确的声子晶体能带计算的方法和技术要点,从而提高仿真精度并减少常见错误的发生。 其他说明:文章中提到的实际案例展示了因晶格类型混淆而导致的问题,提醒使用者注意细节差异,确保模型构建无误。此外,文中提供的代码片段可以直接应用于相关项目中作为参考模板。
recommend-type

springboot213大学生心理健康管理系统的设计与实现.zip

springboot213大学生心理健康管理系统的设计与实现
recommend-type

三轴自动锁螺丝机PLC配方编程:吸钉式锁螺丝智能调整与注释详解 变址寄存器 高效版

一种基于三菱FX系列PLC的三轴自动锁螺丝机的配方编程方法。该系统采用吸钉式锁螺丝方式,通过PLC进行智能管理和调整。主要内容包括:利用D寄存器阵列和变址寄存器Z来存储和管理不同配方的数据,如坐标和螺丝数量;通过触摸屏和示教器简化调试流程,使工人能够快速设置和保存参数;并通过RS指令将数据保存到触摸屏内置存储中。此外,还展示了具体的PLC程序片段,解释了如何通过简单的寄存器操作实现复杂的配方管理和自动化操作。 适合人群:从事工业自动化领域的工程师和技术人员,尤其是熟悉PLC编程和机械设备调试的专业人士。 使用场景及目标:适用于需要提高生产效率和简化调试流程的制造业企业。主要目标是帮助技术人员掌握如何使用PLC进行配方管理,优化自动锁螺丝机的操作流程,减少人工干预,提升设备的智能化水平。 其他说明:文中提供的具体PLC程序代码和详细的注释有助于读者更好地理解和应用相关技术。同时,通过实例演示了如何利用PLC寄存器寻址特性和变址寄存器简化程序逻辑,为类似项目提供有价值的参考。
recommend-type

基于QT与STM32的Modbus-TCP四遥功能实现及源码解析

基于Qt开发的Modbus-TCP远程控制系统,用于实现四遥(遥测、遥控、遥信、遥调)功能。系统由上位机和下位机组成,上位机使用Qt进行图形界面开发,下位机采用STM32和W5500以太网模块,所有Modbus功能均自行实现,未使用第三方库。文中具体展示了各个功能的实现细节,包括ADC数据采集、LED控制、按键状态读取以及参数调节等功能的具体代码实现。 适合人群:具有一定嵌入式开发经验的研发人员,尤其是熟悉Qt和STM32的开发者。 使用场景及目标:适用于工业自动化、智能家居等领域,旨在帮助开发者理解和实现基于Modbus-TCP协议的远程控制系统,掌握四遥功能的具体实现方法。 其他说明:文中提供了详细的代码片段和技术难点解析,有助于读者深入理解系统的实现过程。同时,针对常见的开发问题给出了具体的解决方案,如浮点数转换、字节序处理等。
recommend-type

ERP系统客户与供应商信息视图创建:Oracle数据库中客户和供应商数据整合查询设计

内容概要:本文档定义了一个名为 `xxx_SCustSuplier_info` 的视图,用于整合和展示客户(Customer)和供应商(Supplier)的相关信息。视图通过连接多个表来获取组织单位、客户账户、站点使用、位置、财务代码组合等数据。对于客户部分,视图选择了与账单相关的记录,并提取了账单客户ID、账单站点ID、客户名称、账户名称、站点代码、状态、付款条款等信息;对于供应商部分,视图选择了有效的供应商及其站点信息,包括供应商ID、供应商名称、供应商编号、状态、付款条款、财务代码组合等。视图还通过外连接确保即使某些字段为空也能显示相关信息。 适合人群:熟悉Oracle ERP系统,尤其是应付账款(AP)和应收账款(AR)模块的数据库管理员或开发人员;需要查询和管理客户及供应商信息的业务分析师。 使用场景及目标:① 数据库管理员可以通过此视图快速查询客户和供应商的基本信息,包括账单信息、财务代码组合等;② 开发人员可以利用此视图进行报表开发或数据迁移;③ 业务分析师可以使用此视图进行数据分析,如信用评估、付款周期分析等。 阅读建议:由于该视图涉及多个表的复杂连接,建议读者先熟悉各个表的结构和关系,特别是 `hz_parties`、`hz_cust_accounts`、`ap_suppliers` 等核心表。此外,注意视图中使用的外连接(如 `gl_code_combinations_kfv` 表的连接),这可能会影响查询结果的完整性。
recommend-type

Web前端开发:CSS与HTML设计模式深入解析

《Pro CSS and HTML Design Patterns》是一本专注于Web前端设计模式的书籍,特别针对CSS(层叠样式表)和HTML(超文本标记语言)的高级应用进行了深入探讨。这本书籍属于Pro系列,旨在为专业Web开发人员提供实用的设计模式和实践指南,帮助他们构建高效、美观且可维护的网站和应用程序。 在介绍这本书的知识点之前,我们首先需要了解CSS和HTML的基础知识,以及它们在Web开发中的重要性。 HTML是用于创建网页和Web应用程序的标准标记语言。它允许开发者通过一系列的标签来定义网页的结构和内容,如段落、标题、链接、图片等。HTML5作为最新版本,不仅增强了网页的表现力,还引入了更多新的特性,例如视频和音频的内置支持、绘图API、离线存储等。 CSS是用于描述HTML文档的表现(即布局、颜色、字体等样式)的样式表语言。它能够让开发者将内容的表现从结构中分离出来,使得网页设计更加模块化和易于维护。随着Web技术的发展,CSS也经历了多个版本的更新,引入了如Flexbox、Grid布局、过渡、动画以及Sass和Less等预处理器技术。 现在让我们来详细探讨《Pro CSS and HTML Design Patterns》中可能包含的知识点: 1. CSS基础和选择器: 书中可能会涵盖CSS基本概念,如盒模型、边距、填充、边框、背景和定位等。同时还会介绍CSS选择器的高级用法,例如属性选择器、伪类选择器、伪元素选择器以及选择器的组合使用。 2. CSS布局技术: 布局是网页设计中的核心部分。本书可能会详细讲解各种CSS布局技术,包括传统的浮动(Floats)布局、定位(Positioning)布局,以及最新的布局模式如Flexbox和CSS Grid。此外,也会介绍响应式设计的媒体查询、视口(Viewport)单位等。 3. 高级CSS技巧: 这些技巧可能包括动画和过渡效果,以及如何优化性能和兼容性。例如,CSS3动画、关键帧动画、转换(Transforms)、滤镜(Filters)和混合模式(Blend Modes)。 4. HTML5特性: 书中可能会深入探讨HTML5的新标签和语义化元素,如`<article>`、`<section>`、`<nav>`等,以及如何使用它们来构建更加标准化和语义化的页面结构。还会涉及到Web表单的新特性,比如表单验证、新的输入类型等。 5. 可访问性(Accessibility): Web可访问性越来越受到重视。本书可能会介绍如何通过HTML和CSS来提升网站的无障碍访问性,比如使用ARIA标签(Accessible Rich Internet Applications)来增强屏幕阅读器的使用体验。 6. 前端性能优化: 性能优化是任何Web项目成功的关键。本书可能会涵盖如何通过优化CSS和HTML来提升网站的加载速度和运行效率。内容可能包括代码压缩、合并、避免重绘和回流、使用Web字体的最佳实践等。 7. JavaScript与CSS/HTML的交互: 在现代Web开发中,JavaScript与CSS及HTML的交云并用是不可或缺的。书中可能会讲解如何通过JavaScript动态地修改样式、操作DOM元素以及使用事件监听和响应用户交互。 8. Web框架和预处理器: 这本书可能会提到流行的Web开发框架和预处理器,比如Bootstrap、Foundation、Sass和Less等,它们是如何简化和加速开发流程的。 9. 测试和维护: 书中也可能包含关于如何测试网页以及如何持续优化和维护CSS和HTML代码的章节。例如,使用断言测试、自动化测试、性能分析工具等。 最后,鉴于文件名称列表中的“压缩包子文件”的表述,这可能是对“压缩包”文件的一种误译或误用,此处“压缩包”应该指的是包含该书籍PDF文件的压缩文件格式,如ZIP或RAR。而“Pro CSS and HTML Design Patterns.pdf”指的就是该书籍的PDF格式电子版文件。 以上所述,构成了《Pro CSS and HTML Design Patterns》一书可能包含的核心知识点。通过学习这些内容,Web前端开发者可以掌握更为高效和优雅的设计模式,从而在日常工作中更高效地解决实际问题。
recommend-type

Zotero 7数据同步:Attanger插件安装&设置,打造文献管理利器

/i.s3.glbimg.com/v1/AUTH_08fbf48bc0524877943fe86e43087e7a/internal_photos/bs/2018/L/j/ipKA8TRPyt4ptaG1DGww/t-m-01.png) # 1. Zotero 7数据同步简介 在数字时代,管理研究数据变得日益重要,而Zot
recommend-type

卷积神经网络的基础理论200字

<think>我们正在回答用户关于卷积神经网络(CNN)基础理论的询问。根据提供的引用资料,我们可以概述CNN的基本结构和工作原理。引用[1]提到CNN是一种前馈神经网络,信息从输入单向流动到输出。引用[2]指出CNN在图像处理等领域应用广泛,通过层次结构和参数调整实现模式识别。引用[3]说明CNN包含卷积层,使用卷积核提取局部特征,减少参数量,提高效率。引用[4]则提到了训练过程,包括前向传播、误差计算和反向传播(梯度下降)。因此,我们将从以下几个方面概述CNN:1.CNN的基本结构(卷积层、池化层、全连接层)2.卷积操作的工作原理(局部感受野、权重共享)3.训练过程(前向传播、损失函数、反
recommend-type

轻便实用的Java库类查询工具介绍

标题 "java2库类查询" 和描述表明,所提及的工具是一个专门用于查询Java库类的应用程序。此软件旨在帮助开发者快速地查找和引用Java的标准开发工具包(SDK)中包含的所有应用程序编程接口(API)类。通过这样的工具,开发者可以节省大量在官方文档或搜索引擎上寻找类定义和使用方法的时间。它被描述为轻巧且方便,这表明其占用的系统资源相对较少,同时提供直观的用户界面,使得查询过程简洁高效。 从描述中可以得出几个关键知识点: 1. Java SDK:Java的软件开发工具包(SDK)是Java平台的一部分,提供了一套用于开发Java应用软件的软件包和库。这些软件包通常被称为API,为开发者提供了编程界面,使他们能够使用Java语言编写各种类型的应用程序。 2. 库类查询:这个功能对于开发者来说非常关键,因为它提供了一个快速查找特定库类及其相关方法、属性和使用示例的途径。良好的库类查询工具可以帮助开发者提高工作效率,减少因查找文档而中断编程思路的时间。 3. 轻巧性:软件的轻巧性通常意味着它对计算机资源的要求较低。这样的特性对于资源受限的系统尤为重要,比如老旧的计算机、嵌入式设备或是当开发者希望最小化其开发环境占用空间时。 4. 方便性:软件的方便性通常关联于其用户界面设计,一个直观、易用的界面可以让用户快速上手,并减少在使用过程中遇到的障碍。 5. 包含所有API:一个优秀的Java库类查询软件应当能够覆盖Java所有标准API,这包括Java.lang、Java.util、Java.io等核心包,以及Java SE平台的所有其他标准扩展包。 从标签 "java 库 查询 类" 可知,这个软件紧密关联于Java编程语言的核心功能——库类的管理和查询。这些标签可以关联到以下知识点: - Java:一种广泛用于企业级应用、移动应用(如Android应用)、网站后端、大型系统和许多其他平台的编程语言。 - 库:在Java中,库是一组预打包的类和接口,它们可以被应用程序重复使用。Java提供了庞大的标准库,以支持各种常见的任务和功能。 - 查询:查询指的是利用软件工具搜索、定位和检索信息的过程。对于Java库类查询工具来说,这意味着可以通过类名、方法签名或其他标识符来查找特定的API条目。 最后,压缩包文件列表包含了两个文件:“java.dit”和“Java.exe”。其中“Java.exe”很可能是程序的可执行文件,而“java.dit”可能是一个数据文件,用于存储Java类的索引或数据。由于文件名后缀通常与文件类型相关联,但“dit”并不是一个常见的文件扩展名。这可能是一个特定于软件的自定义格式,或是一个打字错误。 总结来说,"java2库类查询" 是一个针对Java开发者的实用工具,它提供了一个轻量级、易用的平台来查询和定位Java标准库中的所有类和API。此工具对优化开发流程,减少查找Java类文档的时间大有裨益,尤其适合需要频繁查阅Java API的开发者使用。
recommend-type

【Zotero 7终极指南】:新手必备!Attanger插件全攻略与数据同步神技

# 1. Zotero 7与Attanger插件的介绍 在当今的学术研究和知识管理领域,高效的文献管理工具至关重要。Zotero 7作为一个流行的参考文献管理软件,因其强大的功能和用户友好的界面而受到专业人士的青睐。而Attanger插件则为Zotero 7带来了更多定制化和高级功能,极大地增强