I. Deployment Architecture
- Huawei Data Warehouse Service (DWS), cluster version 8.1.3.x
- Cluster topology:
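The topology above is the DWS single-AZ high-reliability deployment architecture. To reduce the impact of hardware failures on system availability, the cluster deployment should follow these principles:
- For each instance group, deploy the primary and standby on different nodes. For example, the primary and standby GTM go on different nodes, and the primary, standby, and secondary DN go on different nodes.
- Each node should have at least 512 GB of memory and host 4 DNs.
- In low-concurrency scenarios, 2 to 4 CNs for the whole cluster are enough.
- Deploy GTM and CM on nodes that do not host a CN. This limits the loss when a single node fails and keeps cluster load from concentrating on a few nodes.
- The security ring is the basic unit of cluster networking. A regular security ring contains at least 3 servers, and the DNs on those servers form complete primary/standby relationships among themselves. By default the system sets the number of ring nodes to the number of DataNode process data directories plus 1. The ring-node list and ring-node count parameters can also be configured to specify the ring rule. Small rings are recommended; the ring-node count should not be too large.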
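The default ring-size rule in the last principle is simple arithmetic. A minimal sketch follows; `default_ring_size` is a hypothetical helper, not a DWS tool. Treating the 3-server minimum as a floor on the default is this sketch's own reading of the two statements above, not documented behavior.

```shell
# Sketch only: default number of nodes in a security ring.
# Usage: default_ring_size <dn_data_dirs_per_node>
default_ring_size() {
    case "$1" in
        ''|*[!0-9]*) echo "expected a non-negative integer" >&2; return 1 ;;
    esac
    size=$(( $1 + 1 ))            # rule from the text: data directories + 1
    # Assumption: a ring never has fewer than 3 servers.
    [ "$size" -lt 3 ] && size=3
    echo "$size"
}
```

With 4 DN data directories per node, `default_ring_size 4` prints 5, that is, a five-node ring.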
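To balance load and use resources effectively, the following deployment layout is recommended on top of the principles above:
- Deploy the primary GTM and the standby CMServer on the same node, and the standby GTM and the primary CMServer on the same node.
- Deploy CNs on some of the nodes as needed.
- For DN deployment:
  - The standby and secondary DNs that correspond to the primary DNs on one server are automatically spread across the other nodes, following the node order in the security ring as shown in the figure above, so that DNs are evenly distributed.
  - Every node must host the same number of DNs.
  - The primary, standby, and secondary DN are deployed on different nodes.
- Note:
  The secondary DN takes up no real storage space. It comes into play only when the primary or standby DN fails, and it stores only data logs, not data pages.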
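The rotation described for DN placement can be illustrated for the simplest case of one primary DN per node. The sketch below is not the installer's algorithm. `ring_layout` is a hypothetical helper, and the rule it applies (standby on the next node in ring order, secondary on the node after that) is an assumption made for illustration. With several DNs per node the real placement may differ.

```shell
# Sketch only: print "primary standby secondary" node names for each DN group
# in a ring with one primary DN per node.
# Usage: ring_layout <node1> <node2> <node3> [...]
ring_layout() {
    printf '%s\n' "$@" | awk '
        { node[NR] = $0 }
        END {
            if (NR < 3) exit 1           # three roles need three distinct nodes
            for (i = 1; i <= NR; i++) {
                standby   = node[i % NR + 1]         # next node in the ring
                secondary = node[(i + 1) % NR + 1]   # the node after that
                print node[i], standby, secondary
            }
        }
    '
}
```

For example, `ring_layout dws03 dws04 dws05` prints one line per DN group: `dws03 dws04 dws05`, `dws04 dws05 dws03`, and `dws05 dws03 dws04`. This is the same pattern that appears in the three-node Datanode State output later in this walkthrough.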
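II. Physical Structure
This section looks at the physical structure of the warehouse on its servers once DWS has been deployed according to the architecture above, at what changes on the CNs and DNs after DDL and DML operations, and at how a table's data files are stored. The walkthrough below demonstrates the whole process hands-on:
- Log in to the DWS back-end server over SSH as the root user.
- Switch to the omm user and source the environment variables, for example: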
source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile
- Check the cluster status. There are two ways:
Method 1: gs_om -t status --detail
Method 2: cm_ctl query -v -C

    omm@host-192-168-5-204:~> gs_om -t status --detail
    [ CMServer State ]

    node     node_ip        instance                                  state
    ----------------------------------------------------------------------------
    1  dws03 192.168.5.203  1 /opt/huawei/Bigdata/mppdb/cm/cm_server  Standby
    3  dws05 192.168.5.205  2 /opt/huawei/Bigdata/mppdb/cm/cm_server  Primary

    [ Cluster State ]

    cluster_state   : Normal
    redistributing  : No
    balanced        : Yes

    [ Coordinator State ]

    node     node_ip        instance                                      state
    ---------------------------------------------------------------------------
    1  dws03 192.168.5.203  5001 /srv/BigData/mppdb/data1/coordinator     Normal
    2  dws04 192.168.5.204  5002 /srv/BigData/mppdb/data1/coordinator     Normal
    3  dws05 192.168.5.205  5003 /srv/BigData/mppdb/data1/coordinator     Normal

    [ Central Coordinator State ]

    node     node_ip        instance                                      state
    --------------------------------------------------------------------------
    2  dws04 192.168.5.204  5002 /srv/BigData/mppdb/data1/coordinator     Normal

    [ GTM State ]

    node     node_ip        instance                             state                    sync_state
    ----------------------------------------------------------------
    3  dws05 192.168.5.205  1001 /opt/huawei/Bigdata/mppdb/gtm   P Primary Connection ok  Sync
    1  dws03 192.168.5.203  1002 /opt/huawei/Bigdata/mppdb/gtm   S Standby Connection ok  Sync

    [ Datanode State ]   (primary / standby / secondary architecture)

    node     node_ip        instance                                state            | node     node_ip        instance                               state            | node     node_ip        instance                                    state
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    1  dws03 192.168.5.203  6001 /srv/BigData/mppdb/data1/master1   P Primary Normal | 2  dws04 192.168.5.204  6002 /srv/BigData/mppdb/data1/slave1   S Standby Normal | 3  dws05 192.168.5.205  3002 /srv/BigData/mppdb/data1/dummyslave1   R Secondary Normal
    2  dws04 192.168.5.204  6003 /srv/BigData/mppdb/data1/master1   P Primary Normal | 3  dws05 192.168.5.205  6004 /srv/BigData/mppdb/data1/slave1   S Standby Normal | 1  dws03 192.168.5.203  3003 /srv/BigData/mppdb/data1/dummyslave1   R Secondary Normal
    3  dws05 192.168.5.205  6005 /srv/BigData/mppdb/data1/master1   P Primary Normal | 1  dws03 192.168.5.203  6006 /srv/BigData/mppdb/data1/slave1   S Standby Normal | 2  dws04 192.168.5.204  3004 /srv/BigData/mppdb/data1/dummyslave1   R Secondary Normal

    omm@host-192-168-5-204:~> cm_ctl query -v -C
    [ CMServer State ]

    node     instance state
    -------------------------
    1  dws03 1        Standby
    3  dws05 2        Primary

    [ Cluster State ]

    cluster_state   : Normal
    redistributing  : No
    balanced        : Yes

    [ Coordinator State ]

    node     instance state
    --------------------------
    1  dws03 5001     Normal
    2  dws04 5002     Normal
    3  dws05 5003     Normal

    [ Central Coordinator State ]

    node     instance state
    -------------------------
    2  dws04 5002     Normal

    [ GTM State ]

    node     instance state                    sync_state
    ------------------------------------------------
    3  dws05 1001     P Primary Connection ok  Sync
    1  dws03 1002     S Standby Connection ok  Sync

    [ Datanode State ]

    node     instance state            | node     instance state            | node     instance state
    --------------------------------------------------------------------------------------------------------------
    1  dws03 6001     P Primary Normal | 2  dws04 6002     S Standby Normal | 3  dws05 3002     R Secondary Normal
    2  dws04 6003     P Primary Normal | 3  dws05 6004     S Standby Normal | 1  dws03 3003     R Secondary Normal
    3  dws05 6005     P Primary Normal | 1  dws03 6006     S Standby Normal | 2  dws04 3004     R Secondary Normal
    omm@host-192-168-5-204:~>
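The status output above can also be checked mechanically. The following is a minimal sketch, not part of DWS. `cluster_is_healthy` and `dn_groups_spread` are hypothetical helper names, and both functions assume the text layout shown above: `key : value` summary lines, and Datanode rows with three `|`-separated segments whose second token is the node name. They read the status text from standard input and call no cluster tool themselves.

```shell
# Succeeds only if the summary reports a Normal, balanced cluster that is
# not redistributing.
cluster_is_healthy() {
    awk -F':' '
        { key = $1; val = $2
          gsub(/[ \t]/, "", key); gsub(/[ \t\r]/, "", val) }
        key == "cluster_state"  { state = val }
        key == "redistributing" { redis = val }
        key == "balanced"       { bal = val }
        END { exit !(state == "Normal" && redis == "No" && bal == "Yes") }
    '
}

# Succeeds only if every DN group keeps its primary, standby and secondary
# on three different nodes. Prints each offending row.
dn_groups_spread() {
    awk -F'[|]' '
        NF == 3 {
            split("", seen)                      # reset per DN group
            for (i = 1; i <= 3; i++) {
                n = split($i, tok, " ")
                # Header rows do not start with a numeric node id: skip them.
                if (n < 2 || tok[1] !~ /^[0-9]+$/) next
                if (tok[2] in seen) { print "same node used twice: " $0; bad = 1 }
                seen[tok[2]] = 1
            }
            groups++
        }
        END { exit (bad || groups == 0) }
    '
}
```

On a cluster node this could be used as `gs_om -t status --detail | cluster_is_healthy && echo "cluster OK"`, or `cm_ctl query -v -C | dn_groups_spread`. Both output formats shown above put the node name in the second column, so either command can feed either function.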
- Examine the directory structure of the DWS installation

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core> ll
    total 16
    drwx------ 4 omm wheel 4096 Oct 31  2023 bin
    drwx------ 2 omm wheel   58 Oct 26  2023 etc
    drwx------ 3 omm wheel   24 Aug 17  2023 include
    drwx------ 4 omm wheel   95 Mar  1  2022 jre
    drwx------ 6 omm wheel 8192 Oct 26  2023 lib
    drwx------ 6 omm wheel   68 Oct 26  2023 share
    drwx------ 2 omm wheel   20 Oct 26  2023 utilslib
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core>

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/bin> ls
    alarmItem.conf          diagcollect.sh                                gds         gs_guc      gs_running_xacts  pg_config            runsessionstat.sh
    cluster_dynamic_config  drop_caches.sh                                getDEK.jar  gs_initcm   gs_upgrade        pg_controldata       script
    cluster_guc.conf        etcd                                          gs_cgroup   gs_initdb   gtm_ctl           pg_format_cu         seq_query
    cluster_static_config   etcdctl                                       gs_clean    gs_initgtm  initdb_param      pg_recvlogical       server.key.cipher
    cm_agent                gaussdb                                       gs_ctl      gs_log      jeprof            pg_resetxlog         server.key.rand
    cm_agent.lock           GaussDB-8.1.3-SUSE11-x86_64bit-symbol.tar.gz  gs_dump     gsql        om_monitor        pg_xlogdump          total_database_size
    cm_ctl                  gaussdb.license                               gs_dumpall  gs_redis    om_monitor.lock   result               transfer.py
    cm_server               gaussdb.version                               gs_encrypt  gs_restore  openssl           retry_errcodes.conf  upgrade_version
    dfx_tool                gaussmaster                                   gs_gtm      gs_roach    pagehack          run_drop_cache.sh    version.cfg
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/bin>

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/include/postgresql> cd server/
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/include/postgresql/server> ll
    total 324
    drwx------ 3 omm wheel   128 Aug 17  2023 access
    drwx------ 2 omm wheel    61 Aug 17  2023 catalog
    drwx------ 8 omm wheel   180 Oct 26  2023 cfunction
    -rw------- 1 omm wheel 37130 Aug 17  2023 c.h
    drwx------ 2 omm wheel    22 Aug 17  2023 common
    drwx------ 2 omm wheel    25 Aug 17  2023 datatype
    drwx------ 2 omm wheel    24 Aug 17  2023 executor
    -rw------- 1 omm wheel 38382 Aug 17  2023 extension_dependency.h
    -rw------- 1 omm wheel 24767 Aug 17  2023 fmgr.h
    -rw------- 1 omm wheel  2376 Aug 17  2023 gs_thread.h
    -rw------- 1 omm wheel   632 Aug 17  2023 gs_threadlocal.h
    drwx------ 2 omm wheel    42 Aug 17  2023 lib
    -rw------- 1 omm wheel 40469 Aug 17  2023 libpq-fe.h
    drwx------ 2 omm wheel    24 Aug 17  2023 mb
    drwx------ 2 omm wheel   145 Aug 17  2023 nodes
    -rw------- 1 omm wheel 28705 Aug 17  2023 pg_config.h
    -rw------- 1 omm wheel 10722 Aug 17  2023 pg_config_manual.h
    -rw------- 1 omm wheel  1051 Aug 17  2023 pg_config_os.h
    -rw------- 1 omm wheel  1841 Aug 17  2023 pgtime.h
    drwx------ 2 omm wheel    23 Aug 17  2023 pgxc
    drwx------ 2 omm wheel    43 Aug 17  2023 port
    -rw------- 1 omm wheel 14190 Aug 17  2023 port.h
    -rw------- 1 omm wheel  2054 Aug 17  2023 postgres_ext.h
    -rw------- 1 omm wheel 26454 Aug 17  2023 postgres.h
    -rw------- 1 omm wheel  8483 Aug 17  2023 securec_check.h
    -rw------- 1 omm wheel 28973 Apr 21  2023 securec.h
    -rw------- 1 omm wheel 17751 Apr 21  2023 securectype.h
    drwx------ 2 omm wheel   198 Aug 17  2023 storage
    drwx------ 2 omm wheel    20 Aug 17  2023 tcop
    drwx------ 3 omm wheel  4096 Aug 17  2023 utils
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/include/postgresql/server>

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share> ll
    total 4
    drwx------ 2 omm wheel   29 Aug 17  2023 llvmir
    drwx------ 2 omm wheel   32 Aug 17  2023 postgis
    drwx------ 7 omm wheel 4096 Oct 26  2023 postgresql
    drwx------ 6 omm wheel   55 Oct 26  2023 sslcert
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share> cd postgis/
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share/postgis> ll
    total 4
    -rw------- 1 omm wheel 3469 Aug 17  2023 PostGIS_install.sh

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share/postgresql> ll
    total 1300
    -rw-------  1 omm wheel   5440 Aug 17  2023 cm.conf.sample
    -rw-------  1 omm wheel  76384 Aug 17  2023 conversion_create.sql
    drwx------  2 omm wheel   4096 Aug 17  2023 extension
    -rw-------  1 omm wheel   3093 Aug 17  2023 gtm.conf.sample
    -rw-------  1 omm wheel 107038 Aug 17  2023 information_schema.sql
    -rw-------  1 omm wheel     72 Aug 17  2023 pg_cast_oid.txt
    -rw-------  1 omm wheel   4446 Aug 17  2023 pg_hba.conf.sample
    -rw-------  1 omm wheel   1636 Aug 17  2023 pg_ident.conf.sample
    -rw-------  1 omm wheel    604 Aug 17  2023 pg_service.conf.sample
    -rw-------  1 omm wheel 122640 Oct 26  2023 pmk_schema_bak.sql
    -rw-------  1 omm wheel 122586 Aug 17  2023 pmk_schema_single_inst.sql
    -rw-------  1 omm wheel 122592 Aug 17  2023 pmk_schema.sql
    -rw-------  1 omm wheel 236646 Aug 17  2023 postgres.bki
    -rw-------  1 omm wheel  32948 Aug 17  2023 postgres.description
    -rw-------  1 omm wheel  35156 Aug 17  2023 postgresql.conf.sample
    -rw-------  1 omm wheel     49 Aug 17  2023 postgres.shdescription
    -rw-------  1 omm wheel    220 Aug 17  2023 psqlrc.sample
    -rw-------  1 omm wheel   4814 Aug 17  2023 recovery.conf.sample
    -rw-------  1 omm wheel  13359 Aug 17  2023 snowball_create.sql
    -rw-------  1 omm wheel  33329 Aug 17  2023 sql_features.txt
    -rw-------  1 omm wheel 347311 Aug 17  2023 system_views.sql
    drwx------ 18 omm wheel   4096 Aug 17  2023 timezone
    drwx------  2 omm wheel    237 Aug 17  2023 timezonesets
    drwx------  2 omm wheel     25 Aug 17  2023 tmp
    drwx------  2 omm wheel   4096 Aug 17  2023 tsearch_data
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share/postgresql>

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share/postgresql/tsearch_data> ll
    total 27004
    -rw------- 1 omm wheel      424 Aug 17  2023 danish.stop
    -rw------- 1 omm wheel 13245765 Aug 17  2023 dict.gbk.xdb
    -rw------- 1 omm wheel 14315393 Aug 17  2023 dict.utf8.xdb
    -rw------- 1 omm wheel      453 Aug 17  2023 dutch.stop
    -rw------- 1 omm wheel      622 Aug 17  2023 english.stop
    -rw------- 1 omm wheel     1579 Aug 17  2023 finnish.stop
    -rw------- 1 omm wheel      805 Aug 17  2023 french.stop
    -rw------- 1 omm wheel     1349 Aug 17  2023 german.stop
    -rw------- 1 omm wheel     1227 Aug 17  2023 hungarian.stop
    -rw------- 1 omm wheel      242 Aug 17  2023 hunspell_sample.affix
    -rw------- 1 omm wheel      465 Aug 17  2023 ispell_sample.affix
    -rw------- 1 omm wheel       81 Aug 17  2023 ispell_sample.dict
    -rw------- 1 omm wheel     1654 Aug 17  2023 italian.stop
    -rw------- 1 omm wheel      851 Aug 17  2023 norwegian.stop
    -rw------- 1 omm wheel     1267 Aug 17  2023 portuguese.stop
    -rw------- 1 omm wheel     3714 Aug 17  2023 rules.gbk.ini
    -rw------- 1 omm wheel     4396 Aug 17  2023 rules.utf8.ini
    -rw------- 1 omm wheel     1235 Aug 17  2023 russian.stop
    -rw------- 1 omm wheel     2178 Aug 17  2023 spanish.stop
    -rw------- 1 omm wheel      559 Aug 17  2023 swedish.stop
    -rw------- 1 omm wheel       73 Aug 17  2023 synonym_sample.syn
    -rw------- 1 omm wheel      473 Aug 17  2023 thesaurus_sample.ths
    -rw------- 1 omm wheel      260 Aug 17  2023 turkish.stop
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/core/share/postgresql/tsearch_data>

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm> ll
    total 0
    drwx------ 2 omm wheel 60 Aug  7 15:15 cm_agent
    drwx------ 2 omm wheel 27 Aug  7 15:17 cm_server
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm>
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm> cd cm_server/
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm/cm_server> ll
    total 4
    -rw------- 1 omm wheel 46 Aug  7 15:17 cm_server.pid
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm/cm_server> cd ../cm_agent/
    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm/cm_agent> ll
    total 20
    -rw------- 1 omm wheel   45 Aug  7 15:15 cm_agent.pid
    -rw------- 1 omm wheel 5580 Oct 26  2023 cm.conf
    -rw------- 1 omm wheel 5580 Oct 26  2023 cm.conf.bak
- Under the DWS installation path, look at gtm.conf and cm.conf to get a rough idea of what each contains
View gtm.conf:

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/gtm> cat gtm.conf
    # ----------------------
    # GTM configuration file
    # ----------------------
    #
    # This file must be placed on gtm working directory
    # specified by -D command line option of gtm or gtm_ctl. The
    # configuration file name must be "gtm.conf"
    #
    #
    # This file consists of lines of the form
    #
    #   name = value
    #
    # (The "=" is optional.) Whitespace may be used. Comments are
    # introduced with "#" anywhere on a line. The complete list of
    # parameter names and allowed values can be found in the
    # Postgres-XC documentation.
    #
    # The commented-out settings shown in this file represent the default
    # values.
    #
    # Re-commenting a setting is NOT sufficient to revert it to the default
    # value.
    #
    # You need to restart the server.

    #------------------------------------------------------------------------------
    # GENERAL PARAMETERS
    #------------------------------------------------------------------------------
    nodename = 'gtm_1002'                         # Specifies the node name.
                                                  # (changes requires restart)
    listen_addresses = 'localhost,192.168.5.203'  # Listen addresses of this GTM.
                                                  # (changes requires restart)
    port = 25306                                  # Port number of this GTM.
                                                  # (changes requires restart)

    #------------------------------------------------------------------------------
    # ERROR REPORTING AND LOGGING
    #------------------------------------------------------------------------------
    log_directory = '/var/log/Bigdata/mpp/omm/pg_log/gtm'  # directory where log files are written,
                                                  # can be absolute or relative.
    #log_file = 'gtm-%Y-%m-%d_%H%M%S.log'         # Log file name
    #log_min_messages = WARNING                   # log_min_messages. Default WARNING.
                                                  # Valid value: DEBUG, DEBUG5, DEBUG4, DEBUG3,
                                                  # DEBUG2, DEBUG1, INFO, NOTICE, WARNING,
                                                  # ERROR, LOG, FATAL, PANIC.

    #------------------------------------------------------------------------------
    # GTM STANDBY PARAMETERS
    #------------------------------------------------------------------------------
    #Those parameters are effective when GTM is activated as a standby server
    active_host = '192.168.5.205'                 # Listen address of active GTM.
                                                  # (changes requires restart)
    active_port = 25305                           # (changes requires restart)
    local_host = '192.168.5.203'                  # Listen address of HA local host.
                                                  # (changes requires restart)
    local_port = 25307                            # (changes requires restart)

    #---------------------------------------
    # OTHER OPTIONS
    #---------------------------------------
    enable_alarm = on
    enable_connect_control = true                 # check ip.
    #standby_connection_timeout = 7               # standby connect timeout.
    #keepalives_idle = 0                          # Keepalives_idle parameter.
    #keepalives_interval = 0                      # Keepalives_interval parameter.
    #keepalives_count = 0                         # Keepalives_count internal parameter.
    #synchronous_backup = auto                    # If backup to standby is synchronous
                                                  # off, on or auto.
    #wlm_max_mem = 2048                           # Maximum memory an instance can use for its executions, unit: MB.
                                                  # (changes requires restart)
    #query_memory_limit = 0.25                    # Sets the percentage limit of memory a query can use.
                                                  # (changes requires restart)
    alarm_component = '/opt/huawei/Bigdata/mppdb/snas_cm_cmd'
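The header of gtm.conf describes its own format: `name = value` lines, an optional `=`, and `#` comments anywhere on a line. A minimal sketch of reading one active setting under those rules follows. `conf_get` is a hypothetical helper, not a DWS tool. It strips comments naively, so a `#` inside a quoted value would be cut off, and letting the last occurrence of a name win is this sketch's assumption.

```shell
# Sketch only: print the active (uncommented) value of one setting.
# Usage: conf_get <file> <name>
conf_get() {
    awk -v key="$2" -v q="'" '
        {
            line = $0
            sub(/#.*/, "", line)              # drop comments (ignores quoting)
            sub(/^[ \t]+/, "", line)
            if (line == "") next
            if (match(line, /^[A-Za-z_][A-Za-z0-9_]*/) == 0) next
            name = substr(line, 1, RLENGTH)
            val  = substr(line, RLENGTH + 1)
            sub(/^[ \t]*=?[ \t]*/, "", val)   # the "=" is optional
            sub(/[ \t\r]+$/, "", val)
            gsub("^" q "|" q "$", "", val)    # strip surrounding single quotes
            if (name == key) { found = 1; result = val }
        }
        END { if (found) print result; exit !found }
    ' "$1"
}
```

Against the file shown above, `conf_get /opt/huawei/Bigdata/mppdb/gtm/gtm.conf active_host` would print `192.168.5.205`. A commented-out name such as `log_min_messages` yields no output and a non-zero exit status.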
View cm.conf:

    omm@host-192-168-5-203:/opt/huawei/Bigdata/mppdb/cm/cm_agent> cat cm.conf
    #--------------------------------------------------------------------------------------------------
    # LOG
    #--------------------------------------------------------------------------------------------------
    # Default: cm_agent data dir.
    cm_agent_log_dir = '/var/log/Bigdata/mpp/omm/cm/cm_agent'
    #
    # Default: cm_server data dir.
    cm_server_log_dir = '/var/log/Bigdata/mpp/omm/cm/cm_server'

    # Valid values: DEBUG5, DEBUG1, WARNING, ERROR, LOG, FATAL.
    # Default: WARNING
    log_min_messages = WARNING

    # Only support MB.
    # Default: 16MB.
    log_file_size = 16MB

    #--------------------------------------------------------------------------------------------------
    # ALARM
    #--------------------------------------------------------------------------------------------------
    alarm_component = '/opt/huawei/Bigdata/mppdb/snas_cm_cmd'

    # Default: 3
    alarm_report_interval = 3

    #--------------------------------------------------------------------------------------------------
    # TIMEOUT
    #--------------------------------------------------------------------------------------------------
    # Default: 30
    # Minimum: 8
    instance_heartbeat_timeout = 30

    # Default: 600
    coordinator_heartbeat_timeout = 600

    #--------------------------------------------------------------------------------------------------
    # THREAD POOL
    #--------------------------------------------------------------------------------------------------
    # Default: 10
    # Range : [2, 255]
    thread_count = 10

    #--------------------------------------------------------------------------------------------------
    # ABNORMAL CHECK
    #--------------------------------------------------------------------------------------------------
    # Default: on
    enable_abnormal_check = on

    abnormal_check_memory_usage = '{ "_name" : "libac_memory_usage.so", "check_interval" : "60", "usage_threshold" : "70", "check_count" : "10" }'
    abnormal_check_general_task = '{ "_name" : "libac_general_task.so", "check_interval" : "3600" }'
    abnormal_check_create_table = '{ "_name" : "libac_create_table.so", "check_interval" : "150", "check_count" : "6" }'
    abnormal_check_phony_dead = '{ "_name" : "libac_phony_dead.so", "check_interval" : "180", "phony_dead_effective_time" : "5", "cmserver_phony_dead_restart_interval" : "21600" }'

    #--------------------------------------------------------------------------------------------------
    # STORAGE
    #--------------------------------------------------------------------------------------------------
    # Default: on
    enable_transaction_read_only = on

    # Default: 600
    datastorage_threshold_check_interval = 600

    # Default: 90
    datastorage_threshold_value_check = 90

    # Default: 43200
    max_datastorage_threshold_check = 43200

    #-------------------------------------------------------------------------------------