[Hands-On Series] A Complete, Easy-to-Follow Guide to Installing and Configuring Greenplum 6.2.1


For more Greenplum technical content, visit the Greenplum Chinese community website.

On December 12, 2019, Pivotal released GP 6.2.1. This coincided with our company's GP cluster expansion and upgrade, for which we needed to settle on a version, so we installed GP 6 to run comparison tests against GP 5. This document follows the official installation guide step by step and points out where GP 6 installation differs from older versions.

Date: 2019-12-17
Version installed: Greenplum 6.2.1
Download: https://network.pivotal.io/products/pivotal-gpdb/#/releases/526878
Official installation guide: https://gpdb.docs.pivotal.io/6-2/install_guide/platform-requirements.html
Chinese community installation guide: https://greenplum.cn/2019/11/30/how-to-set-up-greenplum-6-1-cluster/

1. Hardware and Software Requirements and Dependency Installation

1.1 Hardware and Software

  1. OS: Red Hat 6.8
  2. Hardware: three VMs, each with 2 cores, 16 GB RAM, and a 50 GB disk
  3. Node plan for this lab: one master, four segments, four mirrors, no standby

Host IP         Hostname    Roles
172.28.25.201   mdw         master
172.28.25.202   sdw1        seg1, seg2, mirror3, mirror4
172.28.25.203   sdw2        seg3, seg4, mirror1, mirror2

1.2 Installing Required Dependencies

Differences from older versions:
GP 4.x: no dependency-check step during installation.
GP 5.x: installing from RPM requires checking dependencies manually.
GP 6.2: installing from RPM still requires the dependency check; installing with yum install resolves dependencies automatically, provided the host has internet access.
GP 6.x RPM installs need the dependency check beforehand, and the process requires internet access; for hosts on an isolated network, download the required packages first.


1.2.1 Installing Dependency Packages in Bulk (Internet Access Required)

Greenplum 5 installs were done with the rpm command; with Greenplum 6 you can simply use yum install, which pulls in the dependencies directly:

sudo yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib  openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel 
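To confirm that everything actually landed, a quick rpm query works; the loop below is just a sanity-check sketch (package names taken from the yum command above, trimmed to a representative subset; adjust the list to your platform's actual package names):

# Report any dependency that is still missing after the yum install
for pkg in apr apr-util bzip2 curl libevent libxml2 libyaml zlib openldap \
           openssl perl readline rsync sed tar zip krb5-devel; do
    rpm -q "$pkg" > /dev/null 2>&1 || echo "MISSING: $pkg"
done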

1.2.2 Isolated Hosts: Download Manually, Then Upload to the Server

Note: match the OS version and architecture; the VMs in this lab are el6.x86_64:

[root@mdw ~]# uname -a
Linux mdw 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 
x86_64 x86_64 x86_64 GNU/Linux

Download from:
http://rpmfind.net/linux/rpm2html/search.php

1.2.3 Downloading Packages on Linux for Offline Use

Requirements:
1. A host running the same OS version as the GP cluster
2. Internet access

yumdownloader --destdir ./ --resolve libyaml 
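yumdownloader comes from the yum-utils package. After copying the downloaded RPMs to the isolated host, they can be installed with local dependency resolution; a minimal sketch, assuming all the RPMs sit in the current directory:

# On the internet-connected host: make sure yumdownloader is available
yum install -y yum-utils
# On the isolated host: install the copied RPMs, resolving dependencies locally
yum localinstall -y ./*.rpm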

2. Configuring System Parameters

Differences from older versions:
GP 6 no longer ships the gpcheck utility, but gpinitsystem validates system parameters during initialization. Skipping the officially recommended values will not block installation, but it will degrade cluster performance.
  • System parameters must be changed as root, and the changes require a reboot; you can also make all the changes first and reboot once at the end.
  • It is easiest to change the parameters on the master host first; after GP is installed on the master and SSH trust is established, use gpscp and gpssh to push the changes to the other nodes in bulk.
  • Reference: https://gpdb.docs.pivotal.io/6-2/install_guide/prep_os.html


2.1 Disabling the Firewall

2.1.1 Checking SELinux (Security-Enhanced Linux)

Check as root:

[root@mdw ~]# sestatus
SELinux status:                 disabled

If SELinux status is anything other than disabled, edit /etc/selinux/config to set the line below, then reboot (or reboot once after all parameter changes):

SELINUX=disabled
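The edit can also be scripted; a hedged one-liner, assuming the stock layout of /etc/selinux/config:

# Force SELINUX=disabled in the config file (takes effect after reboot)
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# Optionally drop to permissive mode right away, without waiting for the reboot
setenforce 0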

2.1.2 Checking the iptables Status

[root@mdw ~]# /sbin/chkconfig --list iptables
iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off

If it is not off, disable it, then reboot (or reboot once after all parameter changes):

/sbin/chkconfig iptables off
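chkconfig only affects future boots; to stop the service that is already running, use the standard SysV service command as well:

# Stop the running iptables service immediately
service iptables stop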

2.1.3 Checking firewalld (usually not present on RHEL/CentOS 6)

[root@mdw ~]# systemctl status firewalld

If firewalld is off, the output is:

* firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)

If it is not off, disable it, then reboot (or reboot once after all parameter changes):

[root@mdw ~]# systemctl stop firewalld.service
[root@mdw ~]# systemctl disable firewalld.service

2.2 Configuring Hosts

2.2.1 Setting Each Machine's Hostname

Set the master's hostname to mdw; hostnames on the segment hosts are not strictly required, but it is good practice to rename every host.

The commonly recommended naming convention is:

  • Master: mdw
  • Standby master: smdw
  • Segment hosts: sdw1, sdw2 … sdwn

To change it:

# Temporary change
hostname mdw
# Permanent change
vi /etc/sysconfig/network
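On RHEL 6 the permanent hostname lives in /etc/sysconfig/network; on the master the file would contain something like the following (HOSTNAME value shown for mdw; adjust on each host):

# /etc/sysconfig/network on the master host
NETWORKING=yes
HOSTNAME=mdw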

2.2.2 Configuring /etc/hosts

# Add each machine's IP and alias
[root@mdw ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.28.25.201 mdw
172.28.25.202 sdw1
172.28.25.203 sdw2

# Update the hosts file on every host in the cluster; log in to each host and run:
cat >> /etc/hosts << EOF
172.28.25.201 mdw
172.28.25.202 sdw1
172.28.25.203 sdw2
EOF
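A quick, hedged way to confirm that every alias resolves from the current host:

# Ping each alias once to verify /etc/hosts resolution
for h in mdw sdw1 sdw2; do
    ping -c 1 "$h" > /dev/null && echo "$h OK" || echo "$h UNREACHABLE"
done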

2.3 Configuring sysctl.conf

Tune the kernel parameters to match your hardware. (Before GP 5.0 the official docs gave fixed default values; since 5.0 they provide formulas for some of them.)

The officially recommended settings; after editing, reload with sysctl -p:

# kernel.shmall = _PHYS_PAGES / 2            # See Shared Memory Pages
kernel.shmall = 4000000000
# kernel.shmmax = kernel.shmall * PAGE_SIZE  # See Shared Memory Pages
kernel.shmmax = 500000000
kernel.shmmni = 4096
vm.overcommit_memory = 2                     # See Segment Host Memory
vm.overcommit_ratio = 95                     # See Segment Host Memory
net.ipv4.ip_local_port_range = 10000 65535   # See Port Settings
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0                # See System Memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296

2.3.1 Shared Memory

  • kernel.shmall = _PHYS_PAGES / 2
  • kernel.shmmax = kernel.shmall * PAGE_SIZE
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2)
2041774
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
8363106304
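The same two commands can feed /etc/sysctl.conf directly; a small sketch (it appends, so remove any existing shmall/shmmax lines first):

# Append the computed shared-memory values to /etc/sysctl.conf
echo "kernel.shmall = $(expr $(getconf _PHYS_PAGES) / 2)" >> /etc/sysctl.conf
echo "kernel.shmmax = $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))" >> /etc/sysctl.conf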

2.3.2 Host Memory

  • vm.overcommit_memory: the kernel uses this parameter to decide how much memory can be allocated to processes. For Greenplum, it should be set to 2.
  • vm.overcommit_ratio: the percentage of RAM that may be allocated to processes, with the remainder reserved for the operating system. The default on Red Hat is 50; 95 is recommended.

# Calculating vm.overcommit_ratio
vm.overcommit_ratio = (RAM - 0.026 * gp_vmem) / RAM
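Here gp_vmem is the memory available to Greenplum per host; the 6.x install guide derives it as ((SWAP + RAM) - (7.5GB + 0.05 * RAM)) / 1.7 for hosts with up to roughly 256 GB of RAM. A hedged back-of-the-envelope for the 16 GB VMs used in this lab, assuming 16 GB of swap:

# Rough overcommit_ratio estimate (all values in GB; RAM and SWAP are assumptions)
awk 'BEGIN { ram = 16; swap = 16;
             gp_vmem = ((swap + ram) - (7.5 + 0.05 * ram)) / 1.7;   # ~13.9 GB
             printf "vm.overcommit_ratio ~= %.0f\n", (ram - 0.026 * gp_vmem) / ram * 100 }'
# Prints ~98, in the same neighborhood as the recommended setting of 95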

2.3.3 Port Settings

To avoid port conflicts between Greenplum and other applications during initialization, set the ephemeral port range with net.ipv4.ip_local_port_range.

When initializing Greenplum with gpinitsystem, do not specify Greenplum database ports inside that range.

For example, with net.ipv4.ip_local_port_range = 10000 65535, set the Greenplum base port numbers to values outside it:

PORT_BASE = 6000
MIRROR_PORT_BASE = 7000
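To double-check the live ephemeral range before settling on base ports, query the kernel directly:

# Show the current ephemeral port range; PORT_BASE/MIRROR_PORT_BASE must fall outside it
sysctl -n net.ipv4.ip_local_port_range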

2.3.4 System Memory

For hosts with more than 64 GB of RAM, the recommended settings are:

vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736  # 1.5GB
vm.dirty_bytes = 4294967296             # 4GB

For hosts with 64 GB of RAM or less, remove the vm.dirty_background_bytes and vm.dirty_bytes settings (the byte values take precedence over the ratios) and set the following instead:

vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

Increase vm.min_free_kbytes to ensure that PF_MEMALLOC requests from network and storage drivers can be satisfied. This is especially important on systems with large amounts of memory, where the default is usually too low. The value, typically 3% of physical memory, can be computed and appended with awk:

awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf 

Do not set vm.min_free_kbytes higher than 5% of system memory; doing so may cause out-of-memory conditions.
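A hedged one-liner to see the 3% suggestion and the 5% ceiling side by side before committing the value:

# Print the suggested value (3%) and the upper bound (5%) from /proc/meminfo
awk '/MemTotal/ {printf "suggested: %.0f kB, ceiling: %.0f kB\n", $2 * 0.03, $2 * 0.05}' /proc/meminfo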

This lab runs Red Hat 6.8 with 16 GB of RAM; the resulting configuration is:

[root@mdw ~]# vi /etc/sysctl.conf
[root@mdw ~]# sysctl -p
kernel.shmall = 2041774
kernel.shmmax = 8363106304
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
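A brief spot check that the kernel picked the values up after sysctl -p:

# Spot-check a few of the reloaded parameters
sysctl kernel.shmall kernel.shmmax vm.overcommit_memory vm.overcommit_ratio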