Notes on Building a Production Kubernetes Cluster

This tutorial is a detailed walkthrough of deploying Kubernetes with kubeadm. We purchased five ECS servers on Alibaba Cloud; the VPC we created uses the CIDR 172.16.0.0/12.

I. Installing Docker and Docker Compose

Installing Docker also installs the containerd container runtime by default, and Kubernetes supports containerd directly as its runtime.

1. Install Docker

 

sudo yum install -y docker-ce-24.0.6 docker-ce-cli-24.0.6 containerd.io

sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": [
    "https://docker.m.daocloud.io",
    "https://hfxvwdfx.mirror.aliyuncs.com",
    "https://dockerproxy.com",
    "https://proxy.1panel.live",
    "https://dockerproxy.cn",
    "https://hub1.nat.tf",
    "https://docker.ketches.cn",
    "https://hub2.nat.tf",
    "https://docker.6252662.xyz"
  ],
  "insecure-registries": ["https://ude7leho.mirror.aliyuncs.com"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "100m", "max-file": "1"}
}
EOF

sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
sudo systemctl start docker
sudo docker version --format '{{.Server.Version}}'

Docker's default container subnet (172.17.0.0/16) falls inside the VPC CIDR 172.16.0.0/12, so the two overlap and Docker's address pools have to be changed.

The two existing servers that already run Docker containers are configured with the following address pools:

 

{ "bip": "192.168.0.1/24", "default-address-pools": [ {"base": "10.10.0.0/16","size": 24} ] } { "bip": "192.168.1.1/24", "default-address-pools": [ {"base": "10.11.0.0/16","size": 24} ] }

Assign node1 through node5 the bip values 192.168.2.1/24 through 192.168.6.1/24 respectively, and update each node's Docker configuration.

 

Add bip and default-address-pools to /etc/docker/daemon.json (node5 shown here as an example):

"bip": "192.168.6.1/24",
"default-address-pools": [
  { "base": "10.11.0.0/16", "size": 24 }
]
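After editing daemon.json, Docker has to be restarted for the new pools to take effect (note this restarts running containers unless live-restore is enabled). A quick sanity check, as a sketch:

# Restart Docker and confirm the bridge picked up the new address
sudo systemctl restart docker
ip -4 addr show docker0
sudo docker network inspect bridge --format '{{(index .IPAM.Config 0).Subnet}}'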

2. Install Docker Compose

 

curl -SL https://github.com/docker/compose/releases/download/v2.18.1/docker-compose-$(uname -s)-$(uname -m) -o docker-compose
mv docker-compose /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
docker-compose --version

II. Installing Kubernetes

A Kubernetes cluster created by kubeadm depends on software that uses kernel features.
This software includes, but is not limited to, the container runtime, the kubelet, and the Container Network Interface (CNI) plugin.

Node inventory

IP (internal)      Hostname     Spec          Role
172.22.162.243     k8s-node1    4C/16G/100G   worker
172.22.162.245     k8s-node2    4C/16G/100G   worker
172.22.162.244     k8s-node3    4C/8G/100G    master
172.22.162.242     k8s-node4    4C/8G/100G    master
172.22.162.241     k8s-node5    4C/8G/100G    master

Configure /etc/hosts

 

cat >> /etc/hosts << EOF
172.22.162.243 k8s-node1
172.22.162.245 k8s-node2
172.22.162.244 k8s-node3
172.22.162.242 k8s-node4
172.22.162.241 k8s-node5
172.22.162.241 cluster-endpoint
EOF

  1. The mapping of master-ip to cluster-endpoint pairs with the --control-plane-endpoint=cluster-endpoint argument passed to kubeadm init; this argument must be specified when setting up a highly available cluster.
  2. Once the HA cluster has been deployed successfully, the mapped IP can be changed to a load balancer address, as sketched below.
  3. For now the mapped IP is the address of the first master node to be deployed.
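For example, once a load balancer fronts the API servers, the mapping on every node could be repointed at it; the address 10.0.0.100 below is purely a placeholder:

# Hypothetical load-balancer address; replace with the real one
sed -i 's/^172.22.162.241 cluster-endpoint$/10.0.0.100 cluster-endpoint/' /etc/hosts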

1. Prerequisites

  • Linux hosts with at least 2 GB of RAM each
  • Control plane nodes need at least 2 CPU cores
  • Full network connectivity between all machines (public or private network is fine)
  • No duplicate hostnames, MAC addresses, or product_uuid values across the nodes
  • Certain ports must be open on the machines
  • Swap must be dealt with: by default the kubelet fails to start if swap is detected on a node

Run all of the following steps on every node.

2. Check MAC address and product_uuid uniqueness

Physical hardware has unique addresses, but some virtual machines may share them; if these values are not unique on every node, the installation may fail.

 

# Check MAC addresses
ip link
ifconfig -a
# Check the product_uuid
cat /sys/class/dmi/id/product_uuid

3. Check required ports

  • Port 6443 must be open; the Kubernetes API server accepts inbound requests on it.

  • The firewall has already been disabled on these servers, so every port is reachable inside the VPC (a quick reachability check is sketched below).
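As a sketch, from any other node you can probe the first master (address taken from the inventory above). Nothing listens on 6443 until kubeadm init has run, so before that a "connection refused" still proves the network path is open, while a timeout suggests a firewall or security-group block:

nc -zv 172.22.162.241 6443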

4. Configure kernel parameters: enable IPv4 forwarding and let iptables see bridged traffic

  • Enable the overlay filesystem; Kubernetes needs it to manage container images and storage.
  • Make Linux bridged networking honor iptables firewall rules. Linux bridges (such as the cni0 virtual bridge created by Docker or Kubernetes) join the networks of multiple containers/Pods into one LAN; by default, bridged traffic does not pass through the host's iptables chains.
  • iptables is Linux's firewall tool; Kubernetes uses it for Service load balancing, network policies, and node traffic forwarding.
  • If bridged traffic bypassed iptables, Kubernetes Services and network policies would stop working. The two bridge-nf-call settings below force bridged traffic through the iptables/ip6tables chains.
 

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Set the required sysctl parameters; they persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
# Enable IPv4 packet forwarding in the kernel so the host can route traffic between interfaces
net.ipv4.ip_forward                 = 1
EOF

# Apply the sysctl parameters without rebooting
sudo sysctl --system

5. Configure yum and install common packages

 

yum install -y vim bash-completion net-tools gcc

6. Time synchronization

 

yum install -y chrony
# Point the time source at the National Time Service Center's NTP server
sed -i 's/^pool 2.centos.pool.ntp.org iburst$/pool ntp.ntsc.ac.cn iburst/' /etc/chrony.conf
systemctl start chronyd
systemctl enable chronyd
# Check which time servers are being used
chronyc sources

7. Disable the firewall

 

systemctl stop firewalld
systemctl disable firewalld

8. Disable swap

 

# Takes effect immediately (until reboot)
swapoff -a
# To make it permanent, edit /etc/fstab and delete the line below
# /dev/mapper/cl-swap swap swap defaults 0 0

9. Disable SELinux (set to permissive)

 

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

10. Set up passwordless SSH between nodes

ssh-copy-id copies the local SSH public key to a remote host (for example k8s-node1) so that you can log in without a password.

 

# Generate an SSH key pair
ssh-keygen
# Populate authorized_keys (run cross-wise from every node to every other node)
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node3
ssh-copy-id -i ~/.ssh/id_rsa.pub root@k8s-node4

11. containerd configuration

Switch the container runtime to containerd.

⚠️ Although containerd is installed together with Docker, the two store their images separately:

  • Docker stores its images under /var/lib/docker, while containerd uses /var/lib/containerd
  • Docker images can be exported by hand and imported into containerd, since both use the same (OCI) format; see the sketch below
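A minimal sketch of that manual copy, using nginx:1.25 purely as an example image assumed to exist in Docker's local store; the kubelet looks for images in containerd's k8s.io namespace:

docker save nginx:1.25 -o nginx.tar          # export from Docker's image store
ctr -n k8s.io images import nginx.tar        # import into the namespace the kubelet uses
crictl images | grep nginx                   # confirm containerd/CRI can see it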
 

# Point crictl at containerd's CRI socket
crictl config runtime-endpoint unix:///var/run/containerd/containerd.sock

Edit the containerd configuration file; if it does not exist yet, generate the default configuration first:

 

containerd config default > /etc/containerd/config.toml

Edit the configuration file: vim /etc/containerd/config.toml

 

[plugins."io.containerd.grpc.v1.cri".registry.mirrors] [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"] endpoint = ["https://docker.6252662.xyz","https://docker.m.daocloud.io","https://dockerproxy.com"] [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"] endpoint = ["https://docker.6252662.xyz","https://docker.m.daocloud.io","https://dockerproxy.com"]

Change the sandbox (pause) image address

 

grep sandbox_image /etc/containerd/config.toml

# Pick whichever sed matches the output above
sed -i "s#k8s.gcr.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml
# or
sed -i "s#registry.k8s.io/pause#registry.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml

# Confirm the change took effect
grep sandbox_image /etc/containerd/config.toml

# Restart containerd after all changes have been applied
systemctl restart containerd

# Check that the registry configuration is correct
crictl info | grep -A 20 "registry"

12. Switch to the systemd cgroup driver

The official recommendation is the systemd cgroup driver. SystemdCgroup defaults to false (which means the cgroupfs driver); set it to true to use the systemd driver.

 

sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
# Restart containerd after all changes have been applied
systemctl restart containerd

13. Install kubeadm, kubelet, and kubectl

  • kubeadm: the command that bootstraps the cluster.
  • kubelet: runs on every node in the cluster and starts Pods and containers.
  • kubectl: the command-line tool for talking to the cluster.

Configure the Kubernetes yum repository:
 

cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF

Install:
 

yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
systemctl enable --now kubelet

⚠️ Note: the kubelet will now restart every few seconds. This is normal; it is sitting in a crash loop waiting for instructions from kubeadm.
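If you want to see this for yourself (entirely optional), check the unit state and the last log lines; an activating/auto-restart status here is expected until kubeadm init or kubeadm join has run:

systemctl status kubelet --no-pager
journalctl -u kubelet --no-pager -n 20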

14. Create the cluster with kubeadm

Run kubeadm init on master1, which is the cluster-endpoint node:
 

# --apiserver-advertise-address sets the address the API server on this control plane node advertises.
# --image-repository specifies where to pull the control plane images from.
# --control-plane-endpoint sets a shared endpoint for all control plane nodes. Without it, a single-control-plane kubeadm cluster cannot later be upgraded to high availability.
# --upload-certs uploads the certificates shared between control plane instances to the cluster, so they can be reused when joining additional master nodes.
# --kubernetes-version specifies the Kubernetes version.
# --pod-network-cidr specifies the Pod network range. Different CNIs use different defaults; Calico's default is 192.168.0.0/16.
kubeadm init \
  --apiserver-advertise-address=172.22.162.241 \
  --image-repository registry.aliyuncs.com/google_containers \
  --control-plane-endpoint=cluster-endpoint \
  --upload-certs \
  --kubernetes-version v1.31.9 \
  --service-cidr=10.1.0.0/16 \
  --pod-network-cidr=192.168.0.0/16 \
  --v=5

On success the output looks like the following; save the kubeadm join strings:
 

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
    --discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a \
    --control-plane --certificate-key ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
    --discovery-token-ca-cert-hash sha256:ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf

  • cluster-endpoint:6443: the cluster API server endpoint
  • token: the authentication token for joining the cluster, valid for 24 hours
  • discovery-token-ca-cert-hash: hash of the CA certificate, used to validate the API server's identity
  • control-plane: marks the joining node as a control plane node
  • certificate-key: used to fetch the shared control plane certificates, expires after 2 hours

Regenerating the join commands by hand

Get a worker join command:

 

kubeadm token create --print-join-command

Get a master (control plane) join command:

 

First get the worker join string, then append --control-plane and --certificate-key <key>, where the key is the one printed by kubeadm init phase upload-certs --upload-certs.
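Putting the two pieces together, a regenerated control-plane join looks like this sketch (the token, hash, and key are placeholders for whatever your cluster prints):

kubeadm join cluster-endpoint:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<ca-cert-hash> \
  --control-plane --certificate-key <certificate-key>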

If the token has expired:

 

kubeadm token create
# yqiri0.oxybs1qfy7trixy3
kubeadm token create --print-join-command
# kubeadm join cluster-endpoint:6443 --token 0tx28e.kskomhxaz21efutu --discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a

If the certificate key has expired:

 

kubeadm init phase upload-certs --upload-certs
kubeadm token create --print-join-command

On master1, configure the kubectl kubeconfig and environment variable:
 

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Set the environment variable
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile

Check the node status; node5 is NotReady:
 

[root@k8s-node5 ~]# kubectl get nodes
NAME        STATUS     ROLES           AGE    VERSION
k8s-node5   NotReady   control-plane   115s   v1.28.2

Check the kubelet logs for the cause of the error:
 

journalctl -xeu kubelet
# Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

Install the CNI network plugin: Calico

You must deploy a Container Network Interface (CNI) based Pod network add-on so that your Pods can communicate with each other. Cluster DNS (CoreDNS) will not start up until a network is installed.

  • Install it with kubectl apply / kubectl create
  • After installing the Pod network, confirm that the CoreDNS Pod is Running; once CoreDNS is up and running you can continue joining nodes (a quick check is sketched below)
  • Our cluster is Kubernetes 1.31, and the compatible Calico release is 3.29.5
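A simple way to check, assuming the default kubeadm labels (CoreDNS pods carry k8s-app=kube-dns):

kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get pods -n kube-system | grep calico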

Download the Calico 3.29.5 manifest and apply it:

 

wget https://raw.githubusercontent.com/projectcalico/calico/v3.29.5/manifests/calico.yaml
kubectl create -f calico.yaml
kubectl get pod -n kube-system -o wide

If the installation fails partway through, delete all of the Calico resources and then rerun the installation commands above:

 

# Delete the DaemonSet and Deployment
kubectl delete daemonset calico-node -n kube-system
kubectl delete deployment calico-kube-controllers -n kube-system

# Delete the ServiceAccounts
kubectl delete serviceaccount calico-kube-controllers -n kube-system
kubectl delete serviceaccount calico-node -n kube-system
kubectl delete serviceaccount calico-cni-plugin -n kube-system

# Delete the ClusterRoles and ClusterRoleBindings
kubectl delete clusterrole calico-kube-controllers
kubectl delete clusterrole calico-node
kubectl delete clusterrole calico-cni-plugin
kubectl delete clusterrole calico-tier-getter
kubectl delete clusterrolebinding calico-kube-controllers
kubectl delete clusterrolebinding calico-node
kubectl delete clusterrolebinding calico-cni-plugin
kubectl delete clusterrolebinding calico-tier-getter

# Delete the ConfigMap
kubectl delete configmap calico-config -n kube-system

# Delete the PodDisruptionBudget
kubectl delete poddisruptionbudget calico-kube-controllers -n kube-system

# Delete the Calico CRDs
kubectl delete crd \
  bgpconfigurations.crd.projectcalico.org \
  bgpfilters.crd.projectcalico.org \
  bgppeers.crd.projectcalico.org \
  blockaffinities.crd.projectcalico.org \
  caliconodestatuses.crd.projectcalico.org \
  clusterinformations.crd.projectcalico.org \
  felixconfigurations.crd.projectcalico.org \
  globalnetworkpolicies.crd.projectcalico.org \
  globalnetworksets.crd.projectcalico.org \
  hostendpoints.crd.projectcalico.org \
  ipamblocks.crd.projectcalico.org \
  ipamconfigs.crd.projectcalico.org \
  ipamhandles.crd.projectcalico.org \
  ippools.crd.projectcalico.org \
  ipreservations.crd.projectcalico.org \
  kubecontrollersconfigurations.crd.projectcalico.org \
  networkpolicies.crd.projectcalico.org \
  networksets.crd.projectcalico.org \
  tiers.crd.projectcalico.org \
  adminnetworkpolicies.policy.networking.k8s.io

 

Alternatively, Calico can be (re)installed via the Tigera operator manifests:

kubectl apply -f tigera-operator.yaml
kubectl apply -f custom-resources.yaml

Join node3 and node4 to the cluster as control plane nodes

The certificate-key may have expired (it is valid for 2 hours); if so, regenerate it on master1, which is already part of the cluster:

 

kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
  --discovery-token-ca-cert-hash sha256:87219069b3e533e86dc28d62446daebe0628b76703aa740f149b032fa7690b6a \
  --control-plane --certificate-key ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf

  • Calico only needs to be installed once for the whole cluster
  • But kubectl still has to be initialized on each new master (same steps as on master1, shown below)
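On node3 and node4, after the join succeeds, set up kubectl the same way as on master1 (kubeadm writes /etc/kubernetes/admin.conf on joined control plane nodes as well):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes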

Join node1 and node2 to the cluster as workers

 

kubeadm join cluster-endpoint:6443 --token l3o830.bnrk2y3fsi40w10o \
  --discovery-token-ca-cert-hash sha256:ee9fff16e3f8a46dd31624e68df31065cfc8ecdd007ce7cb020a492e8952eedf

  • Worker nodes do not need kubectl, so the kubectl initialization can be skipped; worker nodes cannot list the cluster's nodes anyway
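From any master you can confirm the workers have joined; they report Ready once the Calico pods on them are running. The label commands are optional and purely cosmetic, so the ROLES column shows "worker":

kubectl get nodes -o wide
kubectl label node k8s-node1 node-role.kubernetes.io/worker=
kubectl label node k8s-node2 node-role.kubernetes.io/worker=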

Installing KubeSphere

Install Helm

  • Helm needs to be installed on a master node

1. Download the Helm release package from github.com/helm/helm/r…

2. Extract it, then move the binary: mv linux-amd64/helm /usr/local/bin/helm

3. Run helm version to verify the installation

4. Change the chart repository source; the Azure mirror (mirror.azure.cn) is recommended:

 

helm repo remove stable
helm repo add stable http://mirror.azure.cn/kubernetes/charts/
helm repo update

Install KubeSphere

 

helm upgrade --install \
  -n kubesphere-system \
  --create-namespace \
  ks-core \
  https://charts.kubesphere.com.cn/main/ks-core-1.1.4.tgz \
  --debug
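To watch the rollout and find out how the console is exposed (the exact service name and port depend on the chart's defaults, so treat this as a sketch):

kubectl get pods -n kubesphere-system
kubectl get svc -n kubesphere-system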
