1. Kubernetes Cluster Architecture
Master Node
The master node is responsible for managing the Kubernetes cluster, storing information regarding the different
nodes, planning which containers go where, and monitoring the nodes and containers. The master node performs
these tasks using a set of components known as the control plane components.
etcd (Key-Value Store)
etcd is a database that stores information in a key-value format. Because containers are constantly being created
and removed, Kubernetes needs to maintain information about the different nodes, which container is running on
which node, and when it was scheduled.
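As a minimal sketch of this key-value model, using the etcdctl client covered later in these notes (assuming a local,
reachable etcd endpoint; the key name is hypothetical):
# Store a value under a key
ETCDCTL_API=3 etcdctl put /nodes/node01 "registered"
# Read it back
ETCDCTL_API=3 etcdctl get /nodes/node01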
Scheduler
The scheduler identifies the right node on which to place a container based on the container’s resource
requirements, the worker nodes’ available capacity, and other policies and constraints such as taints and
tolerations or node affinity rules.
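For example, taints and node labels (the basis for node affinity and nodeSelector rules) are set with standard
kubectl commands; the node name and labels below are illustrative:
# Repel pods that do not tolerate this taint from node01
kubectl taint nodes node01 app=blue:NoSchedule
# Label node01 so affinity/selector rules can target it
kubectl label nodes node01 size=large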
Controllers
In Kubernetes, controllers manage different aspects of the cluster. The Node Controller is responsible for
onboarding new nodes to the cluster and for handling situations where nodes become unavailable or are destroyed.
The Replication Controller ensures that the desired number of containers are running at all times in a replication
group.
Kube API Server
The Kube API server is the primary management component of Kubernetes. It orchestrates all operations within
the cluster. It exposes the Kubernetes API, which is used by external users to perform management operations, as
well as by controllers to monitor the state of the cluster and make necessary changes. The worker nodes also use it
for communication.
Worker Node
Worker nodes are responsible for running containerized applications.
Kubelet
The kubelet acts as the captain of the ship, managing all activities on a node. It listens for instructions from the
Kube API server, deploys and destroys containers as required, and sends reports about node and container status.
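On a kubeadm-provisioned node, the kubelet typically runs as a systemd service rather than a pod, so it can be
inspected directly on the node (a quick sketch, assuming systemd):
# Check the kubelet service status
systemctl status kubelet
# See the flags the kubelet was started with
ps -ef | grep kubelet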
Kube Proxy
The Kube Proxy service ensures that the necessary networking rules are in place on each node so that applications
can reach services running anywhere in the cluster.
Container Runtime
The container runtime is required on all nodes to run containers. It can be Docker, Containerd, or any other
supported runtime.
Summary
1. The Master Node manages cluster operations via etcd, the Kube API server, the scheduler, and
controllers.
2. The Worker Nodes run applications with components like Kubelet, Kube Proxy, and the container
runtime.
3. Networking & Communication are handled by Kube Proxy, enabling seamless connectivity between
services.
2. Docker vs Containerd
Introduction
Docker and Containerd are both essential tools in the container ecosystem. Docker initially dominated the
container landscape, but Kubernetes later introduced the Container Runtime Interface (CRI) to support multiple
runtimes. As a result, Containerd emerged as a lightweight alternative for Kubernetes container management.
Docker
Docker is a complete containerization platform that provides tools for building, sharing, and running containerized
applications.
Components of Docker
Docker CLI: A command-line interface to interact with Docker.
Docker API: Provides programmatic access to Docker’s functionality.
Build Tools: Tools to build container images.
Networking & Security: Manages networking, authentication, and security policies.
Container Runtime (runC): Responsible for running container processes.
Containerd: A daemon that manages runC and orchestrates container lifecycle operations.
Docker’s Role in Kubernetes
Initially, Kubernetes was built specifically to work with Docker.
Docker lacked CRI support, leading Kubernetes to introduce dockershim, a temporary bridge to continue
supporting Docker.
From Kubernetes v1.24, dockershim was removed, meaning Docker is no longer directly supported as a
runtime.
Containerd
Containerd is a lightweight container runtime that manages the container lifecycle, including pulling images,
starting and stopping containers, and handling storage and networking.
Features of Containerd
CRI-Compatible: Works directly with Kubernetes as a container runtime.
Efficient: Provides only essential runtime features without unnecessary overhead.
Independence: Can run without Docker and integrates seamlessly with Kubernetes.
CNCF Graduated Project: Developed as an independent project under the Cloud Native Computing
Foundation (CNCF).
Containerd CLI Tools
1. ctr
o A basic CLI tool used for debugging Containerd.
o Has a limited set of features.
o Used to pull and run images but is not user-friendly for production.
2. nerdctl
o A user-friendly CLI alternative that works like Docker.
o Supports advanced features such as encrypted container images and lazy pulling.
o Ideal for managing containers when using Containerd directly.
3. crictl
o A Kubernetes tool for interacting with CRI-compatible runtimes.
o Used for debugging and troubleshooting container runtimes in Kubernetes.
o Unlike Docker CLI, it also interacts with pods in Kubernetes.
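For reference, roughly how each tool is invoked (illustrative commands; the image and flags are examples, not
requirements):
# ctr requires fully qualified image references
ctr images pull docker.io/library/nginx:latest
ctr run docker.io/library/nginx:latest nginx-test
# nerdctl mirrors Docker's syntax, including port mapping
nerdctl run -d -p 8080:80 nginx
# crictl inspects containers and pods via the CRI
crictl ps
crictl pods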
Comparison: Docker vs. Containerd
Feature                  | Docker                        | Containerd
-------------------------|-------------------------------|------------------------------------------
CLI Tool                 | docker                        | ctr, nerdctl
Kubernetes Compatibility | Removed as a runtime in v1.24 | CRI-compatible
Usability                | User-friendly                 | ctr (debugging), nerdctl (user-friendly)
Feature Scope            | Complete platform             | Lightweight runtime
CNCF Status              | No                            | Graduated project
Summary
Docker is a comprehensive containerization platform but is no longer supported as a runtime in
Kubernetes.
Containerd is a lightweight alternative that integrates seamlessly with Kubernetes and supports CRI.
CLI tools like ctr (for debugging), nerdctl (Docker-like usability), and crictl (Kubernetes debugging) provide
various levels of interaction with Containerd.
Moving forward, Kubernetes users should rely on Containerd or other CRI-compatible runtimes for
container orchestration.
3. etcd in Kubernetes
Role of etcd in Kubernetes
etcd serves as the primary data store for Kubernetes, storing critical cluster information such as:
Nodes
Pods
ConfigMaps
Secrets
Service Accounts
Roles and Role Bindings
Whenever you run kubectl get commands, the displayed information is retrieved from etcd. Any cluster changes,
such as adding nodes, deploying pods, or modifying resources, are first recorded in etcd before they are
considered complete.
Deployment of etcd
etcd can be deployed in two ways:
1. Setting up Kubernetes from Scratch
You manually download, install, and configure etcd as a service on the master node.
Various options must be configured, many of which relate to TLS certificates.
One crucial configuration is the advertise client URL (the --advertise-client-urls option), which defines the
address on which etcd listens for clients (default port: 2379).
The Kubernetes API server must be configured to communicate with etcd using this URL.
2. Setting up Kubernetes using kubeadm
kubeadm automates the deployment of etcd as a Pod in the kube-system namespace.
You can explore the etcd database using the etcdctl utility within this pod.
Exploring etcd Database
To list all keys stored in etcd, use:
ETCDCTL_API=3 etcdctl get / --prefix --keys-only
Kubernetes follows a structured directory hierarchy:
Root Directory: /registry
Subdirectories: Nodes (/registry/minions), Pods, ReplicaSets, Deployments, etc.
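For example, the keys returned by the command above typically follow patterns like these (the node, namespace,
and object names are hypothetical):
/registry/minions/node01
/registry/pods/kube-system/etcd-master
/registry/deployments/default/myapp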
High Availability etcd Setup
In HA environments, multiple master nodes host separate etcd instances.
These instances must be aware of each other, which is configured in the etcd service using the --initial-cluster
parameter.
Properly setting up etcd in HA mode ensures redundancy and fault tolerance.
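A sketch of the relevant service flags for a two-member etcd cluster (the member names and IPs are placeholders):
etcd --name etcd-1 \
  --initial-advertise-peer-urls https://10.0.0.1:2380 \
  --advertise-client-urls https://10.0.0.1:2379 \
  --initial-cluster etcd-1=https://10.0.0.1:2380,etcd-2=https://10.0.0.2:2380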
ETCDCTL Utility
etcdctl is the CLI tool used to interact with etcd. It supports two API versions:
Version 2 (the default in older etcd releases)
Version 3 (recommended; the default from etcd v3.4 onward)
Version 2 Commands:
etcdctl backup
etcdctl cluster-health
etcdctl mk
etcdctl mkdir
etcdctl set
Version 3 Commands:
etcdctl snapshot save
etcdctl endpoint health
etcdctl get
etcdctl put
To use API version 3, set:
export ETCDCTL_API=3
Authentication with etcd
When interacting with etcd, specify certificate files for authentication:
--cacert /etc/kubernetes/pki/etcd/ca.crt
--cert /etc/kubernetes/pki/etcd/server.crt
--key /etc/kubernetes/pki/etcd/server.key
Example command:
kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl get / --prefix --keys-only --limit=10 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"
Summary
etcd stores all critical Kubernetes cluster data.
It can be manually deployed or automatically set up using kubeadm.
The advertise client URL (--advertise-client-urls, default port 2379) is crucial for API server communication.
In HA clusters, multiple etcd instances synchronize data across master nodes.
The etcdctl utility helps manage etcd, with version-specific commands and authentication requirements.
Understanding etcd is key to managing and troubleshooting Kubernetes effectively.
4. kube-apiserver in Kubernetes
Role of kube-apiserver in Kubernetes
kube-apiserver is the central management component of Kubernetes. It serves as the primary entry point for all
administrative operations and is responsible for:
Authentication and authorization of requests
Validating and processing resource configurations
Communicating with the etcd datastore
Managing interactions between control plane components
Exposing the Kubernetes API for internal and external clients
All interactions within the Kubernetes cluster, including pod scheduling and node status updates, pass through
kube-apiserver.
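Because kube-apiserver exposes a standard HTTP API, it can also be queried without kubectl. One common
approach is kubectl proxy, which handles authentication locally (a sketch; the port is arbitrary):
# Open an authenticated local proxy to the API server
kubectl proxy --port=8001 &
# Query the API over plain HTTP via the proxy
curl http://localhost:8001/api/v1/namespaces/default/pods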
How kube-apiserver Works
1. A user or component sends a request to the kube-apiserver.
2. The request is authenticated and validated.
3. The kube-apiserver updates the etcd datastore with the new configuration.
4. Other components (such as the scheduler or controller-manager) observe the change and take necessary
actions.
5. kube-apiserver keeps all components informed by continuously monitoring etcd for updates.
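You can observe this flow directly: raising kubectl’s verbosity prints the HTTP requests it sends to kube-apiserver:
# -v=8 logs the API requests and responses behind the command
kubectl get pods -v=8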
Deployment of kube-apiserver
kube-apiserver can be deployed in two ways:
1. Setting up Kubernetes from Scratch
kube-apiserver is downloaded manually from the Kubernetes release page.
It is configured to run as a system service on the master node.
The configuration includes multiple startup options, such as authentication and authorization settings.
2. Setting up Kubernetes using kubeadm
kubeadm automatically deploys kube-apiserver as a Pod in the kube-system namespace.
The pod’s configuration is stored at /etc/kubernetes/manifests/kube-apiserver.yaml.
Key kube-apiserver Configuration Parameters
kube-apiserver is launched with numerous command-line options. Some of the important ones include:
--etcd-servers=<ETCD_URL> → Specifies the location of the etcd cluster.
--authorization-mode=RBAC → Enables role-based access control.
--enable-admission-plugins=... → Defines admission controllers for resource validation.
--client-ca-file → Specifies the client certificate authority for authentication.
--tls-cert-file and --tls-private-key-file → Secure API communications with SSL/TLS.
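On a kubeadm cluster, you can confirm how these options are set by searching the static pod manifest (a quick
sketch):
# Show selected startup flags from the API server manifest
grep -E "etcd-servers|authorization-mode" /etc/kubernetes/manifests/kube-apiserver.yaml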
Interacting with kube-apiserver
kube-apiserver can be queried using kubectl or directly via HTTP API calls.
To view cluster resources:
kubectl get pods
kubectl get nodes
kubectl get deployments
For debugging purposes, you can inspect running API server configurations:
In kubeadm-based deployments: Check /etc/kubernetes/manifests/kube-apiserver.yaml
In manual deployments: Inspect /etc/systemd/system/kube-apiserver.service
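In either case, the options the server is actually running with can be read from the live process on the master node:
ps -ef | grep kube-apiserver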
High Availability Setup
In HA environments, multiple kube-apiserver instances are deployed across master nodes.
A load balancer or DNS mechanism distributes API requests among these instances.
Each kube-apiserver instance connects to the same etcd cluster to ensure consistency.
Summary
kube-apiserver is the core API gateway for Kubernetes, handling authentication, validation, and resource
management.
It directly interacts with etcd to store and retrieve cluster state information.
kube-apiserver is automatically deployed using kubeadm but can also be manually configured.
The --etcd-servers parameter is crucial for connecting kube-apiserver to the etcd datastore.
In HA setups, multiple kube-apiserver instances work behind a load balancer for redundancy.
Understanding kube-apiserver is essential for managing and troubleshooting Kubernetes effectively.
5. Kube Controller Manager in Kubernetes
Role of Kube Controller Manager
The Kube Controller Manager is responsible for managing multiple controllers in Kubernetes. Controllers
continuously monitor and manage the state of various cluster components to ensure they remain in the desired
state.
Key Responsibilities of Controllers
Continuously watch the state of Kubernetes objects.
Take corrective actions when discrepancies arise to maintain the desired state.
Communicate with the Kube API Server to apply changes.
Types of Controllers
Some key controllers managed by the Kube Controller Manager include:
1. Node Controller
Monitors the health of worker nodes.
Checks node status every 5 seconds.
Marks a node as unreachable if no heartbeat is received for 40 seconds.
Waits 5 minutes before reassigning pods to healthy nodes if the unreachable node does not recover.
2. Replication Controller
Ensures the desired number of pod replicas are always running.
If a pod dies, a new one is created to maintain the specified replica count (see the sketch after this list).
3. Other Controllers
Deployment Controller: Manages rolling updates and rollbacks.
Service Controller: Ensures services are correctly assigned to the right endpoints.
Namespace Controller: Handles lifecycle events of namespaces.
Persistent Volume Controller: Manages the lifecycle of persistent storage volumes.
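To see this reconciliation in action, delete a pod owned by a Deployment and watch a replacement appear
(illustrative; the deployment name is hypothetical, and the pod name must be copied from your own cluster):
# Create a deployment with three replicas
kubectl create deployment web --image=nginx --replicas=3
# Delete one of its pods; the controller restores the replica count
kubectl delete pod <one-of-the-web-pods>
kubectl get pods --watch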
Deployment of Kube Controller Manager
The Kube Controller Manager is packaged as a single process that runs multiple controllers. There are two
deployment methods:
1. Setting up Kubernetes from Scratch
Download and extract Kube Controller Manager from the Kubernetes release page.
Run it as a service with appropriate configurations.
Configure options such as node monitoring period, grace period, and eviction timeout.
2. Setting up Kubernetes using kubeadm
kubeadm deploys the Kube Controller Manager as a Pod in the kube-system namespace.
You can view its configuration in /etc/kubernetes/manifests/kube-controller-manager.yaml.
Configuring the Kube Controller Manager
The Kube Controller Manager is configured using various command-line options:
Node Monitor Period: Specifies how frequently node health is checked.
Grace Period: Defines the waiting period before marking a node as unreachable.
Eviction Timeout: Determines when pods should be reassigned if a node remains unhealthy.
Controllers Option: Allows enabling or disabling specific controllers.
o By default, all controllers are enabled.
o If a specific controller is not working, checking this setting is a good troubleshooting step.
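As a sketch, these options map to kube-controller-manager flags such as the following (the values shown are the
commonly cited defaults):
kube-controller-manager \
  --node-monitor-period=5s \
  --node-monitor-grace-period=40s \
  --pod-eviction-timeout=5m0s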
Viewing Kube Controller Manager Configuration
For kubeadm-based Deployments
The Kube Controller Manager runs as a pod in the kube-system namespace.
View configuration in /etc/kubernetes/manifests/kube-controller-manager.yaml.
For Non-kubeadm Setups
The service configuration is stored in the system services directory.
View running process options by listing processes on the master node:
ps -ef | grep kube-controller-manager
Summary
The Kube Controller Manager manages various controllers that ensure Kubernetes resources are in the
desired state.
Controllers such as the Node Controller and Replication Controller perform essential monitoring and
recovery tasks.
It can be deployed manually or via kubeadm.
Configuration options allow tuning behavior, and troubleshooting starts by inspecting these settings.
Understanding the Kube Controller Manager is crucial for maintaining the stability and reliability of a Kubernetes
cluster.