k8s: Installing Single-Node Hadoop
Posted: 2025-05-19 18:12:38
### Deploying Hadoop to a Single-Node Kubernetes Environment
To deploy Hadoop successfully on a single-node Kubernetes setup, follow the steps below.
#### Prerequisites
Make sure Kubernetes and Docker are installed and configured. If not, consult the relevant documentation[^2].
Create a namespace to isolate the Hadoop resources:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: hadoop-cluster
```
Save the above to a file named `namespace-hadoop.yaml` and apply it:
```bash
kubectl apply -f namespace-hadoop.yaml
```
#### Create a PersistentVolume (PV) and PersistentVolumeClaim (PVC)
To persist data, define a PV and a PVC in Kubernetes. Here is an example YAML definition:
##### persistent-volume.yaml
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hadoop-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hadoop-pvc
  namespace: hadoop-cluster
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```
Apply this configuration file (note: if the cluster defines a default StorageClass, set a matching `storageClassName` on both the PV and the PVC so the claim binds to this volume instead of triggering dynamic provisioning):
```bash
kubectl apply -f persistent-volume.yaml
```
#### Build the Hadoop Docker Image
Since there is no official Hadoop image built specifically for Kubernetes, you may need to build one yourself. The first step is to download Hadoop and package it into a container image[^1].
Assuming the Hadoop archive has been extracted into a `hadoop-2.7.7/` directory inside the Docker build context, a custom Dockerfile might look like this:
##### Dockerfile
```dockerfile
# Ubuntu 20.04 still ships openjdk-8-jdk; "latest" may not
FROM ubuntu:20.04
ENV HADOOP_HOME=/usr/local/hadoop \
    JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 \
    PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-8-jdk ssh rsync && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /usr/local/
COPY ./hadoop-2.7.7 /usr/local/hadoop
# 9870 is the NameNode web UI port in Hadoop 3.x (Hadoop 2.x uses 50070);
# 8088 is the YARN ResourceManager UI. Adjust to match the version copied above.
EXPOSE 9870 8088
# Keep the container alive; start the Hadoop daemons manually or via a startup script
CMD ["tail", "-f", "/dev/null"]
```
Build the image and push it to a local or remote registry:
```bash
docker build -t my-repo/my-hadoop-image .
docker push my-repo/my-hadoop-image
```
#### Write the Deployment and Service Configuration
Next, write Kubernetes Deployment and Service configurations to start the Hadoop master node and the DataNodes.
##### deployment-hadoop-master.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hadoop-master-deployment
  namespace: hadoop-cluster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hadoop-master
  template:
    metadata:
      labels:
        app: hadoop-master
    spec:
      containers:
        - name: hadoop-master-container
          image: my-repo/my-hadoop-image
          volumeMounts:
            - mountPath: "/data"
              name: data-storage
      volumes:
        - name: data-storage
          persistentVolumeClaim:
            claimName: hadoop-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-master-service
  namespace: hadoop-cluster
spec:
  type: NodePort
  ports:
    - port: 9870
      targetPort: 9870
      nodePort: 30070
  selector:
    app: hadoop-master
```
##### deployment-hadoop-datanode.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hadoop-datanode-deployment
  namespace: hadoop-cluster
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hadoop-datanode
  template:
    metadata:
      labels:
        app: hadoop-datanode
    spec:
      containers:
        - name: hadoop-datanode-container
          image: my-repo/my-hadoop-image
          volumeMounts:
            - mountPath: "/data"
              name: data-storage
      volumes:
        - name: data-storage
          persistentVolumeClaim:
            claimName: hadoop-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-datanode-service
  namespace: hadoop-cluster
spec:
  clusterIP: None
  selector:
    app: hadoop-datanode
```
Apply these configuration files (`clusterIP: None` makes the DataNode service headless, so each DataNode pod gets its own DNS record instead of a load-balanced virtual IP):
```bash
kubectl apply -f deployment-hadoop-master.yaml
kubectl apply -f deployment-hadoop-datanode.yaml
```
Verify that the pods are running and the services are in the expected state:
```bash
kubectl get pods -n hadoop-cluster
kubectl get svc -n hadoop-cluster
```
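Beyond checking pod and service status, a quick HDFS smoke test run from inside the master pod confirms that the NameNode is actually serving requests. A sketch (the pod name below is a placeholder for the real name shown by `kubectl get pods`, and these commands assume the HDFS daemons have been started inside the container):

```bash
# Substitute the real master pod name from `kubectl get pods -n hadoop-cluster`
kubectl exec -n hadoop-cluster <hadoop-master-pod> -- hdfs dfsadmin -report
kubectl exec -n hadoop-cluster <hadoop-master-pod> -- bash -c \
  'hdfs dfs -mkdir -p /tmp/smoke && hdfs dfs -ls /'
```

If `dfsadmin -report` lists the expected number of live DataNodes, the cluster is wired up correctly.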
Access the web UI to confirm the cluster status (for example, http://<node-ip>:30070).