【Kubernetes: Adding GPU Node Scheduling】

Author: System  Date: August 25, 2024  Categories: All, elasticsearch  Word count: 1366

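For Kubernetes to schedule workloads onto GPU nodes, the NVIDIA GPU driver must be installed on those nodes and the NVIDIA device plugin must be running in the cluster. The steps and example manifests below show a simple way to make GPU nodes schedulable.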

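Make sure the NVIDIA GPU driver is installed correctly on every GPU node (running nvidia-smi on the node should list its GPUs), and that the NVIDIA container toolkit is installed and configured for the node's container runtime.
Make sure the kubelet supports device plugins. The --feature-gates=Accelerators=true flag belongs to the older alpha GPU support (the alpha.kubernetes.io/nvidia-gpu resource), which was removed in Kubernetes 1.11, so it should not be set on current clusters. The device plugin mechanism used here is controlled by the DevicePlugins feature gate, which has been enabled by default since Kubernetes 1.10.
Make sure the NVIDIA device plugin is installed in the cluster.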

Example manifest (a DaemonSet that runs the device plugin on each node):




# nvidia-device-plugin-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-daemonset
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-daemonset
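    spec:
      # Assumption: only needed if the GPU nodes carry an nvidia.com/gpu taint
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      containers:
      - name: nvidia-device-plugin-container
        # Early beta tag; check the NVIDIA k8s-device-plugin releases for a current one
        image: nvidia/k8s-device-plugin:1.0.0-beta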
        volumeMounts:
          - name: device-plugin-socket
            mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin-socket
          hostPath:
            path: /var/lib/kubelet/device-plugins

Deploy the device plugin:




kubectl apply -f nvidia-device-plugin-daemonset.yaml
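
Once the DaemonSet Pod is running on a GPU node, the plugin registers with the kubelet and the node begins to advertise the nvidia.com/gpu resource. As a rough check, the node object should then contain something like the excerpt below. This is an illustrative sketch: <gpu-node-name> is a placeholder, and the count depends on how many GPUs the node actually has.

```yaml
# Illustrative excerpt of: kubectl get node <gpu-node-name> -o yaml
# (<gpu-node-name> is a placeholder; "1" assumes a single-GPU node)
status:
  capacity:
    nvidia.com/gpu: "1"      # GPUs reported by the device plugin
  allocatable:
    nvidia.com/gpu: "1"      # GPUs the scheduler may hand out to Pods
```

If nvidia.com/gpu is missing from both maps, the plugin has most likely not registered, and its Pod logs on that node are the first place to look.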

Make sure the Pod spec requests the GPU resource:




apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: cuda-container
    # Old CUDA 9.0 tag; if the pull fails, substitute a current nvidia/cuda tag
    image: nvidia/cuda:9.0-devel
    resources:
      limits:
        nvidia.com/gpu: 1 # request 1 GPU

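With this in place, the scheduler assigns Pods that request nvidia.com/gpu to nodes that advertise free GPUs. Make sure your node labels are correct so that the scheduler works as expected, for example when Pods use a nodeSelector to target specific GPU nodes.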
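
The point about node labels can be sketched as follows. This is a minimal example under stated assumptions: the label accelerator=nvidia-gpu and the Pod name gpu-pod-selected are hypothetical and not part of the original setup; any label key and value will do as long as the node and the nodeSelector agree.

```yaml
# Hypothetical label, applied beforehand with:
#   kubectl label nodes <gpu-node-name> accelerator=nvidia-gpu
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod-selected            # hypothetical name
spec:
  nodeSelector:
    accelerator: nvidia-gpu         # must match the node label exactly
  containers:
  - name: cuda-container
    image: nvidia/cuda:9.0-devel    # same image as the Pod above
    resources:
      limits:
        nvidia.com/gpu: 1           # still required; the label alone grants no GPU
```

The nodeSelector only narrows the set of candidate nodes. The nvidia.com/gpu limit is what actually reserves a device, so the two are normally used together when a cluster mixes several GPU types.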
