23-Kubernetes Extended Learning and Practice Notes

[TOC]

0x00 How do you inject Pod metadata in K8s into a container via environment variables?

Description: Since Kubernetes 1.7, a container inside a Pod has been able to obtain the Pod's spec, metadata, and other metadata. This is done through the Downward API, which exposes the Pod's own information to the containers running in it as environment variables.

A Pod contains three types of containers:
• Infrastructure Container: the base container that maintains the Pod's entire network namespace
• InitContainers: init containers, which run to completion before the business containers start
• Containers: business containers, started in parallel

Requirement: Suppose you want to pick the GPU index a Pod should use based on the suffix of its hostname, or you need the Pod IP or label information generated by a resource controller. In both cases you can inject this information as environment variables (hopefully this helps).

Goal: use env together with fieldRef to inject Kubernetes metadata and container fields into the container as environment variables.

The fields currently supported for injection by the env object of a resource controller (valueFrom.fieldRef.fieldPath) are as follows:

# Pod name (hostname)
metadata.name
# Namespace
metadata.namespace
# Label value for the given key
metadata.labels['']
# Annotation value for the given key
metadata.annotations['']
# Node name
spec.nodeName
# Service account name
spec.serviceAccountName
# Host (node) IP address
status.hostIP
# Pod IPv4 address
status.podIP
# Pod IPv4 and IPv6 addresses
status.podIPs

Example:

apiVersion: v1
kind: Pod
metadata:
  name: dapi-envars-fieldref
  namespace: devtest
  labels:
    app: downwardAPI
  annotations:
    demo: dapi-envars
spec:
  containers:
    - name: test-container
      image: busybox:latest
      command: [ "sh", "-c"]
      args:
      - while true; do
          echo -en '\n';
          printenv MY_NODE_NAME MY_POD_NAME MY_POD_NAMESPACE;
          printenv MY_POD_IP MY_POD_IPS MY_POD_SERVICE_ACCOUNT;
          printenv MY_POD_LABELS_APP MY_POD_ANNOTATIONS_DEMO;
          printenv MY_CPU_REQUEST MY_CPU_LIMIT;
          printenv MY_MEM_REQUEST MY_MEM_LIMIT;
          sleep 10;
        done;
      resources:
        requests:
          memory: "32Mi"
          cpu: "125m"
        limits:
          memory: "64Mi"
          cpu: "250m"
      env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: MY_POD_IPS
          valueFrom:
            fieldRef:
              fieldPath: status.podIPs
        - name: MY_POD_SERVICE_ACCOUNT
          valueFrom:
            fieldRef:
              fieldPath: spec.serviceAccountName
        - name: MY_POD_LABELS_APP
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['app']
        - name: MY_POD_ANNOTATIONS_DEMO
          valueFrom:
            fieldRef:
              fieldPath: metadata.annotations['demo']
        - name: MY_CPU_REQUEST
          valueFrom:
            resourceFieldRef:
              containerName: test-container
              resource: requests.cpu
        - name: MY_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: test-container
              resource: limits.cpu
        - name: MY_MEM_REQUEST
          valueFrom:
            resourceFieldRef:
              containerName: test-container
              resource: requests.memory
        - name: MY_MEM_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: test-container
              resource: limits.memory
  restartPolicy: Never

After running the Pod, check the injected environment variables:

~$ kubectl apply -f test-container.yaml
pod/dapi-envars-fieldref created

~$ kubectl logs -n devtest dapi-envars-fieldref
dapi-envars-fieldref
devtest
10.66.182.247
10.66.182.247
default
downwardAPI
dapi-envars
1
1
33554432
67108864

~$ kubectl exec -n devtest dapi-envars-fieldref -- printenv
HOSTNAME=dapi-envars-fieldref
MY_MEM_REQUEST=33554432
HOST_IP=192.168.12.226
MY_POD_NAME=dapi-envars-fieldref
MY_POD_NAMESPACE=devtest
MY_POD_IP=10.66.182.247
MY_POD_IPS=10.66.182.247
MY_POD_SERVICE_ACCOUNT=default
MY_POD_ANNOTATIONS_DEMO=dapi-envars
NODE_NAME=weiyigeek-226
MY_POD_LABELS_APP=downwardAPI
MY_CPU_REQUEST=1
MY_CPU_LIMIT=1
MY_MEM_LIMIT=67108864
....
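
As an optional cross-check, the same fields can be read straight from the Pod object with kubectl's JSONPath output; the values should match the injected variables shown above (a sketch using the Pod from this example):

~$ kubectl get pod -n devtest dapi-envars-fieldref \
    -o jsonpath='{.spec.nodeName}{"\n"}{.status.hostIP}{"\n"}{.status.podIP}{"\n"}{.spec.serviceAccountName}{"\n"}'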

Practical example: select the GPU index a Pod should use (i.e. which GPU card) by taking the number after the last '-' character in the Pod name.
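
The manifests below rely on the shell parameter expansion ${VAR##*-}, which strips everything up to and including the last '-'. A minimal sketch of just that step (the Pod name here is only an illustrative value):

# Illustration only: extract the ordinal suffix from a StatefulSet Pod name
POD_NAME="healthcode-0-3"     # example value; inside the Pod this comes from metadata.name / $HOSTNAME
echo "${POD_NAME##*-}"        # prints 3, which is then used as CUDA_VISIBLE_DEVICES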

apiVersion: v1
kind: Service
metadata:
  name: healthcode
  namespace: devtest
  labels:
    app: healthcode
    use: gpu
  annotations:
    author: weiyigeek
    blog: blog.weiyigeek.top
spec:
  type: NodePort
  ports:
    - name: http
      port: 8000
      targetPort: 8000
      protocol: TCP
      nodePort: 30000
  selector:
    app: healthcode
    use: gpu
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: healthcode-0
  namespace: devtest
  labels:
    app: healthcode
spec:
  replicas: 6
  selector:
    matchLabels:
      app: healthcode
      use: gpu
  serviceName: "healthcode"
  template:
    metadata:
      labels:
        app: healthcode
        use: gpu
    spec:
      volumes:
      - name: workdir
        emptyDir: {}
      - name: workspace
        hostPath:
          path: /storage/webapp/project/MultiTravelcodeocr
          type: DirectoryOrCreate
      - name: model
        hostPath:
          path: /storage/webapp/project/.EasyOCR
          type: DirectoryOrCreate
      - name: img
        hostPath:
          path: /storage/webapp/project/upfile
          type: DirectoryOrCreate
      initContainers:
      - name: init  # use an init container to do the preparatory work
        image: busybox:1.35.0
        imagePullPolicy: IfNotPresent
        command:  # set the GPU card index this Pod will use
        - /bin/sh
        - -c
        - "echo export CUDA_VISIBLE_DEVICES=${GPU_DEVICES##*-}> /app/${GPU_DEVICES}"
        env:
        - name: GPU_DEVICES
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        # - name: CUDA_VISIBLE_DEVICES    # This does not work: env values cannot apply shell parameter expansion directly
        #   value: ${GPU_DEVICES##*-}
        volumeMounts:
        - name: workdir
          mountPath: /app/
      containers:
        - name: app
          image: harbor.weiyigeek.top/python/easyocr-healthcode:v1.6.2
          command: ['/bin/bash', '-c', 'source /app/${HOSTNAME}; echo ${CUDA_VISIBLE_DEVICES}; python ./setup.py --imgdir=/imgs --logdir=/logs --gpu=True']
          # Load the variable by sourcing the file. Alternatively, the app container itself could run
          # `echo export CUDA_VISIBLE_DEVICES=${HOSTNAME##*-} > /app/${HOSTNAME}` before the source command.
          # All roads lead to Rome; what matters is the idea.
          imagePullPolicy: IfNotPresent
          resources:
            limits: {}
            #  cpu: "8"
            #  memory: 8Gi
          volumeMounts:
            - name: workdir
              mountPath: /app/
            - name: workspace
              mountPath: /workspace
            - name: model
              mountPath: /root/.EasyOCR
            - name: img
              mountPath: /imgs
          ports:
            - name: http
              protocol: TCP
              containerPort: 8000
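
Once the StatefulSet is up, one simple way to confirm that a replica picked up the expected GPU index is to source the generated file inside one of the Pods (a sketch; it assumes the replicas listed below are running and that the app image provides /bin/bash, which its command already uses):

~$ kubectl exec -n devtest healthcode-0-3 -c app -- /bin/bash -c 'source /app/${HOSTNAME}; echo ${CUDA_VISIBLE_DEVICES}'
# expected to print 3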

Execution result:

# There are 6 Pods in total; each uses its corresponding GPU, e.g. healthcode-0-0 uses GPU 0 and healthcode-0-1 uses GPU 1.
$ kubectl get pod -n devtest
NAME             READY   STATUS    RESTARTS   AGE
healthcode-0-5   1/1     Running   0          15h 
healthcode-0-4   1/1     Running   0          15h
healthcode-0-3   1/1     Running   0          15h
healthcode-0-2   1/1     Running   0          15h
healthcode-0-1   1/1     Running   0          15h
healthcode-0-0   1/1     Running   0          15h

# Check GPU usage on the server
$ nvidia-smi
Fri Dec  9 10:08:32 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA Tesla V1...  Off  | 00000000:1B:00.0 Off |                    0 |
| N/A   41C    P0    36W / 250W |   6697MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA Tesla V1...  Off  | 00000000:1D:00.0 Off |                    0 |
| N/A   51C    P0    53W / 250W |   9489MiB / 32510MiB |     14%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA Tesla V1...  Off  | 00000000:3D:00.0 Off |                    0 |
| N/A   53C    P0    42W / 250W |   5611MiB / 32510MiB |     20%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA Tesla V1...  Off  | 00000000:3F:00.0 Off |                    0 |
| N/A   37C    P0    35W / 250W |  10555MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA Tesla V1...  Off  | 00000000:40:00.0 Off |                    0 |
| N/A   45C    P0    51W / 250W |   5837MiB / 32510MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA Tesla V1...  Off  | 00000000:41:00.0 Off |                    0 |
| N/A   37C    P0    37W / 250W |  10483MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    167660      C   python                           6693MiB |
|    1   N/A  N/A    166790      C   python                           9485MiB |
|    2   N/A  N/A    165941      C   python                           5607MiB |
|    3   N/A  N/A    165032      C   python                          10551MiB |
|    4   N/A  N/A    164226      C   python                           5833MiB |
|    5   N/A  N/A    163344      C   python                          10479MiB |
+-----------------------------------------------------------------------------+
