What is a pod's status when a node evicts it under resource pressure?
Time: 2023-09-11 14:14:16
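This post looks at what happens to a pod when its host comes under resource pressure. To reproduce it, we simulate a scenario in which disk usage on the node exceeds 90%.
While the pod is running normally, its status is Running: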
[root@nccztsjb-node-23 ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-ds-lw4vj 1/1 Running 0 6d19h
nginx-ds-nrf4t 1/1 Running 0 6d19h
nginx-ds-ql4s8 1/1 Running 0 6d19h
nginx-test-56745657-6wj5n 1/1 Running 0 6d19h
nginx-test-56745657-jx2kp 1/1 Running 0 6d19h
nginx-test-56745657-m6hm4 1/1 Running 0 6d19h
nginx-test-56745657-mhjsh 1/1 Running 0 6d19h
nginx-test-56745657-pqhqp 1/1 Running 0 6d19h
[root@nccztsjb-node-23 ~]# kubectl get pod nginx-test-56745657-6wj5n -o yaml | grep phase
phase: Running
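That is, the phase is Running.
Next, use fallocate to create a large 190G file on the node hosting the pod (nccztsjb-node-24, as the prompt and the pod description further down show):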
fallocate -l 190G bigfile
Disk usage rises to 96%:
[root@nccztsjb-node-24 data]# df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 200G 191G 9.2G 96% /data
[root@nccztsjb-node-24 data]#
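The check above can be scripted. The following is a minimal Python sketch that reads the Use% column from `df` output and compares it with the 90% figure used in this post. The function names and the 90% default are illustrative assumptions; the kubelet's real eviction thresholds are configurable and are expressed in terms of available space, so this only mirrors the scenario here, not the kubelet's own logic.

```python
def parse_df_use_percent(df_output: str) -> int:
    """Return the Use% value from the last line of `df` output."""
    lines = [ln for ln in df_output.strip().splitlines() if ln.strip()]
    fields = lines[-1].split()
    # Columns: Filesystem, Size, Used, Avail, Use%, Mounted on
    return int(fields[4].rstrip("%"))


def exceeds_threshold(use_percent: int, threshold_percent: int = 90) -> bool:
    """True when usage is above the threshold (90% is this post's figure, an assumption)."""
    return use_percent > threshold_percent


# The df output shown above, copied as a string.
SAMPLE_DF = """Filesystem Size Used Avail Use% Mounted on
/dev/vdb 200G 191G 9.2G 96% /data"""

use_percent = parse_df_use_percent(SAMPLE_DF)
print(use_percent, exceeds_threshold(use_percent))
```

On a real node, the string could come from running `df -h /data` and capturing its output instead of the hard-coded sample.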
The pod is evicted. Check its status, or more precisely its phase:
[root@nccztsjb-node-23 ~]# kubectl get pod nginx-test-56745657-6wj5n -o yaml | grep -i phase
phase: Failed
[root@nccztsjb-node-23 ~]#
The phase has changed to Failed.
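When many pods are involved, grepping each one by hand gets tedious. Below is a minimal Python sketch that picks evicted pods out of the JSON printed by `kubectl get pods -o json`, by looking for `status.phase` equal to `Failed` together with `status.reason` equal to `Evicted`. The function name and the sample data are hypothetical; the sample is hand-built from the pod names in this post and is not real cluster output.

```python
import json


def find_evicted_pods(pod_list_json: str) -> list:
    """Return (name, message) pairs for pods whose phase is Failed and reason is Evicted."""
    pod_list = json.loads(pod_list_json)
    evicted = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            evicted.append((pod["metadata"]["name"], status.get("message", "")))
    return evicted


# Hand-built sample (an assumption for illustration), trimmed to the fields used above.
SAMPLE_PODS = json.dumps({
    "items": [
        {
            "metadata": {"name": "nginx-test-56745657-6wj5n"},
            "status": {
                "phase": "Failed",
                "reason": "Evicted",
                "message": "The node was low on resource: ephemeral-storage.",
            },
        },
        {
            "metadata": {"name": "nginx-test-56745657-jx2kp"},
            "status": {"phase": "Running"},
        },
    ]
})

for name, message in find_evicted_pods(SAMPLE_PODS):
    print(name, "->", message)
```

Checking both fields matters because a pod can be in the Failed phase for reasons other than eviction.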
View the pod's description:
[root@nccztsjb-node-23 ~]# kubectl describe pod nginx-test-56745657-6wj5n
Name: nginx-test-56745657-6wj5n
Namespace: default
Priority: 0
Node: nccztsjb-node-24/172.20.58.65
Start Time: Thu, 17 Mar 2022 15:13:17 +0800
Labels: app=nginx-test
pod-template-hash=56745657
Annotations: cni.projectcalico.org/containerID: cbd9967186479712f1e7c27112fc9b9a31e5628d21e2ec7e96c2a4c8a8a956ea
cni.projectcalico.org/podIP:
cni.projectcalico.org/podIPs:
Status: Failed
Reason: Evicted
Message: The node was low on resource: ephemeral-storage. Container nginx was using 24Ki, which exceeds its request of 0.
IP:
IPs: <none>
Controlled By: ReplicaSet/nginx-test-56745657
Containers:
nginx:
Container ID:
Image: 172.20.58.152/middleware/nginx:1.21.4
Image ID:
Port: <none>
Host Port: <none>
State: Terminated
Reason: ContainerStatusUnknown
Message: The container could not be located when the pod was terminated
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Mon, 01 Jan 0001 00:00:00 +0000
Last State: Terminated
Reason: ContainerStatusUnknown
Message: The container could not be located when the pod was deleted. The container used to be Running
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Mon, 01 Jan 0001 00:00:00 +0000
Ready: False
Restart Count: 1
Limits:
cpu: 500m
memory: 200Mi
Requests:
cpu: 500m
memory: 200Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cmp26 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-cmp26:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Evicted 82s kubelet The node was low on resource: ephemeral-storage. Container nginx was using 24Ki, which exceeds its request of 0.
Normal Killing 82s kubelet Stopping container nginx
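As the events show, the node was under ephemeral-storage pressure, so the kubelet stopped the nginx container.
In short, the kubelet is what reports the disk pressure, and the kubelet is also what initiates stopping the container.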
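To see which resource triggered an eviction without reading the whole description, the resource name can be pulled out of the eviction message. This is a small Python sketch that assumes the message wording seen in the output above ("The node was low on resource: ..."); other kubelet versions may phrase it differently, and the function name is hypothetical.

```python
import re
from typing import Optional

# Matches the wording seen in this post's eviction message (an assumption about the format).
_LOW_RESOURCE = re.compile(r"The node was low on resource: ([A-Za-z0-9-]+)")


def low_resource(message: str) -> Optional[str]:
    """Return the resource named in an eviction message, or None if the format differs."""
    match = _LOW_RESOURCE.search(message)
    return match.group(1) if match else None


# The message from the pod description above.
MESSAGE = (
    "The node was low on resource: ephemeral-storage. "
    "Container nginx was using 24Ki, which exceeds its request of 0."
)

print(low_resource(MESSAGE))
```

Returning None instead of raising keeps the helper usable on pods that failed for unrelated reasons.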