Description of problem:
Migration pods become "Evicted" because the node is low on ephemeral-storage after running quite a few migrations.

Version-Release number of selected component (if applicable):
hyperconverged-cluster-operator:v2.0.0-32
virt-api:v2.0.0-39

How reproducible:
Always

Steps to Reproduce:
1. Create a namespace
2. Create a fedora vm
3. Do migrations
4. Delete the migration/vm/namespace
5. Repeat steps 1-4 quite a few times

Actual results:
# oc get pod
virt-launcher-vmb-rz8df   2/2   Running   0   10m
virt-launcher-vmb-sr8xx   0/2   Evicted   0   8m8s

# oc describe pod virt-launcher-vmb-sr8xx
Name:               virt-launcher-vmb-sr8xx
Namespace:          network-migration-test
Priority:           0
PriorityClassName:  <none>
Node:               working-jjh2c-worker-0-7fc89/
Start Time:         Sun, 21 Jul 2019 22:46:29 -0400
Labels:             kubevirt.io=virt-launcher
                    kubevirt.io/created-by=8e286d81-ac2a-11e9-9ac4-664f163f5f0f
                    kubevirt.io/migrationJobUID=e78e0e81-ac2a-11e9-9ac4-664f163f5f0f
                    kubevirt.io/nodeName=working-jjh2c-worker-0-rphvm
                    kubevirt.io/vm=fedora-vm
Annotations:        k8s.v1.cni.cncf.io/networks: [{"interface":"net1","name":"br1test","namespace":"network-migration-test"}]
                    k8s.v1.cni.cncf.io/networks-status:
                    kubevirt.io/domain: vmb
                    kubevirt.io/migrationJobName: l2-migration
                    openshift.io/scc: privileged
                    traffic.sidecar.istio.io/kubevirtInterfaces: k6t-eth0
Status:             Failed
Reason:             Evicted
Message:            The node was low on resource: ephemeral-storage. Container volumecontainerdisk was using 4Ki, which exceeds its request of 0. Container compute was using 212Ki, which exceeds its request of 0.
IP:
Controlled By:      VirtualMachineInstance/vmb
Containers:
  volumecontainerdisk:
    Image:      quay.io/redhat/cnv-tests-fedora:30
    Port:       <none>
    Host Port:  <none>
    Command:    /entry-point.sh
    Readiness:  exec [cat /tmp/healthy] delay=2s timeout=5s period=5s #success=2 #failure=5
    Environment:
      COPY_PATH: /var/run/kubevirt-ephemeral-disks/container-disk-data/network-migration-test/vmb/disk_containerdisk/disk-image
    Mounts:
      /var/run/kubevirt-ephemeral-disks from ephemeral-disks (rw)
  compute:
    Image:      brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/container-native-virtualization/virt-launcher:v2.0.0-39
    Port:       <none>
    Host Port:  <none>
    Command:    /usr/bin/virt-launcher --qemu-timeout 5m --name vmb --uid 8e286d81-ac2a-11e9-9ac4-664f163f5f0f --namespace network-migration-test --kubevirt-share-dir /var/run/kubevirt --ephemeral-disk-dir /var/run/kubevirt-ephemeral-disks --readiness-file /var/run/kubevirt-infra/healthy --grace-period-seconds 15 --hook-sidecars 0 --less-pvc-space-toleration 10
    Limits:
      bridge.network.kubevirt.io/br1test:  1
      devices.kubevirt.io/kvm:             1
      devices.kubevirt.io/tun:             1
      devices.kubevirt.io/vhost-net:       1
    Requests:
      bridge.network.kubevirt.io/br1test:  1
      cpu:                                 100m
      devices.kubevirt.io/kvm:             1
      devices.kubevirt.io/tun:             1
      devices.kubevirt.io/vhost-net:       1
      memory:                              1208392Ki
    Readiness:  exec [cat /var/run/kubevirt-infra/healthy] delay=2s timeout=5s period=2s #success=1 #failure=5
    Environment:
      KUBEVIRT_RESOURCE_NAME_br1test: bridge.network.kubevirt.io/br1test
    Mounts:
      /var/run/kubevirt from virt-share-dir (rw)
      /var/run/kubevirt-ephemeral-disks from ephemeral-disks (rw)
      /var/run/kubevirt-infra from infra-ready-mount (rw)
      /var/run/libvirt from libvirt-runtime (rw)
Volumes:
  infra-ready-mount:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  virt-share-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/kubevirt
    HostPathType:
  libvirt-runtime:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  ephemeral-disks:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
QoS Class:       Burstable
Node-Selectors:  kubevirt.io/schedulable=true
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age    From                                   Message
  ----     ------     ----   ----                                   -------
  Normal   Scheduled  8m32s  default-scheduler                      Successfully assigned network-migration-test/virt-launcher-vmb-sr8xx to working-jjh2c-worker-0-7fc89
  Normal   Pulled     8m24s  kubelet, working-jjh2c-worker-0-7fc89  Container image "quay.io/redhat/cnv-tests-fedora:30" already present on machine
  Normal   Created    8m23s  kubelet, working-jjh2c-worker-0-7fc89  Created container volumecontainerdisk
  Normal   Started    8m23s  kubelet, working-jjh2c-worker-0-7fc89  Started container volumecontainerdisk
  Normal   Pulled     8m23s  kubelet, working-jjh2c-worker-0-7fc89  Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/container-native-virtualization/virt-launcher:v2.0.0-39" already present on machine
  Normal   Created    8m23s  kubelet, working-jjh2c-worker-0-7fc89  Created container compute
  Normal   Started    8m23s  kubelet, working-jjh2c-worker-0-7fc89  Started container compute
  Warning  Unhealthy  8m20s  kubelet, working-jjh2c-worker-0-7fc89  Readiness probe failed: cat: /var/run/kubevirt-infra/healthy: No such file or directory
  Warning  Evicted    6m25s  kubelet, working-jjh2c-worker-0-7fc89  The node was low on resource: ephemeral-storage. Container volumecontainerdisk was using 4Ki, which exceeds its request of 0. Container compute was using 212Ki, which exceeds its request of 0.
  Normal   Killing    6m25s  kubelet, working-jjh2c-worker-0-7fc89  Stopping container volumecontainerdisk
  Normal   Killing    6m25s  kubelet, working-jjh2c-worker-0-7fc89  Stopping container compute

Expected results:
The node's ephemeral storage should be reclaimed after the migration/VM/namespace resources are deleted.

Additional info:
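The exact migration object used in step 3 is not attached to this bug; purely for illustration, each round would be driven by a VirtualMachineInstanceMigration of roughly this shape (names are taken from the launcher pod labels/annotations above):

# Illustrative sketch only -- not the reporter's actual manifest.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  name: l2-migration                  # matches the kubevirt.io/migrationJobName annotation
  namespace: network-migration-test
spec:
  vmiName: vmb                        # the VMI being migrated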
Created attachment 1592523 [details] describe node
Moving back to virtualization. This is happening during live migration. From a cursory look, it seems that Kubernetes is responding in a somewhat surprising way to the namespace being deleted during the migration.
Yan, could you please provide the VM definition used for this?
Created attachment 1601356 [details] fedora vm
Hi Stuart, please find the VM YAML file in the attachment.
Created attachment 1601367 [details] vm spec

Stuart, I actually hit this issue while debugging the network migration automation script; the VM is configured with a Multus network based on the fedora template from comment #4. I attached it as well.
Vladik, Could you investigate this?
Kedar, is this still an issue?
"The node was low on resource: ephemeral-storage. Container volumecontainerdisk was using 4Ki, which exceeds its request of 0. Container compute was using 212Ki, which exceeds its request of 0." This means that pods got evicted because they were using ephemral storage that they did not request. KubeVirt should reflect that it actually is consuming epheraml resources in these two containers by requesting an educated-guess amount of ephemeral storage, let's say: 1Mi.
Is this something that we've addressed in code already?
(In reply to Fabian Deutsch from comment #12)
> "The node was low on resource: ephemeral-storage. Container
> volumecontainerdisk was using 4Ki, which exceeds its request of 0. Container
> compute was using 212Ki, which exceeds its request of 0."
>
> This means that the pods got evicted because they were using ephemeral
> storage that they did not request.
> KubeVirt should reflect that it actually is consuming ephemeral resources in
> these two containers by requesting an educated-guess amount of ephemeral
> storage, let's say 1Mi.

I like Fabian's suggestion. Sounds like that is the proper thing to do.
set evictionStrategy: LiveMigrate on the VMI spec and the migration pod likely won't get evicted.
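For completeness, this is roughly where that field sits in a VMI manifest (sketch only; everything other than the evictionStrategy line is placeholder):

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: vmb
  namespace: network-migration-test
spec:
  evictionStrategy: LiveMigrate   # request a live migration instead of a shutdown when the pod is targeted for eviction
  domain:
    devices: {}
    resources:
      requests:
        memory: 1Gi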
> I like Fabian's suggestion. Sounds like that is the proper thing to do.

@rmohr you saw the vmi yaml didn't have EvictionStrategy set, right?
This was reported against 2.0... did we even have an evictionStrategy back then?

We also copied the container disk then; not sure if this is relevant today.

I was debating whether we should still request a minor amount of ephemeral-storage. We do have temporary files, IIRC. It's probably not enough to cause an eviction.
(In reply to David Vossel from comment #17)
> > I like Fabian's suggestion. Sounds like that is the proper thing to do.
>
> @rmohr you saw the vmi yaml didn't have EvictionStrategy set, right?

Yes, the attached yaml does not have it set.

I think this eviction is coming from the kubelet and it will happen independent of PDBs. I think that if you are in the Burstable QoS class, you can get deleted if you are above your requests, independent of PDBs. I am almost sure, but would have to try it again.

Independent of that, I think that our workloads should by default never get into the situation of being evicted (whether via /evict or a delete from the kubelet) due to infra overhead.
(In reply to Vladik Romanovsky from comment #18)
> This was reported against 2.0... did we even have an evictionStrategy back
> then?
>
> We also copied the container disk then; not sure if this is relevant today.
>
> I was debating whether we should still request a minor amount of
> ephemeral-storage. We do have temporary files, IIRC. It's probably not
> enough to cause an eviction.

I think that every Ki above the request will make the pod a candidate for eviction.

We have the issue that some of our disk types (`ephemeral`, `containerDisk` and `emptyDisk`, for instance) just write into emptyDirs. In that case I agree that users have to add the request to the VMI to be guarded against this type of eviction. However, I would still want to add a request of e.g. 5Mi in general to cover our overhead, since I am pretty sure that every Ki above the request is too much.
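To make the "users have to add the request to the VMI" part concrete, here is a sketch of such a request. Whether an ephemeral-storage request under domain.resources is actually propagated to the launcher pod is an assumption here and depends on the KubeVirt version; the sizes are only examples:

# Sketch under the stated assumptions -- not a confirmed KubeVirt feature of the 2.0 release.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: vmb
spec:
  domain:
    devices:
      disks:
      - name: scratch
        disk:
          bus: virtio
    resources:
      requests:
        memory: 1Gi
        ephemeral-storage: 2Gi   # assumption: sized to cover the emptyDir-backed emptyDisk below
  volumes:
  - name: scratch
    emptyDisk:
      capacity: 2Gi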
> I think this eviction is coming from the kubelet and it will happen
> independent of PDBs. I think that if you are in the Burstable QoS class,
> you can get deleted if you are above your requests, independent of PDBs.
> I am almost sure, but would have to try it again.

I understand now and agree that adding a storage request is a good first step.
Do you have a link to the PR that fixes this bug?
(In reply to Kedar Bidarkar from comment #22)
> Do you have a link to the PR that fixes this bug?

Yes I do: https://github.com/kubevirt/kubevirt/pull/5013
Verified with build hco-bundle-registry-container-v4.8.0-367, virt-operator-container-v4.8.0-58.

Steps:
1. Create a namespace
2. Create a fedora vm
3. Do migrations
4. Delete the migration/vm/namespace
5. Repeat steps 1-4 quite a few times

Ran steps 1-4 more than 10 times; no Evicted pod found.

NAME                            READY   STATUS      RESTARTS   AGE
virt-launcher-vm-fedora-k55nl   1/1     Running     0          7m43s
virt-launcher-vm-fedora-lvkmj   0/1     Completed   0          19m
virt-launcher-vm-fedora-rxjgb   0/1     Completed   0          18m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.8.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2920