Bug 2092271
| Field | Value |
|---|---|
| Summary | CephFS-based VM status changes to "paused" after migration |
| Product | Container Native Virtualization (CNV) |
| Component | Virtualization |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Version | 4.11.0 |
| Target Release | 4.14.0 |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | libvirt_CNV_INT |
| Reporter | chhu |
| Assignee | Jed Lejosne <jlejosne> |
| QA Contact | zhe peng <zpeng> |
| CC | acardace, akalenyu, alitke, danken, fdeutsch, ibezukh, jlejosne, jsafrane, kbidarka, nashok, pelauter, sgott, yadu |
| Flags | ibezukh: needinfo+; ibezukh: needinfo? (alitke) |
| Doc Type | Bug Fix |
| Doc Text | When you use two pods with different SELinux contexts, VMs with the ocs-storagecluster-cephfs storage class no longer fail to migrate. (BZ#2092271) |
| Cloned To | 2174226 (view as bug list) |
| Bug Blocks | 2135381, 2174226 |
| Type | Bug |
| Last Closed | 2023-11-08 14:05:03 UTC |
Description (chhu, 2022-06-01 08:25:29 UTC)
Comment 3 (sgott):

Chenli, would you be able to re-test this scenario while using the RBD storage class? It might be that the issue here is IO related. It would be helpful if you could capture the related virt-launcher and virt-handler logs. Would you also be able to post the Pod and VMI manifests?

(In reply to sgott from comment #3)
> Chenli, would you be able to re-test this scenario while using the RBD
> storage class? It might be that the issue here could be IO related.

Stu, I re-tested this scenario with the ceph RBD storage class and migrated the VM from one node to another successfully. The issue only happens on a VM with the cephfs storage class. Please see the attached file asb-vm-dv-ocs-cephfs.yaml for the VMI manifest, and the described pod information in the files virt-launcher-*-tjs6l-source/target.

- Create the VM:

# oc create -f asb-vm-dv-ocs-cephfs.yaml
# oc get pod -o wide | grep virt-launcher
virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l   1/1   Running   0   2m33s   10.128.1.72   dell-per730-63.lab.eng.pek2.redhat.com   <none>   1/1

- Migrate the VM in the web console:

# oc get pod -o wide | grep virt-launcher
virt-launcher-asb-vm-dv-ocs-cephfs-pjlsx   0/1   ContainerCreating   0   4s      <none>        dell-per730-64.lab.eng.pek2.redhat.com   <none>   0/1
virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l   1/1   Running             0   2m49s   10.128.1.72   dell-per730-63.lab.eng.pek2.redhat.com   <none>   1/1

# oc get pod -o wide | grep virt-launcher
virt-launcher-asb-vm-dv-ocs-cephfs-pjlsx   1/1   Running   0   9s      10.129.0.65   dell-per730-64.lab.eng.pek2.redhat.com   <none>   0/1
virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l   1/1   Running   0   2m54s   10.128.1.72   dell-per730-63.lab.eng.pek2.redhat.com   <none>   1/1

- Describe the pod information:

# oc describe pod virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l > virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l-source
# oc describe pod virt-launcher-asb-vm-dv-ocs-cephfs-pjlsx >
virt-launcher-asb-vm-dv-ocs-cephfs-pjlsx-target

# oc get pod -o wide | grep virt-launcher
virt-launcher-asb-vm-dv-ocs-cephfs-pjlsx   0/1   Completed   0   4m9s    10.129.0.65   dell-per730-64.lab.eng.pek2.redhat.com   <none>   0/1
virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l   1/1   Running     0   6m54s   10.128.1.72   dell-per730-63.lab.eng.pek2.redhat.com   <none>   0/1

# oc rsh virt-launcher-asb-vm-dv-ocs-cephfs-tjs6l
sh-4.4# virsh list --all
 Id   Name                                 State
---------------------------------------------------
 1    openshift-cnv_asb-vm-dv-ocs-cephfs   paused

sh-4.4# tail -f /var/log/libvirt/qemu/openshift-cnv_asb-vm-dv-ocs-cephfs.log
-device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 \
-device virtio-balloon-pci-non-transitional,id=balloon0,bus=pci.5,addr=0x0 \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci-non-transitional,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2022-06-07 09:08:10.706+0000: Domain id=1 is tainted: custom-ga-command
2022-06-07 09:10:37.059+0000: initiating migration
2022-06-07T09:10:42.355174Z qemu-kvm: warning: Failed to unlock byte 201
2022-06-07T09:10:42.355267Z qemu-kvm: warning: Failed to unlock byte 201

---

CephFS does not support ReadWriteMany as a valid mode, so it is not surprising that this sequence caused an IO error. However, this invalid configuration should likely have been caught during provisioning. With that in mind, I am changing the component to Storage for further evaluation. Please feel free to change the component if this appears to be in error.

---

CephFS actually does support ReadWriteMany:
https://github.com/ceph/ceph-csi/blob/c85d03c79edcd46c0399dbd0fedd6a8be7703a58/examples/cephfs/pvc.yaml#L8

Actually, we even tried to bring it to our upstream CI at one point:
https://github.com/kubevirt/kubevirtci/pull/768

So it should be eligible for migration AFAIK. Could I also join Stu's request for manifests and ask for the PVC & DataVolume?
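For reference, a CephFS-backed ReadWriteMany claim of the kind discussed here would look roughly like the following sketch (the claim name, namespace, and size mirror this bug's environment; they are illustrative, not the attached manifest):

```yaml
# Sketch of an RWX PVC on the OCS CephFS storage class.
# Names and size follow this bug report; treat them as illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: asb-dv-ocs-cephfs
  namespace: openshift-cnv
spec:
  accessModes:
    - ReadWriteMany              # CephFS supports RWX, which live migration needs
  resources:
    requests:
      storage: 12Gi
  storageClassName: ocs-storagecluster-cephfs
```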
@chhu Stu, I believe we provided all the information from the storage side; it looks like a migration issue. Can we move it to Virt?

---

Thanks, Yan! Hi Stu, Alex,

Please see the dv, pvc, pv information in the attached files dv.yaml, pvc.yaml, pv.yaml, thank you!

# oc get dv
NAME                PHASE       PROGRESS   RESTARTS   AGE
asb-dv-ocs-cephfs   Succeeded   100.0%                146m
# oc get dv asb-dv-ocs-cephfs -o yaml > dv.yaml
# oc get pvc | grep asb-dv-ocs-cephfs
asb-dv-ocs-cephfs   Bound   pvc-212aae52-7459-4d6b-bf6e-b9018bc56866   12Gi   RWX   ocs-storagecluster-cephfs   149m
# oc get pvc asb-dv-ocs-cephfs -o yaml > pvc.yaml
# oc get pv | grep asb-dv-ocs-cephfs
pvc-212aae52-7459-4d6b-bf6e-b9018bc56866   12Gi   RWX   Delete   Bound   openshift-cnv/asb-dv-ocs-cephfs   ocs-storagecluster-cephfs   150m
# oc get pv pvc-212aae52-7459-4d6b-bf6e-b9018bc56866 -o yaml > pv.yaml

---

Hi, I will add CephFS support in kubevirtci upstream and will try to reproduce it there.

Comment 17 (Igor Bezukh):

Hi,

I managed to reproduce the issue, but what fixed it was a configuration of the CephFS CRD. Can you please provide us with the CephFileSystem CRD? I think it may be a misconfiguration of CephFS: the number of data and metadata replicas should be equal to the number of OSDs running on the cluster.

TIA,
Igor

Also, we suspect this issue as the root cause: https://github.com/ceph/ceph-csi/issues/3562

Comment 20 (chhu, in reply to Igor Bezukh from comment #17):

Hi Igor,

I'll set up the env and provide the CephFileSystem CRD later, thank you!

(In reply to chhu from comment #20)
> (In reply to Igor Bezukh from comment #17)
> > I managed to reproduce the issue, but what fixed it is a configuration of
> > the CephFS CRD. Can you please provide us with the CephFileSystem CRD?
> > I think it may be a misconfiguration of CephFS. The number of data and
> > metadata replicas should be equal to the number of OSDs that are running
> > on the cluster.
>
> Hi Igor, I'll set up the env and provide the CephFileSystem CRD later, thank you!

Hi Igor,

I reproduced it on my environment with the steps in the "Description" part. For the environment setup I just installed ODF, and I haven't done any configuration of the CephFileSystem CRD. Would you please help to have a check on my env? I sent the env information to you by gchat, thank you!

---

The issue that we see with live migration is a side effect of the original issue with CephFS RWX, as described here: https://github.com/ceph/ceph-csi/issues/3562. I will move this bug to the CNV Storage team for further observation.

Comment 23 (Jan, OCP storage team):

OCP storage team here: if it is really https://github.com/ceph/ceph-csi/issues/3562, i.e. two Pods with different SELinux contexts trying to use the same ReadWriteMany volume at the same time, then it's not a bug but a feature of Kubernetes / OpenShift: it protects against data being "leaked" from a Pod to a different Pod that uses a different SELinux context. Please get the YAML of both Pods and check their pod.spec.securityContext.seLinuxOptions, and/or run "crictl inspect <container>", to confirm that this is really the case.

If two (or more) Pods want to share data on a volume, they must run with the same SELinux context (pod.spec.securityContext.seLinuxOptions, or spec.containers[*].securityContext.seLinuxOptions of all the Pod's containers that have the volume mounted). If the fields are missing or empty, the container runtime will assign a random one to each Pod! In OpenShift, if the Pods are in the same namespace and their SCC has "SELinuxContext: type: MustRunAs" (e.g. the "restricted" SCC), OCP will assign the SELinux context to the Pods from namespace annotations, i.e. they should run with the same SELinux context and be able to share a volume. (If not, we have a bug somewhere.)
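The sharing requirement described above can be sketched as two Pods that pin the identical SELinux context. This is a hypothetical illustration (the Pod names, image, and MCS level are invented; only the field paths come from the comment):

```yaml
# Hypothetical sketch: two Pods that can share one RWX volume because
# they declare the *same* seLinuxOptions. The level value is illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: migration-source          # hypothetical name
  namespace: openshift-cnv
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c25,c10"         # must match on every Pod mounting the volume
  containers:
    - name: compute
      image: example.io/virt-launcher:latest   # placeholder image
      volumeMounts:
        - name: disk
          mountPath: /disk
  volumes:
    - name: disk
      persistentVolumeClaim:
        claimName: asb-dv-ocs-cephfs
---
# The second Pod declares the identical level; with a different (or
# runtime-assigned random) level, the volume's labels would not match
# and shared access fails.
apiVersion: v1
kind: Pod
metadata:
  name: migration-target          # hypothetical name
  namespace: openshift-cnv
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c25,c10"         # same context as the first Pod
  containers:
    - name: compute
      image: example.io/virt-launcher:latest
      volumeMounts:
        - name: disk
          mountPath: /disk
  volumes:
    - name: disk
      persistentVolumeClaim:
        claimName: asb-dv-ocs-cephfs
```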
However, if the Pods are in different namespaces *or* their SCCs have different "SELinuxContext" values, then their SELinux contexts are most probably different and they can't share data on a volume. It's somewhat documented at https://docs.openshift.com/container-platform/4.12/authentication/managing-security-context-constraints.html

To sum up: if the "restricted" SCC is not enough for CNV, please use any other SCC that uses "SELinuxContext: type: MustRunAs", and all Pods in the same namespace will be able to share their volumes. Other workarounds are possible, but SCC would be the best.

---

Stu, can you take a look at comment #23 from Jan regarding SELinux contexts? It seems that the migration destination Pod really should start with the same context as the source.

---

Matching the security context of the source on the target is what was done in this PR: https://github.com/kubevirt/kubevirt/pull/9246

---

Verified with build CNV-v4.14.0.rhel9-1553.

Steps:

1. Create a VM with CephFS storage:

...
  storage:
    resources:
      requests:
        storage: 30Gi
    storageClassName: ocs-storagecluster-cephfs
...

2. Start the VM and do a live migration:

$ oc get pods
NAME                            READY   STATUS      RESTARTS   AGE
virt-launcher-vm-fedora-jpcz4   1/1     Running     0          3m13s
virt-launcher-vm-fedora-l685m   0/1     Completed   0          8m5s
$ oc get virtualmachineinstancemigrations.kubevirt.io
NAME                        PHASE       VMI
vm-fedora-migration-uouwq   Succeeded   vm-fedora
$ oc get vm
NAME        AGE   STATUS    READY
vm-fedora   10m   Running   True

The migration succeeded; moving to VERIFIED.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.14.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6817
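As an appendix to the verification above: the object listed by `oc get virtualmachineinstancemigrations.kubevirt.io` can also be created directly instead of via the web console. A minimal sketch (the real object's name carried a generated suffix, here it is fixed for illustration):

```yaml
# Minimal VirtualMachineInstanceMigration sketch that triggers a live
# migration of the VM used in verification (metadata.name is illustrative).
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: vm-fedora-migration
  namespace: openshift-cnv
spec:
  vmiName: vm-fedora
```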