Bug 2162252
| Summary: | Got 'SyncVMI failed' when hotplugging an NFS disk to an NFS VM | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Yan Du <yadu> |
| Component: | Storage | Assignee: | Alexander Wels <awels> |
| Status: | CLOSED ERRATA | QA Contact: | Yan Du <yadu> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.12.0 | CC: | alitke, awels, jpeimer |
| Target Milestone: | --- | ||
| Target Release: | 4.13.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | v4.13.1.rhel9-121 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-06-20 13:41:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Alexander, PTAL.

Yes, it is broken; I suspect it is because of some SELinux labeling. I created a kubevirtci cluster with both local and NFS storage, then added a blank local volume and a blank NFS volume. When I look in the virt-launcher pod I see:

```
bash-5.1$ ls -alZ
total 5267412
drwxrwxrwx. 2 root root system_u:object_r:container_file_t:s0:c449,c794          64 Apr  5 13:35 .
drwxrwxrwx. 5 root root system_u:object_r:container_file_t:s0:c449,c794          96 Apr  5 13:25 ..
-rw-rw----. 1 qemu qemu system_u:object_r:container_file_t:s0:c449,c794  5073010688 Apr  5 13:33 volume-hotplug-local.img
-rw-rw----. 1 qemu qemu system_u:object_r:nfs_t:s0                      10146021376 Apr  5 13:36 volume-hotplug.img
```

Note that the SELinux label of the NFS volume is nfs_t instead of container_file_t. The local volume was added successfully, but the NFS volume is showing the exact error from this report.

Can you show me the nfs storage class?

I tried other NFS servers (trident-nfs) and it worked, so I suspect it is simply a configuration issue in the NFS server.

$ oc get sc nfs -o yaml
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2023-04-03T14:03:00Z"
  name: nfs
  resourceVersion: "73762"
  uid: bd1d7758-2dc5-4e80-a439-81e4e95595b7
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
$ oc get storageprofile nfs -o yaml
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
  creationTimestamp: "2023-04-03T14:03:00Z"
  generation: 3
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.13.0
    cdi.kubevirt.io: ""
  name: nfs
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CDI
    name: cdi-kubevirt-hyperconverged
    uid: 3bbcd4d7-1305-4cac-bc6a-f470672a6159
  resourceVersion: "74089"
  uid: 924b15d5-b4fe-415f-90cb-26d7f5e2bb4b
spec:
  claimPropertySets:
  - accessModes:
    - ReadWriteMany
    volumeMode: Filesystem
status:
  claimPropertySets:
  - accessModes:
    - ReadWriteMany
    volumeMode: Filesystem
  provisioner: kubernetes.io/no-provisioner
  storageClass: nfs
```
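Since the class uses kubernetes.io/no-provisioner, the NFS PVs behind it must have been created by hand. A minimal sketch of such a PV, purely for illustration (the PV name, server, and export path are assumptions, not values from this cluster):

```yaml
# Hypothetical hand-created PV for the "nfs" no-provisioner class.
# server and path are placeholders; each PV must point at its own
# export, otherwise two PVCs can end up backed by the same files.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs
  nfs:
    server: nfs.example.com
    path: /exports/pv0
```

If several of these PVs were to share one export path, the boot disk and the hotplugged disk could end up on the same backing files, which is the kind of NFS-server misconfiguration suspected above.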
So after some more debugging, the problem is not with SELinux. I am able to hotplug NFS volumes if the boot volume is not NFS. For some reason, when virt-handler mounts the volume in the virt-launcher pod, it finds the wrong NFS disk, and that is why you are seeing the "unable to get write lock" message: that image is already locked by the boot disk. Investigating why that is happening.

Tested on CNV-v4.13.1.rhel9-123; the issue has been fixed:
```yaml
volumeStatus:
- hotplugVolume:
    attachPodName: hp-volume-sdn94
    attachPodUID: e350f374-ed7d-4ea5-8687-2096d96dac5b
  message: Successfully attach hotplugged volume blank-dv to VM
  name: blank-dv
  persistentVolumeClaimInfo:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 5Gi
    filesystemOverhead: "0.055"
    requests:
      storage: 1Gi
    volumeMode: Filesystem
  phase: Ready
  reason: VolumeReady
  target: sda
- name: cloudinitdisk
  size: 1048576
  target: vdb
- name: dv-disk
  persistentVolumeClaimInfo:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 25Gi
    filesystemOverhead: "0.055"
    requests:
      storage: 10Gi
    volumeMode: Filesystem
  target: vda
kind: List
metadata:
  resourceVersion: ""
```
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.13.1 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:3686
Description of problem:

Events:

```
Type     Reason              Age                   From                       Message
----     ------              ----                  ----                       -------
Normal   SuccessfulCreate    11m                   virtualmachine-controller  Created virtual machine pod virt-launcher-vm-fedora-c4lg4
Normal   Created             11m                   virt-handler               VirtualMachineInstance defined.
Normal   Started             11m                   virt-handler               VirtualMachineInstance started.
Normal   SuccessfulCreate    11m                   virtualmachine-controller  Created attachment pod hp-volume-bgzzw
Normal   SuccessfulCreate    11m (x6 over 11m)     virtualmachine-controller  Created hotplug attachment pod hp-volume-bgzzw, for volume blank-dv
Normal   VolumeMountedToPod  11m                   virt-handler               Volume blank-dv has been mounted in virt-launcher pod
Warning  SyncFailed          112s (x447 over 11m)  virt-handler               server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'device_add': Failed to get \"write\" lock')"
```

Version-Release number of selected component (if applicable):
CNV 4.12.0

How reproducible:
Always

Steps to Reproduce:
1. Import a DV (nfs) and create a VM
2. Create a blank DV (nfs)
3. Hotplug the disk to the VM: $ virtctl addvolume vm-fedora --volume-name=blank-dv
4. Describe the VMI

Actual results:
The VMI reports the error shown in the events above:

```
Normal   VolumeMountedToPod  11m                   virt-handler  Volume blank-dv has been mounted in virt-launcher pod
Warning  SyncFailed          112s (x447 over 11m)  virt-handler  server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'device_add': Failed to get \"write\" lock')"
```

and the volume status stays in VolumeMountedToPod:

```yaml
volumeStatus:
- hotplugVolume:
    attachPodName: hp-volume-bgzzw
    attachPodUID: 9c2e93b3-edac-48d9-bbf8-cf679ae9b8fd
  message: Volume blank-dv has been mounted in virt-launcher pod
  name: blank-dv
  persistentVolumeClaimInfo:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 5Gi
    filesystemOverhead: "0.055"
    requests:
      storage: 1Gi
    volumeMode: Filesystem
  phase: MountedToPod
  reason: VolumeMountedToPod
  target: ""
```

Expected results:
VolumeReady in the VMI's volumeStatus; hotplug works without error.

Additional info:

```yaml
---
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv1
spec:
  source:
    http:
      url: http://url/fedora-images/Fedora-Cloud-Base-34-1.2.x86_64.qcow2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
    storageClassName: nfs
    volumeMode: Filesystem
  contentType: kubevirt
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-fedora
  name: vm-fedora
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-fedora
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: dv-disk
          - disk:
              bus: virtio
            name: cloudinitdisk
        resources:
          requests:
            memory: 1024Mi
      terminationGracePeriodSeconds: 0
      volumes:
      - name: dv-disk
        dataVolume:
          name: dv1
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
            echo 'printed from cloud-init userdata'
        name: cloudinitdisk
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: blank-dv
spec:
  source:
    blank: {}
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
    storageClassName: nfs
    volumeMode: Filesystem
```
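The hotplug in step 3 can also be expressed declaratively, which is what `virtctl addvolume --persist` writes into the VM spec. A minimal sketch of that fragment, under the assumption that this KubeVirt version supports the hotpluggable field on the data-volume source (the disk and volume names come from this report; hotplugged disks use the scsi bus):

```yaml
# Hypothetical declarative equivalent of virtctl addvolume --persist.
# Not taken verbatim from this cluster; field layout per the KubeVirt v1 API.
spec:
  template:
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: scsi
            name: blank-dv
      volumes:
      - name: blank-dv
        dataVolume:
          name: blank-dv
          hotpluggable: true
```

The advantage over a plain `virtctl addvolume` is that the volume survives a VM restart, since it is part of the spec rather than a runtime-only attachment.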