Description of problem:

Events:
  Type     Reason              Age                   From                       Message
  ----     ------              ----                  ----                       -------
  Normal   SuccessfulCreate    11m                   virtualmachine-controller  Created virtual machine pod virt-launcher-vm-fedora-c4lg4
  Normal   Created             11m                   virt-handler               VirtualMachineInstance defined.
  Normal   Started             11m                   virt-handler               VirtualMachineInstance started.
  Normal   SuccessfulCreate    11m                   virtualmachine-controller  Created attachment pod hp-volume-bgzzw
  Normal   SuccessfulCreate    11m (x6 over 11m)     virtualmachine-controller  Created hotplug attachment pod hp-volume-bgzzw, for volume blank-dv
  Normal   VolumeMountedToPod  11m                   virt-handler               Volume blank-dv has been mounted in virt-launcher pod
  Warning  SyncFailed          112s (x447 over 11m)  virt-handler               server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'device_add': Failed to get \"write\" lock')"

Version-Release number of selected component (if applicable):
CNV 4.12.0

How reproducible:
Always

Steps to Reproduce:
1. Import a DV (nfs) and create a VM
2. Create a blank DV (nfs)
3. Hotplug the disk to the VM:
   $ virtctl addvolume vm-fedora --volume-name=blank-dv
4. Describe the VMI

Actual results:
The VMI reports the error shown in the description:

  Normal   VolumeMountedToPod  11m                   virt-handler  Volume blank-dv has been mounted in virt-launcher pod
  Warning  SyncFailed          112s (x447 over 11m)  virt-handler  server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'device_add': Failed to get \"write\" lock')"

and the volume status stays in VolumeMountedToPod:

  volumeStatus:
  - hotplugVolume:
      attachPodName: hp-volume-bgzzw
      attachPodUID: 9c2e93b3-edac-48d9-bbf8-cf679ae9b8fd
    message: Volume blank-dv has been mounted in virt-launcher pod
    name: blank-dv
    persistentVolumeClaimInfo:
      accessModes:
      - ReadWriteOnce
      capacity:
        storage: 5Gi
      filesystemOverhead: "0.055"
      requests:
        storage: 1Gi
      volumeMode: Filesystem
    phase: MountedToPod
    reason: VolumeMountedToPod
    target: ""

Expected results:
VolumeReady in the VMI's volumeStatus; the hotplug works without error.

Additional info:
---
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv1
spec:
  source:
    http:
      url: http://url/fedora-images/Fedora-Cloud-Base-34-1.2.x86_64.qcow2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
    storageClassName: nfs
    volumeMode: Filesystem
  contentType: kubevirt
---
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-fedora
  name: vm-fedora
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-fedora
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: dv-disk
          - disk:
              bus: virtio
            name: cloudinitdisk
        resources:
          requests:
            memory: 1024Mi
      terminationGracePeriodSeconds: 0
      volumes:
      - name: dv-disk
        dataVolume:
          name: dv1
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
            echo 'printed from cloud-init userdata'
        name: cloudinitdisk
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: blank-dv
spec:
  source:
    blank: {}
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
    storageClassName: nfs
    volumeMode: Filesystem
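The stuck phase can also be read directly from the VMI status; the following is a minimal sketch built from the volumeStatus field names shown above (output is the phase observed here):

$ oc get vmi vm-fedora -o jsonpath='{.status.volumeStatus[?(@.name=="blank-dv")].phase}{"\n"}'
MountedToPod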
Alexander PTAL
Yes, it is broken. I suspect it is because of some SELinux labeling. I created a kubevirtci cluster with both local and NFS storage, then I added a blank local volume and a blank NFS volume. When I look in the virt-launcher pod I see:

bash-5.1$ ls -alZ
total 5267412
drwxrwxrwx. 2 root root system_u:object_r:container_file_t:s0:c449,c794          64 Apr 5 13:35 .
drwxrwxrwx. 5 root root system_u:object_r:container_file_t:s0:c449,c794          96 Apr 5 13:25 ..
-rw-rw----. 1 qemu qemu system_u:object_r:container_file_t:s0:c449,c794  5073010688 Apr 5 13:33 volume-hotplug-local.img
-rw-rw----. 1 qemu qemu system_u:object_r:nfs_t:s0                      10146021376 Apr 5 13:36 volume-hotplug.img

Note that the SELinux label of the NFS volume contains nfs_t instead of container_file_t. The local volume was added successfully, but the NFS volume shows exactly the error from this report.
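For reference, the same label check can be run from outside the pod. This is a sketch using the launcher pod name from the original report; the compute container name and the /var/run/kubevirt/hotplug-disks mount path are assumptions based on KubeVirt defaults and may differ:

$ oc exec virt-launcher-vm-fedora-c4lg4 -c compute -- ls -alZ /var/run/kubevirt/hotplug-disks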
Can you show me the nfs storage class? I tried other NFS servers (trident-nfs) and it worked, so I suspect it is simply a configuration issue in the NFS server.
$ oc get sc nfs -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2023-04-03T14:03:00Z"
  name: nfs
  resourceVersion: "73762"
  uid: bd1d7758-2dc5-4e80-a439-81e4e95595b7
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate

$ oc get storageprofile nfs -o yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
  creationTimestamp: "2023-04-03T14:03:00Z"
  generation: 3
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.13.0
    cdi.kubevirt.io: ""
  name: nfs
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: CDI
    name: cdi-kubevirt-hyperconverged
    uid: 3bbcd4d7-1305-4cac-bc6a-f470672a6159
  resourceVersion: "74089"
  uid: 924b15d5-b4fe-415f-90cb-26d7f5e2bb4b
spec:
  claimPropertySets:
  - accessModes:
    - ReadWriteMany
    volumeMode: Filesystem
status:
  claimPropertySets:
  - accessModes:
    - ReadWriteMany
    volumeMode: Filesystem
  provisioner: kubernetes.io/no-provisioner
  storageClass: nfs
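Because the class uses the kubernetes.io/no-provisioner provisioner, every PVC here is backed by a manually created NFS PV. A quick way to see which export each PV actually points at is sketched below (not output from this cluster; it assumes the PVs use the in-tree nfs volume source):

$ oc get pv -o custom-columns=NAME:.metadata.name,CLAIM:.spec.claimRef.name,SERVER:.spec.nfs.server,PATH:.spec.nfs.path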
So after some more debugging, the problem is not with SELinux. I am able to hotplug NFS volumes if the boot volume is not NFS. For some reason, when virt-handler mounts the volume in the virt-launcher pod, it finds the wrong NFS disk, and that is why you are seeing the unable-to-get-lock message: that image is already locked for the boot disk. Investigating why that is happening.
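One way to confirm that collision is to compare the disk sources libvirt actually sees for the domain. A hedged sketch, assuming the compute container name and the usual <namespace>_<vm-name> libvirt domain naming:

$ oc exec virt-launcher-vm-fedora-c4lg4 -c compute -- virsh list --all
$ oc exec virt-launcher-vm-fedora-c4lg4 -c compute -- virsh domblklist <domain-from-previous-output>

If the hotplugged target resolves to the same backing file as the boot disk, QEMU's image locking refuses the second writer, which matches the Failed to get \"write\" lock message above.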
Tested on CNV-v4.13.1.rhel9-123; the issue has been fixed.

    volumeStatus:
    - hotplugVolume:
        attachPodName: hp-volume-sdn94
        attachPodUID: e350f374-ed7d-4ea5-8687-2096d96dac5b
      message: Successfully attach hotplugged volume blank-dv to VM
      name: blank-dv
      persistentVolumeClaimInfo:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 5Gi
        filesystemOverhead: "0.055"
        requests:
          storage: 1Gi
        volumeMode: Filesystem
      phase: Ready
      reason: VolumeReady
      target: sda
    - name: cloudinitdisk
      size: 1048576
      target: vdb
    - name: dv-disk
      persistentVolumeClaimInfo:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 25Gi
        filesystemOverhead: "0.055"
        requests:
          storage: 10Gi
        volumeMode: Filesystem
      target: vda
kind: List
metadata:
  resourceVersion: ""
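As an extra check, the hotplugged disk should now be visible inside the guest under the reported target (sda). A sketch, using the cloud-init credentials from the VM manifest above:

$ virtctl console vm-fedora
(log in, then inside the guest)
$ lsblk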
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.13.1 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:3686