Description of problem:
When the CDI importer fails, as in bug 1945121, the disk on the source RHV provider stays locked. When the importer then retries, it fails on the disk lock, which prevents it from actually retrying the import (copy) of the image.

Version-Release number of selected component (if applicable):
CNV-2.6.1
Fabien, could you help me assign this to the appropriate engineer on your team, to be fixed in CDI for 4.8?
I've assigned it to Matthew, since he already fixed another lock case.
Matthew, did you already fix this case or is this a unique one?
No, I did not fix this case already. I fixed a similar thing in 1924560, where ImageIO disks were locked after a clean importer shutdown. This variant was caused by an importer error, and I'm sure there are more potential cases just like it.
Matthew, can this bug be moved to MODIFIED since the attached PR is merged?
Yes, I will move it over.
@Matthew, can you please suggest verification steps, since bug 1945121 is already fixed?
I have had luck triggering a failure in the same code path by removing the importer's copy of qemu-img. If you start an import from RHV, you can oc exec bash in the importer pod and delete or move /bin/qemu-img somewhere that the importer program can't find it. You should get a failure like this:

0622 13:18:42.213320 1 data-processor.go:232] , Couldn't start qemu-img: exec: "qemu-img": executable file not found in $PATH
kubevirt.io/containerized-data-importer/pkg/image.(*qemuOperations).Info
    pkg/image/qemu.go:190
kubevirt.io/containerized-data-importer/pkg/importer.ResizeImage
    pkg/importer/data-processor.go:304
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).resize
    pkg/importer/data-processor.go:272
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
    pkg/importer/data-processor.go:224
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
    pkg/importer/data-processor.go:169
main.main
    cmd/cdi-importer/importer.go:189
runtime.main
    GOROOT/src/runtime/proc.go:203
runtime.goexit
    GOROOT/src/runtime/asm_amd64.s:1373
Resize of image failed

...but the image should not stay locked in RHV.
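For anyone repeating this, the step above can be scripted roughly as follows. This is only a sketch: the helper name `break_qemu_img` and the `OC` override variable are made up for illustration, and it assumes `rm` is available in the importer container and that the command is run while the import is still in progress.

```shell
#!/bin/sh
# OC is a hypothetical override so the command can be dry-run (OC=echo);
# it defaults to the real oc client.
OC="${OC:-oc}"

# break_qemu_img POD
# Deletes /bin/qemu-img inside the given importer pod so that the
# importer fails when it later tries to run qemu-img.
# Assumption: rm exists in the importer image and the file is removable.
break_qemu_img() {
    "$OC" exec "$1" -- rm -f /bin/qemu-img
}
```

Usage would be something like `break_qemu_img <importer-pod-name>` right after starting the import from RHV. Setting `OC=echo` first prints the command instead of running it.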
Versions:
CNV 4.8.0-451 iib 86746
MTV 2.1.0-21 iib 88402
OCP 4.8.0-rc.1

After making the CDI importer fail by removing /bin/qemu-img inside the pod, the import restarted and completed successfully.
Created attachment 1798641
describe importer-pod

Attached is the 'oc describe pod importer-pod' command output, where you can see that the importer pod is restarted on failure:

$ oc describe pod importer-mguetta-bug-ver-147c518e-ce0a-44bf-bb82-8672e52906e7
...
Status:         Running
IP:             10.128.2.100
IPs:
  IP:           10.128.2.100
Controlled By:  PersistentVolumeClaim/mguetta-bug-ver-147c518e-ce0a-44bf-bb82-8672e52906e7
Containers:
  importer:
    ...
    State:          Running
      Started:      Tue, 06 Jul 2021 08:24:02 -0400
    Last State:     Terminated
      Reason:       Error
      Message:      Unable to process data: , Couldn't start qemu-img: exec: "qemu-img": executable file not found in $PATH
      Exit Code:    1
      Started:      Tue, 06 Jul 2021 08:22:25 -0400
      Finished:     Tue, 06 Jul 2021 08:24:01 -0400
    Ready:          True
    Restart Count:  1
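If only the restart count and the last failure message are of interest, they can be pulled out of the pod status directly rather than read from the full describe output. A minimal sketch, assuming the importer is the first (and only) container in the pod; the function names and the `OC` override variable are hypothetical:

```shell
#!/bin/sh
# OC is a hypothetical override for dry runs (OC=echo); defaults to oc.
OC="${OC:-oc}"

# importer_restart_count POD
# Prints the restart count of the first container in the pod.
importer_restart_count() {
    "$OC" get pod "$1" -o 'jsonpath={.status.containerStatuses[0].restartCount}'
}

# importer_last_error POD
# Prints the termination message of the container's previous run, if any.
importer_last_error() {
    "$OC" get pod "$1" -o 'jsonpath={.status.containerStatuses[0].lastState.terminated.message}'
}
```

These read the same fields that appear as "Restart Count" and "Last State ... Message" in the describe output.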
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.8.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2920