During OCP Upgrade to 4.8.14 (CNV 4.8.z is not involved yet) from 4.7.33
-----------------------------------------------------------------------------
Source VMI Pod Version: container-native-virtualization/virt-launcher/images/v2.5.8-3
(yes, virt-launcher was still using 2.5.8-3 while on CNV 2.6.7/OCP 4.7.33)
Target VMI Pod Version: container-native-virtualization/virt-operator/images/v2.6.7-8
---
NOTE: Paid close attention to the VMI Pod versions during this upgrade.

The issue below is seen when the VMI Pod live-migrates/upgrades from 2.5.8-3 to 2.6.7-8 (during the OCP 4.8.14 upgrade):

{"component":"virt-launcher","kind":"","level":"error","msg":"Live migration failed.","name":"vm3-ocs-rhel84","namespace":"default","pos":"manager.go:565","reason":"virError(Code=9, Domain=10, Message='operation failed: migration of disk vdb failed: Source and target image have different sizes')","timestamp":"2021-10-12T19:36:46.340948Z","uid":"7c294dfc-89d6-4b1a-a60d-70e390efa0da"}
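For anyone triaging similar failures, here is a minimal Python sketch that pulls the affected VMI, disk, and libvirt error code out of a virt-launcher log line like the one above. The function name parse_migration_failure and both regular expressions are assumptions made for illustration; they are based only on this single sample line and are not part of KubeVirt or libvirt.

```python
import json
import re

# Log line copied verbatim from the virt-launcher output quoted in this comment.
LOG_LINE = (
    '{"component":"virt-launcher","kind":"","level":"error",'
    '"msg":"Live migration failed.","name":"vm3-ocs-rhel84",'
    '"namespace":"default","pos":"manager.go:565",'
    '"reason":"virError(Code=9, Domain=10, Message=\'operation failed: '
    'migration of disk vdb failed: Source and target image have different sizes\')",'
    '"timestamp":"2021-10-12T19:36:46.340948Z",'
    '"uid":"7c294dfc-89d6-4b1a-a60d-70e390efa0da"}'
)

# Assumed shape of the "reason" field, inferred from the one sample above.
VIR_ERROR_RE = re.compile(r"virError\(Code=(\d+), Domain=(\d+), Message='(.*)'\)")
# Assumed wording of the per-disk failure message, inferred from the same sample.
DISK_RE = re.compile(r"migration of disk (\S+) failed")


def parse_migration_failure(line):
    """Return a dict describing a failed migration, or None if the line is not one."""
    entry = json.loads(line)
    if entry.get("level") != "error":
        return None
    match = VIR_ERROR_RE.search(entry.get("reason", ""))
    if match is None:
        return None
    message = match.group(3)
    disk = DISK_RE.search(message)
    return {
        "vmi": f'{entry.get("namespace")}/{entry.get("name")}',
        "code": int(match.group(1)),
        "domain": int(match.group(2)),
        "disk": disk.group(1) if disk else None,
        "message": message,
    }


if __name__ == "__main__":
    print(parse_migration_failure(LOG_LINE))
```

Lines that are not error-level, or whose reason field does not match the assumed virError pattern, yield None, so the helper can be run over a full launcher log without special-casing other entries.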
This new failure is caused by virt-chroot not working properly in CNV 2.6. More specifically, the --user option fails for every user, even root, saying the user doesn't exist.
I have not figured out why that happens.

However, after talking to Roman, we figured out that virt-chroot was not needed in the codepath involved in the issue, and in fact just made the code unnecessarily complicated.
So I pushed a fix to KubeVirt main and backported it to release-0.36 (linked above), which should fix the issue (by not using virt-chroot anymore).

It is worth noting that another (unrelated) function, GetImageInfo(), uses `virt-chroot --user`, and in that case the use of virt-chroot makes sense.
I assume that function does not work in CNV 2.6 either, but I'm not sure what the impact of that is.
(In reply to Jed Lejosne from comment #4)
> This new failure is caused by virt-chroot not working properly in CNV 2.6.
> More specifically, the --user option which fails on every user, even root,
> saying the user doesn't exist.
> I have not figured out why that happens.
>
> However, after talking to Roman, we figured out that virt-chroot was not
> needed in the codepath involved in the issue, and in fact just made the code
> unnecessarily complicated.
> So I pushed a fix to KubeVirt main and backported it to release-0.36 (linked
> above), which should fix the issue (by not using virt-chroot anymore).
>
> It is worth noting that another (unrelated) function uses `virt-chroot
> --user`, GetImageInfo(), and in that case the use of virt-chroot makes sense.
> I assume that function does not work in CNV 2.6 either, but I'm not sure
> what the impact of it is.

This is not an issue. GetImageInfo() is only called at startup, when the new launcher image is already in use. (There is a small race window in which new handlers can get old launcher pods. That should normally be compatible too, but it is not worth fixing here and it would only be a transient error.)
Verified with build: v2.6.8-22

Summary:
Start with a VM on 2.5.8 (CNV 2.5.8, OCP 4.6).
Did the OCP upgrade to 4.7, applied the 2.6.8 ICSP immediately, and started the CNV upgrade with the following scenarios:

Scenario 1: Upgrade CNV From: 2.5.8 To: 2.6.4
  Migrate VM in CNV 2.6.4
    LiveMigration - PASSED
    Virt-Launcher version - 2.6.4/2.6.3-2
  Continue the upgrade to CNV 2.6.8
    Virt-Launcher version 2.6.4/2.6.3-2 to 2.6.8-5 - PASSED

Scenario 2: Upgrade CNV From: 2.5.8 To: 2.6.5
  Migrate VM in CNV 2.6.5
    LiveMigration - PASSED
    Virt-Launcher version - 2.6.5-2
  Continue the upgrade to CNV 2.6.8
    Virt-Launcher version 2.6.5-2 to 2.6.8-5 - PASSED

Scenario 3: Upgrade CNV From: 2.5.8 To: 2.6.6
  Migrate VM in CNV 2.6.6
    LiveMigration - PASSED
    Virt-Launcher version - 2.6.6-7
  Continue the upgrade to CNV 2.6.8
    Virt-Launcher version 2.6.6-7 to 2.6.8-5 - PASSED

Scenario 4: Upgrade CNV From: 2.5.8 To: 2.6.7
  Migrate VM in CNV 2.6.7
    LiveMigration - FAILED (https://bugzilla.redhat.com/show_bug.cgi?id=2019705)
    Virt-Launcher version 2.5.8 to 2.6.7
    Source Virt-Launcher Pod 2.5.8 continues to be in Running state.
    Target Virt-Launcher Pod 2.6.7 enters Completed state.
    VMIM Object shows Status: FAILED
  Continue the upgrade to CNV 2.6.8
    Virt-Launcher version 2.5.8 to 2.6.8-5 - PASSED

Scenario 5: Upgrade CNV From: 2.5.8 To: 2.6.8
  Was tested as part of Scenario 4 itself.
  As we see above, the virt-launcher upgrade from version 2.5.8 to 2.6.8-5 - PASSED

Moving this to VERIFIED.
LiveMigration of VMI with the following scenario: PASSED
Source Virt-Launcher Pod: v2.6.7
Target Virt-Launcher Pod: v2.6.8-5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 2.6.8 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4725