Bug 2013494
Summary: | [CNV-2.6.8] VMI is in LiveMigrate loop when Upgrading Cluster from 2.6.7/4.7.32 to OCP 4.8.13 | ||
---|---|---|---|
Product: | Container Native Virtualization (CNV) | Reporter: | Kedar Bidarkar <kbidarka> |
Component: | Virtualization | Assignee: | Jed Lejosne <jlejosne> |
Status: | CLOSED ERRATA | QA Contact: | Israel Pinto <ipinto> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 2.6.7 | CC: | cnv-qe-bugs, dvossel, fdeutsch, ipinto, jlejosne, lpivarc, rmohr, sgott, stirabos, vromanso, zpeng |
Target Milestone: | --- | ||
Target Release: | 2.6.8 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | virt-operator-container-v2.6.8-5 hco-bundle-registry-container-v2.6.8-22 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 2008511 | Environment: | |
Last Closed: | 2021-11-17 18:40:02 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Bug Depends On: | 2008511 | ||
Bug Blocks: | 2010742 |
Comment 1
Kedar Bidarkar
2021-10-13 04:28:18 UTC
This new failure is caused by virt-chroot not working properly in CNV 2.6. More specifically, the --user option fails for every user, even root, saying the user doesn't exist. I have not figured out why that happens.

However, after talking to Roman, we figured out that virt-chroot was not needed in the codepath involved in the issue, and in fact just made the code unnecessarily complicated. So I pushed a fix to KubeVirt main and backported it to release-0.36 (linked above), which should fix the issue (by not using virt-chroot anymore).

It is worth noting that another (unrelated) function, GetImageInfo(), uses `virt-chroot --user`, and in that case the use of virt-chroot makes sense. I assume that function does not work in CNV 2.6 either, but I'm not sure what the impact of it is.

(In reply to Jed Lejosne from comment #4)
> It is worth noting that another (unrelated) function uses `virt-chroot
> --user`, GetImageInfo(), and in that case the use of virt-chroot makes sense.
> I assume that function does not work in CNV 2.6 either, but I'm not sure
> what the impact of it is.

This is not an issue. This is only called at startup, where the new launcher image is already in use. (There is a small race window where new handlers can get old launcher pods. That should normally be compatible too, but it is not worth fixing here and it would only be a transient error.)
Verified with build: v2.6.8-22

Summary:
Start with a VM in 2.5.8 (CNV 2.5.8, OCP 4.6).
Upgraded OCP to 4.7, applied the 2.6.8 ICSP immediately, and started the CNV upgrade with the following scenarios:

Scenario 1: Upgrade CNV From: 2.5.8 To: 2.6.4
Migrate VM in CNV 2.6.4
LiveMigration - PASSED
Virt-Launcher version - 2.6.4/2.6.3-2
Continue the upgrade to CNV 2.6.8
Virt-Launcher version 2.6.4/2.6.3-2 to 2.6.8-5 - PASSED

Scenario 2: Upgrade CNV From: 2.5.8 To: 2.6.5
Migrate VM in CNV 2.6.5
LiveMigration - PASSED
Virt-Launcher version - 2.6.5-2
Continue the upgrade to CNV 2.6.8
Virt-Launcher version 2.6.5-2 to 2.6.8-5 - PASSED

Scenario 3: Upgrade CNV From: 2.5.8 To: 2.6.6
Migrate VM in CNV 2.6.6
LiveMigration - PASSED
Virt-Launcher version - 2.6.6-7
Continue the upgrade to CNV 2.6.8
Virt-Launcher version 2.6.6-7 to 2.6.8-5 - PASSED

Scenario 4: Upgrade CNV From: 2.5.8 To: 2.6.7
Migrate VM in CNV 2.6.7
LiveMigration - FAILED (https://bugzilla.redhat.com/show_bug.cgi?id=2019705)
Virt-Launcher version 2.5.8 to 2.6.7
Source Virt-Launcher Pod 2.5.8 continues to be in Running state.
Target Virt-Launcher Pod 2.6.7 enters Completed state.
VMIM Object shows Status: FAILED
Continue the upgrade to CNV 2.6.8
Virt-Launcher version 2.5.8 to 2.6.8-5 - PASSED

Scenario 5: Upgrade CNV From: 2.5.8 To: 2.6.8
Was tested as part of Scenario 4 itself. As we see above, the virt-launcher upgrade from version 2.5.8 to 2.6.8-5 - PASSED

Moving this to verified.

LiveMigration of VMI with the following scenario: PASSED
Source Virt-Launcher Pod: v2.6.7
Target Virt-Launcher Pod: v2.6.8-5

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.8 Images security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4725
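
For readers who want to track the verification matrix programmatically, the scenario results reported above can be encoded as plain data. The following is only an illustrative sketch in Go: the `Scenario` type, its field names, and the helper functions are hypothetical and are not part of KubeVirt or any CNV tooling. The values are copied from the verification comment; Scenario 5 is omitted because it was covered by Scenario 4.

```go
package main

import "fmt"

// Scenario is a hypothetical record of one upgrade path from the
// verification comment. Every path starts at CNV 2.5.8.
type Scenario struct {
	Name               string
	Intermediate       string // CNV version upgraded to before 2.6.8
	MigrationPassed    bool   // live migration result at the intermediate version
	FinalUpgradePassed bool   // virt-launcher update to 2.6.8-5
}

// verificationMatrix returns the results as reported in the bug comment.
func verificationMatrix() []Scenario {
	return []Scenario{
		{"Scenario 1", "2.6.4", true, true},
		{"Scenario 2", "2.6.5", true, true},
		{"Scenario 3", "2.6.6", true, true},
		{"Scenario 4", "2.6.7", false, true},
	}
}

// failedIntermediate lists the intermediate versions at which the
// live migration failed.
func failedIntermediate(m []Scenario) []string {
	var failed []string
	for _, s := range m {
		if !s.MigrationPassed {
			failed = append(failed, s.Intermediate)
		}
	}
	return failed
}

// allFinalUpgradesPassed reports whether every path reached 2.6.8-5.
func allFinalUpgradesPassed(m []Scenario) bool {
	for _, s := range m {
		if !s.FinalUpgradePassed {
			return false
		}
	}
	return true
}

func main() {
	m := verificationMatrix()
	fmt.Println("intermediate live-migration failures:", failedIntermediate(m))
	fmt.Println("all updates to 2.6.8-5 passed:", allFinalUpgradesPassed(m))
}
```

Keeping the intermediate-migration result separate from the final-upgrade result mirrors the report: the only failure was the live migration at 2.6.7, while every path still completed the update to 2.6.8-5.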