Bug 2042402 - LiveMigration with postcopy misbehave when failure occurs
Summary: LiveMigration with postcopy misbehave when failure occurs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.9.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: lpivarc
QA Contact: Denys Shchedrivyi
URL:
Whiteboard:
Duplicates: 2042858
Depends On:
Blocks:
Reported: 2022-01-19 12:49 UTC by Kedar Bidarkar
Modified: 2023-11-13 08:16 UTC (History)

Fixed In Version: hco-bundle-registry-container-v4.11.0-315 virt-launcher-container-v4.11.0-55
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-14 19:28:30 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt kubevirt pull 7230 | 0 | None | Merged | Fix handling of failed post-copy migration switch | 2022-05-10 13:21:29 UTC
Red Hat Issue Tracker | CNV-15903 | 0 | None | None | None | 2023-11-13 08:16:21 UTC
Red Hat Product Errata | RHSA-2022:6526 | 0 | None | None | None | 2022-09-14 19:28:56 UTC

Description Kedar Bidarkar 2022-01-19 12:49:29 UTC
Description of problem:
The following test fails intermittently with the errors logged below:
VM Live Migration Starting a VirtualMachineInstance migration postcopy [test_id:4747] should migrate using cluster level config for postcopy


{"component":"virt-launcher","kind":"","level":"info","msg":"Starting post copy mode for migration","name":"testvmi-2ldvf","namespace":"kubevirt-test-default1","pos":"live-migration-source.go:577","timestamp":"2022-01-07T10:26:36.433194Z","uid":"cb664df9-edee-44b1-a8d4-32aa345cac1d"}
{"component":"virt-launcher","level":"error","msg":"internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started","pos":"qemuMonitorJSONCheckErrorFull:418","subcomponent":"libvirt","thread":"44","timestamp":"2022-01-07T10:26:36.433000Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"failed to start post migration","name":"testvmi-2ldvf","namespace":"kubevirt-test-default1","pos":"live-migration-source.go:582","reason":"virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started')","timestamp":"2022-01-07T10:26:36.433875Z","uid":"cb664df9-edee-44b1-a8d4-32aa345cac1d"}

Version-Release number of selected component (if applicable):
4.9.2

How reproducible:
Hard to reproduce; the test case must be re-run multiple times to hit this error.

Steps to Reproduce:
1. Update the cluster-level config for postcopy.
2. Live-migrate a VM with "postcopy" migration.
3. The VM fails to live-migrate.
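
For step 1, a minimal sketch of what the cluster-level change might look like, expressed as a merge-patch body for the KubeVirt custom resource. The field names `allowPostCopy` and `completionTimeoutPerGiB` under `spec.configuration.migrations` are assumed from upstream KubeVirt's migration configuration and are not stated in this report; on a CNV cluster the KubeVirt resource is managed by the HyperConverged operator, so the exact mechanism used by the test may differ. The helper name is hypothetical.

```python
import json


def postcopy_migration_patch(allow_post_copy=True, completion_timeout_per_gib=None):
    """Build a merge-patch body that enables post-copy in the migration config.

    Field names are assumptions based on upstream KubeVirt, not on this report.
    """
    migrations = {"allowPostCopy": allow_post_copy}
    if completion_timeout_per_gib is not None:
        # A short timeout makes a pre-copy migration give up sooner,
        # which is when a switch to post-copy would be attempted.
        migrations["completionTimeoutPerGiB"] = completion_timeout_per_gib
    return {"spec": {"configuration": {"migrations": migrations}}}


print(json.dumps(postcopy_migration_patch(completion_timeout_per_gib=1), indent=2))
```

The printed JSON could be passed to a patch command of choice; the sketch only builds the body and does not contact a cluster.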

Actual results:
reason":"virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started')

"msg":"internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started"

Expected results:

Migration with postcopy is always successful.


Additional info:

Comment 1 lpivarc 2022-01-20 09:16:58 UTC
Additional information:
1. We fail to sync guest time in this case.
2. Migration will be executed but not in post-copy mode.
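
The second point can be illustrated with a small sketch: if the switch to post-copy fails, the migration keeps running in pre-copy mode, so the recorded mode should only change once the switch has actually succeeded. This is an illustration under stated assumptions, not the code from the linked KubeVirt pull request; every name below (`MigrationState`, `PostCopySwitchError`, `try_switch_to_postcopy`) is hypothetical.

```python
from dataclasses import dataclass


class PostCopySwitchError(Exception):
    """Hypothetical error raised when the hypervisor rejects the post-copy switch."""


@dataclass
class MigrationState:
    mode: str = "PreCopy"   # mode the migration is actually running in
    switch_error: str = ""  # last error from a failed switch attempt, if any


def try_switch_to_postcopy(start_postcopy, state):
    """Attempt the switch; record PostCopy only if the call succeeded."""
    try:
        start_postcopy()
    except PostCopySwitchError as err:
        # The migration continues in pre-copy mode, so the mode is left untouched.
        state.switch_error = str(err)
        return False
    state.mode = "PostCopy"
    return True


def rejected_switch():
    # Stand-in for the failing call seen in the logs of this report.
    raise PostCopySwitchError("Postcopy must be started after migration has been started")


state = MigrationState()
try_switch_to_postcopy(rejected_switch, state)
print(state.mode, "|", state.switch_error)
```

Keeping the mode update after the call, rather than before it, means anything that later depends on the mode sees what actually happened.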

Comment 2 sgott 2022-01-28 22:12:02 UTC
*** Bug 2042858 has been marked as a duplicate of this bug. ***

Comment 5 Denys Shchedrivyi 2022-05-23 13:47:50 UTC
Verified on CNV v4.11.0-334

No postcopy migration failures after 1000 runs

Comment 8 errata-xmlrpc 2022-09-14 19:28:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6526

