Bug 2042860

Summary: LiveMigration with postcopy misbehaves when a failure occurs
Product: Container Native Virtualization (CNV) Reporter: lpivarc
Component: Virtualization Assignee: lpivarc
Status: CLOSED DUPLICATE QA Contact: Kedar Bidarkar <kbidarka>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.8.0 CC: cnv-qe-bugs, sgott
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-01-28 22:08:50 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description lpivarc 2022-01-20 09:30:00 UTC
This bug was initially created as a copy of Bug #2042402

I am copying this bug because it affects all versions.



Description of problem:
Detected by the following test - VM Live Migration Starting a VirtualMachineInstance migration postcopy [test_id:4747] should migrate using cluster level config for postcopy


Log of failure to switch to post-copy:

{"component":"virt-launcher","kind":"","level":"info","msg":"Starting post copy mode for migration","name":"testvmi-2ldvf","namespace":"kubevirt-test-default1","pos":"live-migration-source.go:577","timestamp":"2022-01-07T10:26:36.433194Z","uid":"cb664df9-edee-44b1-a8d4-32aa345cac1d"}
{"component":"virt-launcher","level":"error","msg":"internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started","pos":"qemuMonitorJSONCheckErrorFull:418","subcomponent":"libvirt","thread":"44","timestamp":"2022-01-07T10:26:36.433000Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"failed to start post migration","name":"testvmi-2ldvf","namespace":"kubevirt-test-default1","pos":"live-migration-source.go:582","reason":"virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started')","timestamp":"2022-01-07T10:26:36.433875Z","uid":"cb664df9-edee-44b1-a8d4-32aa345cac1d"}
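Because the failure is intermittent, it can help to scan virt-launcher logs from repeated test runs for this specific error. The following is a minimal sketch in Go, assuming the logs are available as one JSON object per line in the shape shown above. The names logEntry and isPostcopySwitchFailure are hypothetical helpers for illustration and are not part of KubeVirt.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds the subset of virt-launcher log fields used here.
// The field names are taken from the log lines quoted in this report.
type logEntry struct {
	Level  string `json:"level"`
	Msg    string `json:"msg"`
	Reason string `json:"reason"`
	Pos    string `json:"pos"`
}

// postcopyTooEarly is the QEMU error text quoted in this report.
const postcopyTooEarly = "Postcopy must be started after migration has been started"

// isPostcopySwitchFailure reports whether a JSON log line is an
// error-level entry whose msg or reason carries the QEMU error text.
func isPostcopySwitchFailure(line string) (bool, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return false, err
	}
	if e.Level != "error" {
		return false, nil
	}
	hit := strings.Contains(e.Msg, postcopyTooEarly) ||
		strings.Contains(e.Reason, postcopyTooEarly)
	return hit, nil
}

func main() {
	// Shortened copies of two of the log lines above.
	lines := []string{
		`{"component":"virt-launcher","level":"info","msg":"Starting post copy mode for migration","pos":"live-migration-source.go:577"}`,
		`{"component":"virt-launcher","level":"error","msg":"internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started","pos":"qemuMonitorJSONCheckErrorFull:418"}`,
	}
	for _, l := range lines {
		hit, err := isPostcopySwitchFailure(l)
		if err != nil {
			fmt.Println("skipping unparsable line:", err)
			continue
		}
		fmt.Println(hit)
	}
}
```

Matching on the message text rather than on the "pos" field keeps the check independent of source line numbers, which change between versions.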

Version-Release number of selected component (if applicable):
4.8.*

How reproducible:
Hard to reproduce. The test case needs to be re-run multiple times to hit this error.
The bug only shows up when the switch to post-copy fails.

Steps to Reproduce:
1. Update cluster-level config for postcopy.
2. LiveMigrate a VM with "postcopy" migration.
3. VM fails to LiveMigrate.

Actual results:
"reason":"virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started')"

"msg":"internal error: unable to execute QEMU command 'migrate-start-postcopy': Postcopy must be started after migration has been started"

1. We fail to sync guest time in this case.
2. Migration will be executed but not in post-copy mode.
3. Host devices (SR-IOV) will not be attached (this is an already known bug).

Expected results:

Migration with postcopy always succeeds, and it switches to post-copy mode when needed.


Additional info:

Comment 1 sgott 2022-01-28 22:08:50 UTC
Marking this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2042858 since both clones are now targeted to the same release.

*** This bug has been marked as a duplicate of bug 2042858 ***