Bug 2107892
Summary: | Migrate parameters are not restored if kill virtproxyd/virtqemud during migration
---|---
Product: | Red Hat Enterprise Linux 9
Component: | libvirt
libvirt sub component: | General
Version: | 9.1
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
Keywords: | Triaged
Reporter: | Fangge Jin <fjin>
Assignee: | Jiri Denemark <jdenemar>
QA Contact: | Fangge Jin <fjin>
CC: | dzheng, jdenemar, lcheng, lmen, virt-maint
Hardware: | Unspecified
OS: | Unspecified
Target Milestone: | rc
Fixed In Version: | libvirt-8.5.0-4.el9
Target Upstream Version: | 8.6.0
Last Closed: | 2022-11-15 10:04:47 UTC
Type: | Bug

*** Bug 2107893 has been marked as a duplicate of this bug. ***

Patches sent upstream for review: https://listman.redhat.com/archives/libvir-list/2022-July/233114.html

BTW, migration capabilities are not reset either, but that's not such a big issue as the unused ones are disabled when a new migration starts.

Fixed upstream by

commit c7238941357f0d2e94524cf8c5ad7d9c82dcf2f9
Refs: v8.5.0-186-gc723894135
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Jul 19 13:48:44 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:00 2022 +0200

    qemu_migration: Store original migration params in status XML

    We keep original values of migration parameters so that we can restore
    them at the end of migration to make sure later migration does not use
    some random values. However, this does not really work when libvirt
    daemon is restarted on the source host because we failed to explicitly
    save the status XML after getting the migration parameters from QEMU.
    Actually it might work if the status XML is written later for some other
    reason such as domain state change, but that's not how it should work.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>

commit c0824fd03802085db698c10fe62c98cc95a57941
Refs: v8.5.0-187-gc0824fd038
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Jul 21 15:59:51 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:00 2022 +0200

    qemu_migration_params: Refactor qemuMigrationParamsApply

    qemuMigrationParamsApply restricts when capabilities can be set, but
    this is not useful in all cases. Let's create new helpers for setting
    migration capabilities and parameters which can be reused in more places
    without the restriction.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>

commit c47f1abb81194461377a0c608a7ecd87f9ce9146
Refs: [fixes], v8.5.0-188-gc47f1abb81
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Jul 21 16:49:09 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:01 2022 +0200

    qemu_migration_params: Refactor qemuMigrationParamsReset

    Because qemuMigrationParamsReset used to call qemuMigrationParamsApply
    for resetting migration capabilities and parameters, it did not work
    well since commit v5.1.0-83-ga1dec315c9 which only allowed capabilities
    to be set from an async job. However, when reconnecting to running
    domains after daemon restart we do not have an async job. Thus the
    capabilities were not properly reset in case the daemon was restarted
    during an ongoing migration.

    We need to avoid calling qemuMigrationParamsApply to make sure both
    parameters and capabilities can be reset by a normal job.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>

Versions:
libvirt-client-8.5.0-4.el9.x86_64
qemu-kvm-7.0.0-9.el9.x86_64

Test matrix:
p2p, kill src virtqemud, pass
p2p, kill dest virtqemud, pass
p2p, kill dest virtproxyd, fail
non-p2p, kill src virtqemud, pass
non-p2p, kill dest virtqemud, pass
non-p2p, kill dest virtproxyd, pass

Verified with
libvirt-8.5.0-5.el9.x86_64
qemu-kvm-7.0.0-10.el9.x86_64

Test matrix:
p2p, kill src virtqemud, pass
p2p, kill dest virtqemud, pass
p2p, kill dest virtproxyd, fail (Bug 2114866)
non-p2p, kill src virtqemud, pass
non-p2p, kill dest virtqemud, pass
non-p2p, kill dest virtproxyd, pass

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Low: libvirt security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8003

Created attachment 1897802 [details]
libvirt and qemu log

Description of problem:
Migrate with --parallel-connections and, while the migration is running, kill the dest virtproxyd (for p2p migration) or the src virtqemud (for p2p or non-p2p migration). A second migration without --parallel-connections then fails.

Version-Release number of selected component (if applicable):
libvirt-8.5.0-1.el9.x86_64
qemu-kvm-7.0.0-8.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a vm.
2. Do migration with --parallel-connections:
   # virsh migrate vm1 qemu+tcp://******/system --live --postcopy --parallel --compressed --comp-methods xbzrle --bandwidth 4 --parallel-connections 4 [--p2p]
3. Kill dest virtproxyd if it is p2p migration, or kill src virtqemud if it is p2p or non-p2p migration.
4. Check the migration status: it failed. Check the migration parameters: they are not restored:
   # virsh qemu-monitor-command vm1 '{"execute":"query-migrate-parameters"}'
   {"return":{"cpu-throttle-tailslow":false,"xbzrle-cache-size":67108864,"cpu-throttle-initial":20,"announce-max":550,"decompress-threads":2,"compress-threads":8,"compress-level":1,"multifd-channels":4,"multifd-zstd-level":1,"announce-initial":50,"block-incremental":false,"compress-wait-thread":true,"downtime-limit":300,"tls-authz":"","multifd-compression":"none","announce-rounds":5,"announce-step":100,"tls-creds":"","multifd-zlib-level":1,"max-cpu-throttle":99,"max-postcopy-bandwidth":0,"tls-hostname":"","throttle-trigger-threshold":50,"max-bandwidth":4194304,"x-checkpoint-delay":20000,"cpu-throttle-increment":10},"id":"libvirt-14"}
5. Do migration again without --parallel-connections; it fails:
   # virsh migrate vm1 qemu+tcp://******/system --live --postcopy --parallel --compressed --comp-methods xbzrle --bandwidth 4 [--p2p]

Actual results:
As in step 5, the migration failed.

Expected results:
Step 5 succeeds.

Additional info:
Dest qemu log:
2022-07-17T08:02:17.058157Z qemu-kvm: socket_accept_incoming_migration: Extra incoming migration connection; ignoring
2022-07-17T08:02:17.058189Z qemu-kvm: socket_accept_incoming_migration: Extra incoming migration connection; ignoring

Src qemu log:
2022-07-17 08:02:17.068+0000: initiating migration
2022-07-17T08:02:17.081109Z qemu-kvm: Unable to write to socket: Connection reset by peer
2022-07-17T08:02:17.118964Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-07-17T08:02:17.118977Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-07-17T08:02:17.118982Z qemu-kvm: Unable to read from socket: Bad file descriptor
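
The check in step 4 can be scripted by comparing the query-migrate-parameters reply against a snapshot taken before the first migration. The following is a minimal Python sketch, not part of the original report. The helper names are hypothetical, and the baseline values are illustrative assumptions (the report does not include a pre-migration snapshot); only the "after" values are copied from the step 4 output.

```python
import json


def parse_reply(reply_text):
    """Extract the "return" object from a query-migrate-parameters reply."""
    return json.loads(reply_text)["return"]


def diff_migration_params(baseline, current):
    """Return {name: (baseline_value, current_value)} for parameters that differ."""
    changed = {}
    for name in sorted(set(baseline) | set(current)):
        before = baseline.get(name)
        after = current.get(name)
        if before != after:
            changed[name] = (before, after)
    return changed


# Trimmed excerpt of the step 4 reply from this report.
after_failed_migration = parse_reply(
    '{"return":{"multifd-channels":4,"max-bandwidth":4194304,'
    '"xbzrle-cache-size":67108864,"downtime-limit":300},"id":"libvirt-14"}'
)

# ASSUMPTION: hypothetical snapshot taken before step 2. These values are
# illustrative only and are not taken from the report.
baseline = {
    "multifd-channels": 2,
    "max-bandwidth": 134217728,
    "xbzrle-cache-size": 67108864,
    "downtime-limit": 300,
}

leftover = diff_migration_params(baseline, after_failed_migration)
for name, (before, after) in leftover.items():
    print(f"{name}: expected {before!r}, found {after!r}")
```

With these inputs the leftover entries are multifd-channels and max-bandwidth, which correspond to the --parallel-connections 4 and --bandwidth 4 options of the interrupted migration in step 2. An empty result would indicate that the parameters were restored.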