Bug 2107892 - Migrate parameters are not restored if kill virtproxyd/virtqemud during migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Fangge Jin
URL:
Whiteboard:
Duplicates: 2107893
Depends On:
Blocks:
Reported: 2022-07-17 09:19 UTC by Fangge Jin
Modified: 2022-11-15 10:40 UTC (History)

Fixed In Version: libvirt-8.5.0-4.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-15 10:04:47 UTC
Type: Bug
Target Upstream Version: 8.6.0
Embargoed:


Attachments
libvirt and qemu log (191.23 KB, application/x-bzip)
2022-07-17 09:19 UTC, Fangge Jin


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker LIBVIRTAT-13416 0 None None None 2022-09-20 17:05:05 UTC
Red Hat Issue Tracker RHELPLAN-127939 0 None None None 2022-07-17 09:56:37 UTC
Red Hat Product Errata RHSA-2022:8003 0 None None None 2022-11-15 10:04:59 UTC

Description Fangge Jin 2022-07-17 09:19:50 UTC
Created attachment 1897802
libvirt and qemu log

Description of problem:
Run a migration with --parallel-connections and, while it is in progress, kill the destination virtproxyd (for p2p migration) or the source virtqemud (for p2p or non-p2p migration). A subsequent migration without --parallel-connections then fails.

Version-Release number of selected component (if applicable):
libvirt-8.5.0-1.el9.x86_64
qemu-kvm-7.0.0-8.el9.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Start a VM.

2. Start a migration with --parallel-connections:
   # virsh migrate vm1 qemu+tcp://******/system --live --postcopy --parallel --compressed --comp-methods xbzrle --bandwidth 4 --parallel-connections 4 [--p2p]

3. Kill the destination virtproxyd if this is a p2p migration, or kill the source virtqemud if this is a p2p or non-p2p migration.

4. Check the migration status: the migration has failed. Then query the migration parameters: they have not been restored:
   # virsh qemu-monitor-command vm1 '{"execute":"query-migrate-parameters"}'
{"return":{"cpu-throttle-tailslow":false,"xbzrle-cache-size":67108864,"cpu-throttle-initial":20,"announce-max":550,"decompress-threads":2,"compress-threads":8,"compress-level":1,"multifd-channels":4,"multifd-zstd-level":1,"announce-initial":50,"block-incremental":false,"compress-wait-thread":true,"downtime-limit":300,"tls-authz":"","multifd-compression":"none","announce-rounds":5,"announce-step":100,"tls-creds":"","multifd-zlib-level":1,"max-cpu-throttle":99,"max-postcopy-bandwidth":0,"tls-hostname":"","throttle-trigger-threshold":50,"max-bandwidth":4194304,"x-checkpoint-delay":20000,"cpu-throttle-increment":10},"id":"libvirt-14"}



5. Migrate again without --parallel-connections; the migration fails:
   # virsh migrate vm1 qemu+tcp://******/system --live --postcopy --parallel --compressed --comp-methods xbzrle --bandwidth 4 [--p2p]
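
The check in step 4 can be scripted by taking one snapshot of the migration parameters before step 2 and another after step 3, then comparing the two. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that virsh is on PATH and that the domain is reachable under the name passed in (vm1 in the steps above).

```python
import json
import subprocess


def query_migrate_parameters(domain):
    """Return QEMU's current migration parameters for a libvirt domain.

    Runs the same monitor command as step 4 and parses its JSON reply.
    """
    out = subprocess.run(
        ["virsh", "qemu-monitor-command", domain,
         '{"execute":"query-migrate-parameters"}'],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["return"]


def changed_parameters(before, after):
    """Map each parameter whose value differs to a (before, after) pair."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}
```

Calling query_migrate_parameters("vm1") before step 2 and again after step 3, and passing both results to changed_parameters, should return an empty dict once the parameters are restored correctly. Any entry that remains (for example multifd-channels or max-bandwidth) is a value left over from the interrupted migration.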


Actual results:
The migration in step 5 fails.

Expected results:
The migration in step 5 succeeds.

Additional info:
Dest qemu log:
2022-07-17T08:02:17.058157Z qemu-kvm: socket_accept_incoming_migration: Extra incoming migration connection; ignoring
2022-07-17T08:02:17.058189Z qemu-kvm: socket_accept_incoming_migration: Extra incoming migration connection; ignoring

Src qemu log:
2022-07-17 08:02:17.068+0000: initiating migration
2022-07-17T08:02:17.081109Z qemu-kvm: Unable to write to socket: Connection reset by peer
2022-07-17T08:02:17.118964Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-07-17T08:02:17.118977Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-07-17T08:02:17.118982Z qemu-kvm: Unable to read from socket: Bad file descriptor

Comment 1 Jiri Denemark 2022-07-18 08:48:45 UTC
*** Bug 2107893 has been marked as a duplicate of this bug. ***

Comment 2 Jiri Denemark 2022-07-22 15:14:33 UTC
Patches sent upstream for review: https://listman.redhat.com/archives/libvir-list/2022-July/233114.html

Comment 3 Jiri Denemark 2022-07-25 13:06:02 UTC
BTW, migration capabilities are not reset either, but that's not such a big
issue as the unused ones are disabled when a new migration starts.

Comment 4 Jiri Denemark 2022-07-26 09:38:22 UTC
Fixed upstream by

commit c7238941357f0d2e94524cf8c5ad7d9c82dcf2f9
Refs: v8.5.0-186-gc723894135
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Jul 19 13:48:44 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:00 2022 +0200

    qemu_migration: Store original migration params in status XML

    We keep original values of migration parameters so that we can restore
    them at the end of migration to make sure later migration does not use
    some random values. However, this does not really work when libvirt
    daemon is restarted on the source host because we failed to explicitly
    save the status XML after getting the migration parameters from QEMU.
    Actually it might work if the status XML is written later for some other
    reason such as domain state change, but that's not how it should work.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>

commit c0824fd03802085db698c10fe62c98cc95a57941
Refs: v8.5.0-187-gc0824fd038
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Jul 21 15:59:51 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:00 2022 +0200

    qemu_migration_params: Refactor qemuMigrationParamsApply

    qemuMigrationParamsApply restricts when capabilities can be set, but
    this is not useful in all cases. Let's create new helpers for setting
    migration capabilities and parameters which can be reused in more places
    without the restriction.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>

commit c47f1abb81194461377a0c608a7ecd87f9ce9146
Refs: [fixes], v8.5.0-188-gc47f1abb81
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Jul 21 16:49:09 2022 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Jul 26 10:09:01 2022 +0200

    qemu_migration_params: Refactor qemuMigrationParamsReset

    Because qemuMigrationParamsReset used to call qemuMigrationParamsApply
    for resetting migration capabilities and parameters, it did not work
    well since commit v5.1.0-83-ga1dec315c9 which only allowed capabilities
    to be set from an async job. However, when reconnecting to running
    domains after daemon restart we do not have an async job. Thus the
    capabilities were not properly reset in case the daemon was restarted
    during an ongoing migration. We need to avoid calling
    qemuMigrationParamsApply to make sure both parameters and capabilities
    can be reset by a normal job.

    https://bugzilla.redhat.com/show_bug.cgi?id=2107892

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Michal Privoznik <mprivozn>
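
The idea behind the first commit above (write the original parameter values to persistent state as soon as they are known, so that a restarted daemon can still restore them) can be illustrated with a toy model. This is not libvirt code: the class, the function, and the JSON file that stands in for the status XML are all hypothetical and exist only to show the ordering that matters.

```python
import json
import os


class MigrationParamsStore:
    """Toy stand-in for a per-domain status file (hypothetical, not libvirt)."""

    def __init__(self, path):
        self.path = path

    def save_original(self, params):
        # Write to disk immediately, so a daemon restart cannot lose the
        # values; os.replace swaps the finished file in atomically.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(params, f)
        os.replace(tmp, self.path)

    def load_original(self):
        # Return the saved values, or None if nothing was ever saved.
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)


def reset_after_restart(store, current):
    """Return the parameters to apply after a restart.

    Saved original values override the current ones; None means that
    nothing was saved and therefore nothing can be restored.
    """
    original = store.load_original()
    if original is None:
        return None
    restored = dict(current)
    restored.update(original)
    return restored
```

In this model, the bug corresponds to keeping the original values only in memory: a new MigrationParamsStore created after the "restart" would find no file, and reset_after_restart would have nothing to restore.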

Comment 5 Jiri Denemark 2022-07-26 13:17:24 UTC
Backported: https://gitlab.com/redhat/rhel/src/libvirt/-/merge_requests/40

Comment 6 Fangge Jin 2022-08-01 08:33:23 UTC
Versions:
libvirt-client-8.5.0-4.el9.x86_64
qemu-kvm-7.0.0-9.el9.x86_64

Test matrix:
p2p, kill src virtqemud, pass
p2p, kill dest virtqemud, pass
p2p, kill dest virtproxyd, fail
non-p2p, kill src virtqemud, pass
non-p2p, kill dest virtqemud, pass
non-p2p, kill dest virtproxyd, pass

Comment 12 Fangge Jin 2022-08-11 05:59:44 UTC
Verified with
libvirt-8.5.0-5.el9.x86_64
qemu-kvm-7.0.0-10.el9.x86_64


Test matrix:
p2p, kill src virtqemud, pass
p2p, kill dest virtqemud, pass
p2p, kill dest virtproxyd, fail (Bug 2114866)
non-p2p, kill src virtqemud, pass
non-p2p, kill dest virtqemud, pass
non-p2p, kill dest virtproxyd, pass

Comment 14 errata-xmlrpc 2022-11-15 10:04:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: libvirt security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8003

