Bug 1731038
Summary: | Guest on src host gets stuck after executing migrate_cancel during RDMA migration | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Li Xiaohui <xiaohli> |
Component: | qemu-kvm | Assignee: | Dr. David Alan Gilbert <dgilbert> |
qemu-kvm sub component: | Live Migration | QA Contact: | Li Xiaohui <xiaohli> |
Status: | CLOSED CURRENTRELEASE | Priority: | high |
Severity: | unspecified | Keywords: | Triaged |
Version: | 8.1 | Target Milestone: | rc |
Hardware: | Unspecified | OS: | Unspecified |
Last Closed: | 2021-03-15 07:37:36 UTC | Type: | Bug |
CC: | chayang, fjin, jinzhao, juzhang, lvivier, peterx, quintela, virt-maint, xianwang | Bug Blocks: | 1758964, 1771318, 1897025 |
Description
Li Xiaohui
2019-07-18 07:45:10 UTC
Hi all,

I also tested this case on the RHEL 8.1.0 fast train with the following guests: win10 (q35+seabios), win8-32 (pc+seabios), rhel8.1.0 (q35+seabios), rhel7.7 (pc+seabios), and rhel8.0.1 (q35+ovmf).

1. The rhel8.1.0 and win10 guests hit the same issue as in comment 0 above.
2. The rhel8.0.1, rhel7.7, and win8-32 guests print the output below after migrate_cancel. I think the message may not be right (ibv_poll_cq wc.status=13 RNR retry counter exceeded!...). What do you think?

(1) On the src host qemu:
(qemu) migrate rdma:192.168.0.21:4444
source_resolve_host RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
qemu-kvm: Early error. Sending error.
ibv_poll_cq wc.status=13 RNR retry counter exceeded!
ibv_poll_cq wrid=CONTROL SEND!
qemu-kvm: rdma migration: send polling control error
(qemu) info status
VM status: running
(qemu) info migr
migrate migrate_cache_size migrate_capabilities migrate_parameters
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off x-ignore-shared: off
Migration status: cancelled
total time: 0 milliseconds

(2) On the dst host qemu:
(qemu) info status
VM status: paused (inmigrate)
(qemu) dest_init RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
qemu-kvm: receive cm event, cm event is 10
qemu-kvm: rdma migration: send polling control error
qemu-kvm: Failed to send control buffer!
qemu-kvm: load of migration failed: Input/output error
qemu-kvm: Early error. Sending error.
qemu-kvm: rdma migration: send polling control error

What's more, I tested this case on the RHEL 8.1.0 slow train with win10 (q35+seabios) and rhel8.1.0 (pc+seabios) guests. The guest runs normally on the src host after migrate_cancel, and the output is correct on both the src and dst qemu:

(1) On the src host qemu:
(qemu) migrate rdma:192.168.0.21:4444
source_resolve_host RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
qemu-kvm: migration_iteration_finish: Unknown ending state 2
qemu-kvm: Early error. Sending error.
(qemu) info status
VM status: running
(qemu) info migr
migrate migrate_cache_size migrate_capabilities migrate_parameters
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: on auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off late-block-activate: off
Migration status: cancelled
total time: 0 milliseconds

(2) On the dst host qemu:
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) dest_init RDMA Device opened: kernel name mlx4_0 uverbs device name uverbs0, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs0, infiniband class device path /sys/class/infiniband/mlx4_0, transport: (1) Infiniband
qemu-kvm: Was expecting a QEMU FILE (3) control message, but got: ERROR (1), length: 0
qemu-kvm: load of migration failed: Input/output error

QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General".
Please review and change the sub-component if necessary the next time you review this BZ. Thanks

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
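
For anyone reading the logs in this report, the two numeric codes (wc.status=13 on the source, cm event 10 on the destination) can be decoded with a small helper. This is only a sketch: it assumes the enum ordering of `enum ibv_wc_status` in libibverbs and `enum rdma_cm_event_type` in librdmacm, and the function name `decode_log_line` is hypothetical, not part of QEMU or any library.

```python
import re

# Assumption: ordering follows enum ibv_wc_status in libibverbs (verbs.h).
IBV_WC_STATUS = [
    "IBV_WC_SUCCESS", "IBV_WC_LOC_LEN_ERR", "IBV_WC_LOC_QP_OP_ERR",
    "IBV_WC_LOC_EEC_OP_ERR", "IBV_WC_LOC_PROT_ERR", "IBV_WC_WR_FLUSH_ERR",
    "IBV_WC_MW_BIND_ERR", "IBV_WC_BAD_RESP_ERR", "IBV_WC_LOC_ACCESS_ERR",
    "IBV_WC_REM_INV_REQ_ERR", "IBV_WC_REM_ACCESS_ERR", "IBV_WC_REM_OP_ERR",
    "IBV_WC_RETRY_EXC_ERR", "IBV_WC_RNR_RETRY_EXC_ERR",
]

# Assumption: ordering follows enum rdma_cm_event_type in librdmacm (rdma_cma.h).
RDMA_CM_EVENT = [
    "RDMA_CM_EVENT_ADDR_RESOLVED", "RDMA_CM_EVENT_ADDR_ERROR",
    "RDMA_CM_EVENT_ROUTE_RESOLVED", "RDMA_CM_EVENT_ROUTE_ERROR",
    "RDMA_CM_EVENT_CONNECT_REQUEST", "RDMA_CM_EVENT_CONNECT_RESPONSE",
    "RDMA_CM_EVENT_CONNECT_ERROR", "RDMA_CM_EVENT_UNREACHABLE",
    "RDMA_CM_EVENT_REJECTED", "RDMA_CM_EVENT_ESTABLISHED",
    "RDMA_CM_EVENT_DISCONNECTED",
]

# Log patterns copied from the qemu-kvm output quoted in this report.
PATTERNS = [
    (re.compile(r"wc\.status=(\d+)"), IBV_WC_STATUS),
    (re.compile(r"cm event is (\d+)"), RDMA_CM_EVENT),
]


def decode_log_line(line):
    """Return the symbolic name for a numeric code in a log line, or None."""
    for pattern, names in PATTERNS:
        match = pattern.search(line)
        if match:
            code = int(match.group(1))
            # Codes beyond the partial tables above are reported as unknown.
            return names[code] if code < len(names) else f"unknown({code})"
    return None


if __name__ == "__main__":
    for log_line in (
        "ibv_poll_cq wc.status=13 RNR retry counter exceeded!",
        "qemu-kvm: receive cm event, cm event is 10",
    ):
        print(log_line, "->", decode_log_line(log_line))
```

Under these assumptions, the source-side error is a receiver-not-ready retry exhaustion and the destination-side event is a disconnect.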