Bug 1525899
Summary: | Migrate to an error destination ip ->"migrate_cancel"->info migrate, there will be segmentation fault | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | xianwang <xianwang> |
Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Yumei Huang <yuhuang> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.5 | CC: | chayang, dgilbert, jinzhao, juzhang, knoel, michen, qzhang, virt-maint, xianwang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Last Closed: | 2018-05-15 08:34:54 UTC | Type: | Bug |
Bug Blocks: | 1558351 |
Description
xianwang
2017-12-14 10:41:34 UTC
Confirmed (and with upstream 2.11); the crucial thing is that the IP address doesn't reject the connection, but just hangs during the connect. Status is 'cancelling'.

This bug is not a regression; it also exists on qemu-kvm-rhev-2.9.0-16.el7_4.1, although the result on qemu-kvm-rhev-2.9.0-16.el7_4.1.ppc64le is not exactly the same as on qemu-kvm-rhev-2.10.0-12.el7.ppc64le. The result is as Dave said in comment 2: the migration status stays at "cancelling", as follows.

Version:
Host:
    3.10.0-693.el7.ppc64le
    qemu-kvm-rhev-2.9.0-16.el7_4.1.ppc64le
    SLOF-20170724-2.git89f519f.el7.noarch

Steps are the same as in the bug report.

Result:
    (qemu) migrate -d tcp:10.16.110.120:5801
    (qemu) info migrate
    capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off
    Migration status: setup
    total time: 0 milliseconds
    (qemu) migrate_cancel
    (qemu) info migrate
    capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off
    Migration status: cancelling
    (qemu) info migrate
    Migration status: cancelling
    .........
    endless "cancelling"

Yep, there are actually two bugs:
a) The segfault, for which I've just posted upstream:
    migration: Guard ram_bytes_remaining against early call
b) The endless cancelling, which I've got an idea how to fix - it's related to the error path through the socket code.

Posted upstream fixes for (b):
    [PATCH 1/2] migration: Allow migrate_fd_connect to take an Error *
    [PATCH 2/2] migration: Route errors down through

a) just got merged upstream as bae416e5ba65701d3c5238164517158066d615e5

bumped to 7.6

b) got merged upstream as:
    688a3dcba980bf01344a
    cce8040bb0ea6ff56d88

Also needs:
    migration: Fix early failure cleanup
posted 2018-02-12, and should include the test that was included with it:
    tests/migration: Add test for migration to bad destination

Also needs:
    Migration+TLS: Fix crash due to double cleanup
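To make the two failure modes above easier to follow, here is a toy Python model of the sequence "migrate to a hanging address, migrate_cancel, info migrate". It is only a sketch and not QEMU code: the class `ToyMigration`, the `RamState` stand-in, the method names, and the 4096-byte page size are all hypothetical. It assumes that the RAM migration state is not allocated until the outgoing connect completes, so a status query made before that point must be guarded (bug a), and that the connect error has to reach the migration state machine for "cancelling" to ever finish (bug b).

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # hypothetical page size, chosen only for this toy model


@dataclass
class RamState:
    """Stand-in for RAM migration state that exists only once migration is running."""
    dirty_pages: int = 0


class ToyMigration:
    """Toy state machine; names and transitions are illustrative assumptions."""

    def __init__(self) -> None:
        self.status = "none"
        self.ram_state: Optional[RamState] = None  # not allocated yet

    def start(self) -> None:
        # The connect has been issued but has not completed (the hanging-IP case).
        self.status = "setup"

    def cancel(self) -> None:
        if self.status in ("setup", "active"):
            self.status = "cancelling"

    def connect_done(self, error: Optional[str] = None) -> None:
        # Completion of the connect attempt. If the error is dropped instead of
        # being routed here, the status would stay "cancelling" forever (bug b).
        if error is not None:
            self.status = "cancelled" if self.status == "cancelling" else "failed"
            return
        self.ram_state = RamState(dirty_pages=1024)
        self.status = "active"

    def ram_bytes_remaining_unguarded(self) -> int:
        # Mirrors bug (a): dereferences state that may not exist yet.
        return self.ram_state.dirty_pages * PAGE_SIZE

    def ram_bytes_remaining(self) -> int:
        # Guarded variant: report 0 when called before the state is allocated.
        if self.ram_state is None:
            return 0
        return self.ram_state.dirty_pages * PAGE_SIZE


if __name__ == "__main__":
    m = ToyMigration()
    m.start()
    m.cancel()
    print(m.status, m.ram_bytes_remaining())
    m.connect_done(error="connect timed out")
    print(m.status)
```

In this model the unguarded query raises an exception where the C code would crash, while the guarded query simply reports zero bytes remaining until the state exists.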