Bug 2054597
Summary: | Do operation to disk will hang in the guest of target host after hotplugging and migrating | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Meina Li <meili> | |
Component: | qemu-kvm | Assignee: | Igor Mammedov <imammedo> | |
qemu-kvm sub component: | Live Migration | QA Contact: | Li Xiaohui <xiaohli> | |
Status: | CLOSED ERRATA | Docs Contact: | Jiri Herrmann <jherrman> | |
Severity: | high | |||
Priority: | unspecified | CC: | ailan, chayang, coli, dgilbert, fjin, imammedo, jherrman, jinzhao, jmaloy, juzhang, lcheng, lmiksik, mdean, virt-maint, xiaohli, yfu | |
Version: | 8.6 | Keywords: | Regression, Triaged | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | qemu-kvm-6.2.0-9.module+el8.6.0+14480+c0a3aa0f | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2062610 | Environment: | |
Last Closed: | 2022-05-10 13:25:26 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | qemu-7.0 | |
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 2062610 |
Description
Meina Li
2022-02-15 10:09:32 UTC
Recently I also filed two bugs about hotplug + migration on qemu-kvm-6.2.0-7.el9.x86_64; they may share the same root cause, see below.
Bug 2053526 - Guest hit call trace during reboot after hotplug vdisks + migration
Bug 2053584 - watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [cat:2843]
Anyway, I will track these bugs later.

Hi Igor,
Could you check whether this bug for RHEL 8 is the same as the bug below for RHEL 9?
Bug 2053584 - watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [cat:2843]

(In reply to Li Xiaohui from comment #2)
> Hi Igor,
> Could you check whether this bug for RHEL 8 is the same as the bug below for RHEL 9?
> Bug 2053584 - watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [cat:2843]

Can you please provide the output of the HMP command 'info pci' from both the source (after hotplugging) and the destination?

Reproduced the bug on RHEL 8.6.0 (kernel-4.18.0-369.el8.x86_64 & qemu-kvm-6.2.0-8.module+el8.6.0+14324+050a5215.x86_64). It should be the same issue as bug 2053584, because:
1. A diff of the PCI info on the src host (after hotplug) and on the dst host (after migration) shows:
< BAR1: 32 bit memory at 0xfd600000 [0xfd600fff].
< BAR4: 64 bit prefetchable memory at 0xfb200000 [0xfb203fff].
---
> BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].
> BAR4: 64 bit prefetchable memory at 0xffffffffffffffff [0x00003ffe].
2. The guest hangs when operating on the hotplugged vdisk after migration on the dst host.
3. The guest hits a call trace during reboot, but finally starts successfully. The call trace looks like:
*********************
entry_SYSCALL_64_after_hwframe+0x65/0xca
*********************

Hi Igor, could we get exception+ for this bug and fix it in RHEL 8.6.0?

(In reply to Li Xiaohui from comment #5)
> Hi Igor, could we get exception+ for this bug and fix it in RHEL 8.6.0?

done

Justification for exception: the bug is a regression and breaks migration with the latest machine type.

Hi Igor, I tried the scratch build on hosts (kernel-4.18.0-369.el8.x86_64 & qemu-img-6.2.0-7.el8.imammedo202203080816.x86_64). Apart from some existing bugs, everything works well. The build should fix this bug.

Tested the following cases; the four ERROR cases are due to existing bugs Bug 2043545 & Bug 2028337:
--> Running case(1/7): RHEL7-96931-[migration] Migration after hot-plug virtio-serial (3 min 20 sec)--- PASS.
--> Running case(2/7): RHEL7-10039-[migration] Do migration after hot plug vdisk (3 min 32 sec)--- PASS.
--> Running case(3/7): RHEL7-10040-[migration] Do migration after hot remove vdisk (5 min 24 sec)--- PASS.
--> Running case(4/7): RHEL7-10078-[migration] Migrate guest after hot plug/unplug memory balloon device (5 min 16 sec)--- ERROR.
--> Running case(5/7): RHEL7-10079-[migration] Migrate guest after cpu hotplug/hotunplug in guest (RHEL only) (7 min 0 sec)--- ERROR.
--> Running case(6/7): RHEL7-10047-[migration] Ping-pong live migration with large vcpu and memory values of guest (6 min 0 sec)--- ERROR.
--> Running case(7/7): RHEL-178709-[migration] Basic migration test (3 min 28 sec)--- ERROR.

BTW, I also repeated RHEL7-96931 & RHEL7-10039 above 10 times, checking the PCI info on the source (after hotplugging) and on the destination host (after migration). They all work well, with no difference in the PCI info.

(In reply to Li Xiaohui from comment #13)
> Hi Igor, I tried the scratch build on hosts (kernel-4.18.0-369.el8.x86_64 &
> qemu-img-6.2.0-7.el8.imammedo202203080816.x86_64). Apart from some existing
> bugs, everything works well. The build should fix this bug.
>
> Tested the following cases; the four ERROR cases are due to existing bugs
> Bug 2043545 & Bug 2028337:
> --> Running case(1/7): RHEL7-96931-[migration] Migration after hot-plug
> virtio-serial (3 min 20 sec)--- PASS.
> --> Running case(2/7): RHEL7-10039-[migration] Do migration after hot plug
> vdisk (3 min 32 sec)--- PASS.
> --> Running case(3/7): RHEL7-10040-[migration] Do migration after hot remove
> vdisk (5 min 24 sec)--- PASS.
> --> Running case(4/7): RHEL7-10078-[migration] Migrate guest after hot
> plug/unplug memory balloon device (5 min 16 sec)--- ERROR.
> --> Running case(5/7): RHEL7-10079-[migration] Migrate guest after cpu
> hotplug/hotunplug in guest (RHEL only) (7 min 0 sec)--- ERROR.
> --> Running case(6/7): RHEL7-10047-[migration] Ping-pong live migration with
> large vcpu and memory values of guest (6 min 0 sec)--- ERROR.
> --> Running case(7/7): RHEL-178709-[migration] Basic migration test (3 min
> 28 sec)--- ERROR.

these are not relevant for this BZ

> BTW, I also repeated RHEL7-96931 & RHEL7-10039 above 10 times, checking the
> PCI info on the source (after hotplugging) and on the destination host
> (after migration). They all work well, with no difference in the PCI info.

please, also test RHEL9.0 Bug 2053584 and report results there so PMs could decide on granting an exception.

(In reply to Igor Mammedov from comment #14)
> > BTW, I also repeated RHEL7-96931 & RHEL7-10039 above 10 times, checking the
> > PCI info on the source (after hotplugging) and on the destination host
> > (after migration). They all work well, with no difference in the PCI info.
>
> please, also test RHEL9.0 Bug 2053584 and report results there
> so PMs could decide on granting an exception.

The test also passes on RHEL9.0.0. I have added the test results in https://bugzilla.redhat.com/show_bug.cgi?id=2053584#c15, please check.

QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

Verified the bug on qemu-kvm-6.2.0-9.module+el8.6.0+14480+c0a3aa0f.x86_64 with the same test steps as Comment 13; the test passes. So marking this bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:1759
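The 'info pci' comparison used in this report (source after hotplugging versus destination after migration) could be scripted roughly as follows. This is only a sketch under stated assumptions: the regular expression is modelled on the two BAR lines quoted in the reproduction comment, the function names `parse_bars` and `diff_bars` are hypothetical, BARs are paired by order of appearance (which assumes both hosts list the same devices in the same order), and treating the all-ones address seen on the destination as "not mapped" is an interpretation rather than something the report states.

```python
import re

# Modelled on the BAR lines quoted in this report, e.g.
#   BAR1: 32 bit memory at 0xfd600000 [0xfd600fff].
# Other 'info pci' line shapes are not handled (assumption).
BAR_RE = re.compile(
    r"BAR(?P<index>\d+):\s+(?P<kind>.*?)\s+at\s+"
    r"(?P<start>0x[0-9a-fA-F]+)\s+\[(?P<end>0x[0-9a-fA-F]+)\]"
)

# All-ones address, as seen on the destination in this report.
# Reading it as "BAR not mapped" is an assumption of this sketch.
ALL_ONES = 0xFFFFFFFFFFFFFFFF


def parse_bars(text):
    """Return [(bar_index, kind, start_address), ...] in order of appearance."""
    return [
        (int(m.group("index")), m.group("kind"), int(m.group("start"), 16))
        for m in BAR_RE.finditer(text)
    ]


def diff_bars(src_text, dst_text):
    """Pair BARs by position and report those whose entries differ."""
    mismatches = []
    for src, dst in zip(parse_bars(src_text), parse_bars(dst_text)):
        if src != dst:
            mismatches.append(
                {
                    "bar": src[0],
                    "kind": src[1],
                    "src": src[2],
                    "dst": dst[2],
                    "dst_all_ones": dst[2] == ALL_ONES,
                }
            )
    return mismatches


if __name__ == "__main__":
    # BAR lines copied from the diff in the reproduction comment.
    src_info = (
        "BAR1: 32 bit memory at 0xfd600000 [0xfd600fff].\n"
        "BAR4: 64 bit prefetchable memory at 0xfb200000 [0xfb203fff].\n"
    )
    dst_info = (
        "BAR1: 32 bit memory at 0xffffffffffffffff [0x00000ffe].\n"
        "BAR4: 64 bit prefetchable memory at 0xffffffffffffffff [0x00003ffe].\n"
    )
    for item in diff_bars(src_info, dst_info):
        print(f"BAR{item['bar']}: src={item['src']:#x} dst={item['dst']:#x}")
```

In practice the two inputs would be the saved 'info pci' output from each host; an empty result corresponds to the "no difference about pci info" outcome reported for the fixed build.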