Bug 1626059
| Summary: | RHEL6 guest panics on boot if hotpluggable memory (pc-dimm) is present at boot time | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Igor Mammedov <imammedo> | ||||
| Component: | qemu-kvm-rhev | Assignee: | Igor Mammedov <imammedo> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Yumei Huang <yuhuang> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | 7.6 | CC: | chayang, hhuang, imammedo, jinzhao, juzhang, michen, mkalinin, mtessun, virt-maint, yuhuang | ||||
| Target Milestone: | rc | Keywords: | Regression | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qemu-kvm-rhev-2.12.0-16.el7 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 1635625 (view as bug list) | Environment: | |||||
| Last Closed: | 2018-11-01 11:13:32 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1635625 | ||||||
| Attachments: |
Description
Igor Mammedov
2018-09-06 13:33:01 UTC
Fix included in qemu-kvm-rhev-2.12.0-16.el7

The issue is gone if boot rhel6.10 guest with the cli in comment 0 on qemu-kvm-rhev-2.12.0-16.el7.

But QE still can hit the panic issue with following steps:

1. Boot guest in pause status with 3 nodes and 2 pc-dimms(assigned to node 1 and node 0), please see the cli[1].

2. Hotplug pc-dimm to node 2
(qemu) object_add memory-backend-ram,id=mem2,host-nodes=0,policy=bind,size=1G
(qemu) device_add pc-dimm,id=dimm2,memdev=mem2,node=2

3. Resume guest, hit call trace[2], guest panic

[1] QEMU cli:
/usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel610-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=0x4 \
    -device virtio-net-pci,mac=9a:11:12:13:14:15,id=idHcQEHf,vectors=4,netdev=idayU70J,bus=pci.0,addr=0x5 \
    -netdev tap,id=idayU70J,vhost=on \
    -m 4096,slots=16,maxmem=32G \
    -object memory-backend-ram,policy=bind,host-nodes=0,size=1G,id=mem-mem1 \
    -device pc-dimm,node=1,id=dimm-mem1,memdev=mem-mem1 \
    -object memory-backend-ram,policy=bind,host-nodes=0,size=1G,id=mem-mem2 \
    -device pc-dimm,node=0,id=dimm-mem2,memdev=mem-mem2 \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -numa node,nodeid=0 \
    -numa node,nodeid=1 \
    -numa node,nodeid=2 \
    -cpu 'Opteron_G3',+kvm_pv_unhalt \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,strict=off,order=cdn,once=c \
    -enable-kvm -monitor stdio -serial tcp:0:4444,server,nowait

[2] Call trace info:

Kernel panic - not syncing: Fatal exception
Pid: 1182, comm: udisks-part-id Tainted: G D W -- ------------ 2.6.32-754.6.2.el6.x86_64 #1
Call Trace:
[<ffffffff8155856a>] ? panic+0xa7/0x18b
[<ffffffff8155e304>] ? oops_end+0xe4/0x100
[<ffffffff8100f95b>] ? die+0x5b/0x90
[<ffffffff8155ddc2>] ? do_general_protection+0x152/0x160
[<ffffffff8155d235>] ? general_protection+0x25/0x30
[<ffffffff812b84db>] ? list_del+0x1b/0xa0
[<ffffffff8113f7d3>] ? __rmqueue+0xc3/0x4a0
[<ffffffff8114a891>] ? lru_cache_add_lru+0x21/0x40
[<ffffffff811418e0>] ? get_page_from_freelist+0x590/0x870
[<ffffffff811431d9>] ? __alloc_pages_nodemask+0x129/0x960
[<ffffffff811b1522>] ? do_lookup+0xa2/0x230
[<ffffffff811b01d0>] ? path_to_nameidata+0x20/0x60
[<ffffffff8117ed8a>] ? alloc_pages_vma+0x9a/0x150
[<ffffffff8115ee4a>] ? do_wp_page+0x11a/0xa40
[<ffffffff8115c209>] ? __do_fault+0x459/0x540
[<ffffffff8115fa4d>] ? handle_pte_fault+0x2dd/0xc80
[<ffffffff812a9276>] ? prio_tree_insert+0x256/0x2b0
[<ffffffff81154e90>] ? vma_prio_tree_insert+0x30/0x60
[<ffffffff81163c3c>] ? __vma_link_file+0x4c/0x80
[<ffffffff811606f6>] ? handle_mm_fault+0x306/0x450
[<ffffffff81054db1>] ? __do_page_fault+0x141/0x500
[<ffffffff81166de5>] ? do_mmap_pgoff+0x335/0x380
[<ffffffff81155339>] ? sys_mmap_pgoff+0x199/0x340
[<ffffffff8156029e>] ? do_page_fault+0x3e/0xa0
[<ffffffff8155d265>] ? page_fault+0x25/0x30

(In reply to Yumei Huang from comment #8)
> The issue is gone if boot rhel6.10 guest with the cli in comment 0 on
> qemu-kvm-rhev-2.12.0-16.el7.
>
> But QE still can hit the panic issue with following steps:
backtrace indicates it's not related issue,
can you reproduce it with 2.10 version?

I suggest to open a bug for it.

[...]
> [2] Call trace info:
>
> Kernel panic - not syncing: Fatal exception
> Pid: 1182, comm: udisks-part-id Tainted: G D W -- ------------
> 2.6.32-754.6.2.el6.x86_64 #1
> Call Trace:
> [<ffffffff8155856a>] ? panic+0xa7/0x18b
> [<ffffffff8155e304>] ? oops_end+0xe4/0x100
> [<ffffffff8100f95b>] ? die+0x5b/0x90
> [<ffffffff8155ddc2>] ? do_general_protection+0x152/0x160
> [<ffffffff8155d235>] ? general_protection+0x25/0x30
> [<ffffffff812b84db>] ? list_del+0x1b/0xa0
> [<ffffffff8113f7d3>] ? __rmqueue+0xc3/0x4a0
> [<ffffffff8114a891>] ? lru_cache_add_lru+0x21/0x40
> [<ffffffff811418e0>] ? get_page_from_freelist+0x590/0x870
> [<ffffffff811431d9>] ? __alloc_pages_nodemask+0x129/0x960
> [<ffffffff811b1522>] ? do_lookup+0xa2/0x230
> [<ffffffff811b01d0>] ? path_to_nameidata+0x20/0x60
> [<ffffffff8117ed8a>] ? alloc_pages_vma+0x9a/0x150
> [<ffffffff8115ee4a>] ? do_wp_page+0x11a/0xa40
> [<ffffffff8115c209>] ? __do_fault+0x459/0x540
> [<ffffffff8115fa4d>] ? handle_pte_fault+0x2dd/0xc80
> [<ffffffff812a9276>] ? prio_tree_insert+0x256/0x2b0
> [<ffffffff81154e90>] ? vma_prio_tree_insert+0x30/0x60
> [<ffffffff81163c3c>] ? __vma_link_file+0x4c/0x80
> [<ffffffff811606f6>] ? handle_mm_fault+0x306/0x450
> [<ffffffff81054db1>] ? __do_page_fault+0x141/0x500
> [<ffffffff81166de5>] ? do_mmap_pgoff+0x335/0x380
> [<ffffffff81155339>] ? sys_mmap_pgoff+0x199/0x340
> [<ffffffff8156029e>] ? do_page_fault+0x3e/0xa0
> [<ffffffff8155d265>] ? page_fault+0x25/0x30

(In reply to Igor Mammedov from comment #9)
> (In reply to Yumei Huang from comment #8)
> > The issue is gone if boot rhel6.10 guest with the cli in comment 0 on
> > qemu-kvm-rhev-2.12.0-16.el7.
> >
> > But QE still can hit the panic issue with following steps:
> backtrace indicates it's not related issue,
> can you reproduce it with 2.10 version?
>
> I suggest to open a bug for it.
>

Yes, it's reproducible with qemu-kvm-rhev-2.10.0-21.el7. A new bug[1] has been filed.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1630850

Verify:
qemu-kvm-rhev-2.12.0-16.el7
Guest: RHEL6.10, RHEL7.6, Win2008sp2, Win2008r2, Win2012, Win2012r2, Win2016

QE did the same test as bug 1609234(in comment 7&10), only RHEL6.10 guest failed 3 cases due to bug 1630850, other guests pass all the tests.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443
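
When deciding whether two panics are the same issue, as was done for the call trace in comment 8, it can help to reduce a trace to its symbol names and find the first frame outside the exception path. The following is only a rough sketch: the helper names, the regular expression, and the `EXCEPTION_PATH` set are assumptions made for illustration, not part of any kernel or QEMU tooling, and the sample is the first frames of the trace quoted in this bug.

```python
import re

# One frame looks like "[<ffffffff812b84db>] ? list_del+0x1b/0xa0".
# The "?" marker is optional; the pattern is an assumption based on
# the trace format shown in this report.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Hypothetical set of frames that only report the fault and say
# nothing about where it happened.
EXCEPTION_PATH = {
    "panic", "oops_end", "die",
    "do_general_protection", "general_protection",
}


def frame_symbols(trace):
    """Return the symbol names of all frames, in trace order."""
    return [m.group("sym") for m in FRAME_RE.finditer(trace)]


def first_interesting_frame(trace):
    """Return the first symbol not in EXCEPTION_PATH, or None."""
    for sym in frame_symbols(trace):
        if sym not in EXCEPTION_PATH:
            return sym
    return None


SAMPLE = """
[<ffffffff8155856a>] ? panic+0xa7/0x18b
[<ffffffff8155e304>] ? oops_end+0xe4/0x100
[<ffffffff8100f95b>] ? die+0x5b/0x90
[<ffffffff8155ddc2>] ? do_general_protection+0x152/0x160
[<ffffffff8155d235>] ? general_protection+0x25/0x30
[<ffffffff812b84db>] ? list_del+0x1b/0xa0
[<ffffffff8113f7d3>] ? __rmqueue+0xc3/0x4a0
"""

if __name__ == "__main__":
    print(frame_symbols(SAMPLE))
    print(first_interesting_frame(SAMPLE))
```

Comparing the output of `first_interesting_frame` for two traces is only a coarse hint; frames marked with `?` are not guaranteed to be part of the real call chain, so the full trace should still be read before concluding that two panics are related.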