Bug 1458705
| Summary: | pvdump: QMP reports "GUEST_PANICKED" event but HMP still shows VM running after guest crashed | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | yilzhang |
| Component: | qemu-kvm-rhev | Assignee: | David Gibson <dgibson> |
| Status: | CLOSED ERRATA | QA Contact: | yilzhang |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.4 | CC: | hachen, knoel, lmiksik, mrezanin, mtessun, qzhang, virt-maint, xuma, yilzhang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | ppc64le | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-rhev-2.9.0-12.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-02 04:41:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Hi, Yilin

Is this a Power-specific bug?

X86 doesn't have this issue.

(In reply to hachen from comment #3)
> X86 doesn't have this issue.

Thanks Haotong for confirmation.

(In reply to yilzhang from comment #0)
> Description of problem:
> When testing pvdump, after I stop kdump service and trigger a crash inside
> guest, QMP reports "GUEST_PANICKED" event but HMP still shows VM "running".
>
> Version-Release number of selected component (if applicable):
> HOST:
>   kernel: 3.10.0-675.el7.ppc64le
>   qemu: qemu-kvm-rhev-2.9.0-7.el7.ppc64le
>   SLOF: SLOF-20170303-4.git66d250e.el7.noarch
> GUEST: kernel-3.10.0-675.el7.ppc64le
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. Boot up guest
> 2. Connect QMP
>    # telnet $HostIP 9990
>    { "execute": "qmp_capabilities" }
> 3. Check the HMP monitor status
>    (qemu) info status
> 4. Stop kdump, and trigger crash in guest
>    # service kdump stop
>    # echo c >/proc/sysrq-trigger
> 5. Check guest status with HMP and QMP
>
> Actual results:
> HMP:
>   (qemu) info status
>   VM status: running
> QMP:
>   {"timestamp": {"seconds": 1496653822, "microseconds": 102423}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}
>
> Expected results:
> HMP:
>   (qemu) info status
>   VM status: **paused (guest-panicked)**
> QMP:
>   {"timestamp": {"seconds": 1496653822, "microseconds": 102423}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}
>
> Additional info:
> Qemu command line used to boot up guest:
> /usr/libexec/qemu-kvm \
>  -name yilzhang_vm \
>  -smp 8,maxcpus=20,sockets=2,cores=2,threads=4 \
>  -m 8192 \
>  -serial unix:/tmp/ttyS0,server,nowait \
>  -no-shutdown \
>  -rtc base=localtime,clock=host \
>  -boot menu=on \
>  -monitor stdio \
>  -vnc 0:90 \
>  -qmp tcp:0:9990,server,nowait \
>  -device usb-tablet,id=usb-table0 \
>  -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:84 \
>  -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
>  -device virtio-scsi-pci,id=scsi0 \
>  -drive file=rhel.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
>  -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \

I've reproduced this with both the package and upstream. Indeed, it appears that although qemu detects and reports the panic, it doesn't actually pause the VM. Since this is mostly handled in generic code, I'm not quite sure how we get a Power-specific bug here, but I'm investigating.

Ok, I've located the problem and have written an upstream patch to post shortly.

Upstream patch is posted. I don't believe this is a regression, which means it may be too late to look at a backport to RHEL 7.4. We might have to wait until 7.5 (in which case we should get it via rebase).

Scratch build incorporating this fix completed at:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13420539

Hello, Martin

Could you provide us a green light for the pm_ack? Thanks.

(In reply to Qunfang Zhang from comment #10)
> Hello, Martin
>
> Could you provide us a green light for the pm_ack? Thanks.

Sorry, I mean the "blocker+" flag since pm_ack+ is already set.

Fix included in qemu-kvm-rhev-2.9.0-12.el7

This bug has been verified on PPC platform
********************* Bug reproduced on PPC platform: *********************
Host: kernel: 3.10.0-681.el7.ppc64le
qemu-kvm-rhev-2.9.0-10.el7.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch
Guest: 3.10.0-681.el7.ppc64le
Steps to Reproduce: same as in the original bug report
Actual results:
HMP: (qemu) info status
VM status: running
QMP: {"timestamp": {"seconds": 1497947877, "microseconds": 224213}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}
********************* Bug verified on PPC platform *********************
This bug is verified on the following version:
Host: kernel: 3.10.0-681.el7.ppc64le
qemu-kvm-rhev-2.9.0-12.el7.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch.rpm
Guest: 3.10.0-681.el7.ppc64le
Steps: same as in the original bug report
Actual results:
HMP: (qemu) info status
VM status: paused (guest-panicked)
QMP: {"timestamp": {"seconds": 1497951537, "microseconds": 652401}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}
The result is as expected, so this bug is fixed in qemu-kvm-rhev-2.9.0-12.el7.ppc64le.
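For repeated verification runs, the comparison between the QMP event and the HMP output can be scripted. The following is a minimal sketch and was not part of the original test procedure: the function names `hmp_vm_status` and `panic_was_applied` are hypothetical, and it only assumes the two output formats shown in this report (a `VM status:` line from `info status`, and a `GUEST_PANICKED` event whose action is `pause`).

```python
import json
import re


def hmp_vm_status(hmp_output):
    """Extract the text after 'VM status:' from HMP 'info status' output."""
    match = re.search(r"VM status:\s*(.+)", hmp_output)
    if match is None:
        raise ValueError("no 'VM status:' line found")
    return match.group(1).strip()


def panic_was_applied(qmp_event_line, hmp_output):
    """Return True if HMP reflects the pause announced by GUEST_PANICKED.

    Only the 'pause' action seen in this report is handled; any other
    action is rejected rather than guessed at.
    """
    event = json.loads(qmp_event_line)
    if event.get("event") != "GUEST_PANICKED":
        raise ValueError("not a GUEST_PANICKED event")
    action = event.get("data", {}).get("action")
    if action != "pause":
        raise ValueError("unhandled panic action: %r" % action)
    return hmp_vm_status(hmp_output) == "paused (guest-panicked)"
```

With the outputs pasted above, the qemu-kvm-rhev-2.9.0-10.el7 run yields False (the event says `pause` but HMP says `running`) and the qemu-kvm-rhev-2.9.0-12.el7 run yields True.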
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2392
Description of problem:
When testing pvdump, after I stop kdump service and trigger a crash inside guest, QMP reports "GUEST_PANICKED" event but HMP still shows VM "running".

Version-Release number of selected component (if applicable):
HOST:
  kernel: 3.10.0-675.el7.ppc64le
  qemu: qemu-kvm-rhev-2.9.0-7.el7.ppc64le
  SLOF: SLOF-20170303-4.git66d250e.el7.noarch
GUEST: kernel-3.10.0-675.el7.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot up guest
2. Connect QMP
   # telnet $HostIP 9990
3. Check the HMP monitor status
   (qemu) info status
4. Stop kdump, and trigger crash in guest
   # service kdump stop
   # echo c >/proc/sysrq-trigger
5. Check guest status with HMP and QMP

Actual results:
HMP:
  (qemu) info status
  VM status: running
QMP:
  {"timestamp": {"seconds": 1496653822, "microseconds": 102423}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}

Expected results:
HMP:
  (qemu) info status
  VM status: **paused (guest-panicked)**
QMP:
  {"timestamp": {"seconds": 1496653822, "microseconds": 102423}, "event": "GUEST_PANICKED", "data": {"action": "pause"}}

Additional info:
Qemu command line used to boot up guest:
/usr/libexec/qemu-kvm \
 -name yilzhang_vm \
 -smp 8,maxcpus=20,sockets=2,cores=2,threads=4 \
 -m 8192 \
 -serial unix:/tmp/ttyS0,server,nowait \
 -no-shutdown \
 -rtc base=localtime,clock=host \
 -boot menu=on \
 -monitor stdio \
 -vnc 0:90 \
 -qmp tcp:0:9990,server,nowait \
 -device usb-tablet,id=usb-table0 \
 -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c3:e7:84 \
 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on \
 -device virtio-scsi-pci,id=scsi0 \
 -drive file=rhel.qcow2,format=qcow2,id=drive_sysdisk,if=none,cache=none,aio=native,werror=stop,rerror=stop \
 -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0 \
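Steps 2 and 5 can also be driven from a script instead of telnet. The sketch below is an illustration under stated assumptions, not the procedure used in this report: it speaks QMP over an already connected socket, performs the `qmp_capabilities` handshake, and then issues the QMP `query-status` command, which is the QMP counterpart of HMP `info status`. The helper names `qmp_command` and `query_run_state` are hypothetical.

```python
import json
import socket


def qmp_command(stream, name):
    """Send one QMP command; return (reply, events seen while waiting)."""
    stream.write(json.dumps({"execute": name}) + "\n")
    stream.flush()
    events = []
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        msg = json.loads(line)
        if "event" in msg:
            # Asynchronous events such as GUEST_PANICKED can arrive
            # between a command and its reply, so collect them.
            events.append(msg)
        elif "error" in msg:
            raise RuntimeError(msg["error"])
        else:
            return msg["return"], events


def query_run_state(sock):
    """Handshake on a connected QMP socket and return (run state, events)."""
    stream = sock.makefile("rw", encoding="utf-8")
    greeting = json.loads(stream.readline())
    if "QMP" not in greeting:
        raise RuntimeError("unexpected greeting: %r" % (greeting,))
    # QMP rejects other commands until capabilities are negotiated.
    qmp_command(stream, "qmp_capabilities")
    status, events = qmp_command(stream, "query-status")
    return status["status"], events
```

Against the guest started with `-qmp tcp:0:9990,server,nowait`, this could be called as `query_run_state(socket.create_connection((host, 9990)))`, where `host` is the address used for `$HostIP` above. After the crash in step 4, a run state of `running` here would correspond to the HMP "VM status: running" symptom, while `guest-panicked` would correspond to the expected "paused (guest-panicked)".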