Bug 1447169
Summary: | [RFE] Support hotplugging/unplugging of i6300esb watchdog | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Fangge Jin <fjin> |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
Status: | CLOSED ERRATA | QA Contact: | Lili Zhu <lizhu> |
Severity: | medium | Docs Contact: | Jiri Herrmann <jherrman> |
Priority: | medium | ||
Version: | 7.4 | CC: | bhaubeck, dyuan, jdenemar, jherrman, juzhou, laine, mprivozn, mtessun, rjones, xuzhang, zpeng |
Target Milestone: | rc | Keywords: | FutureFeature, TestOnly, Upstream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-3.9.0-1.el7 | Doc Type: | Release Note |
Doc Text: |
The i6300esb watchdog is now supported by *libvirt*
With this update, the *libvirt* API supports the i6300esb watchdog device. As a result, KVM virtual machines can use this device to automatically trigger a specified action, such as saving a core dump of the guest if the guest OS becomes unresponsive or terminates unexpectedly.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-10-30 09:49:43 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Fangge Jin
2017-05-02 01:42:24 UTC
(In reply to Fangge Jin from comment #0)
> Description of problem:
> Qemu supports hotplugging/unplugging of i6300esb watchdog, but libvirt
> forbids it.
> Libvirt has no reason to forbid it as long as qemu supports it.
>
> Version-Release number of selected component:
> libvirt-3.2.0-3.el7.x86_64
> qemu-kvm-rhev-2.9.0-1.el7.x86_64
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. Prepare a vm with i6300esb watchdog:
> # virsh dumpxml rhel7.3
> ...
> <watchdog model='i6300esb' action='inject-nmi'/>

1: ^^

> ...
>
> 2. # virsh start rhel7.3
>
> 3. Detach watchdog by libvirt, fail:
> # virsh detach-device rhel7.3 /tmp/watchdog.xml --live
> error: Failed to detach device from /tmp/watchdog.xml
> error: Operation not supported: live detach of device 'watchdog' is not
> supported
>
> 4. Detach watchdog by qemu monitor command, succeed:
> # virsh qemu-monitor-command rhel7.3 '{"execute": "device_del", "arguments":
> {"id": "watchdog0"}}'
> {"return":{},"id":"libvirt-15"}
>
> # virsh qemu-monitor-command rhel7.3 '{"execute": "device_add", "arguments":
> {"driver": "i6300esb", "id": "watchdog0"}}'
> {"return":{},"id":"libvirt-16"}
>

Almost. We also need to be able to set/change watchdog action, because if domain was started with no watchdog, we are unable to honour action here [1].

I've proposed a patch for that here:
http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg00856.html

Also, patch for libvirt proposed upstream:
https://www.redhat.com/archives/libvir-list/2017-September/msg00078.html

Not that it matters, but the real hardware was built into the southbridge and so not hotpluggable.

Do you know what happens (or should happen) if the guest kernel loads the i6300esb driver with nowayout=1 and the hardware is unplugged? My reading of the driver says that this will not break or crash, although it probably won't work as the guest user expects.
You can tell from the qemu side if the guest selected nowayout because ESB_WDT_LOCK is written to the ESB_LOCK_REG register. Anyway, this may not matter.

I've pushed the patches upstream:

commit 662140fa68ae099a426006ed5edb1d511921e2c2
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Sep 5 11:08:36 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemu: hot-unplug of watchdog

    https://bugzilla.redhat.com/show_bug.cgi?id=1447169

    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

commit 361c8dc179bdbc8946c2ff3de47d19d6a6eba641
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Sep 1 13:39:15 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemu: hot-plug of watchdog

    https://bugzilla.redhat.com/show_bug.cgi?id=1447169

    Since domain can have at most one watchdog it simplifies things a
    bit. However, since we must be able to set the watchdog action as
    well, new monitor command needs to be used.

    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

commit 8a54cc1d08a333283c9cfc3fd7788be2642ca71a
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Sep 27 13:45:07 2017 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemuDomainDeviceDefValidate: Validate watchdog

    Currently we don't do it. Therefore we accept senseless
    combinations of models and buses they are attached to. Moreover,
    diag288 watchdog is exclusive to s390(x).

    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: John Ferlan <jferlan>

v3.8.0-39-g662140fa6

Test with libvirt-3.9.0-1.el7.x86_64 and qemu-kvm-rhev-2.10.0-4.el7.x86_64

Hotplug (blocked by qemu-kvm-rhev now):
1) # cat /tmp/watchdog.xml
<watchdog model='i6300esb' action='inject-nmi'/>

2) # virsh attach-device foo=1 /tmp/watchdog.xml
error: Failed to attach device from /tmp/watchdog.xml
error: internal error: unable to execute QEMU command 'watchdog-set-action': The command watchdog-set-action has not been found

Hotunplug:
1) # virsh dumpxml foo=1 |grep '<watchdog' -A3
<watchdog model='i6300esb' action='inject-nmi'>
  <alias name='watchdog0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
</watchdog>

2) Save the watchdog xml snippet into a file

3) # virsh detach-device foo=1 /tmp/watchdogunplug.xml
Device detached successfully

4) # virsh dumpxml foo=1 |grep '<watchdog' -A3

Hotunplug again:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml
error: Failed to detach device from /tmp/watchdogunplug.xml
error: Requested operation is not valid: watchdog device not present in domain configuration

Coldplug:
1) # virsh attach-device foo=1 /tmp/watchdog.xml --config
Device attached successfully

2) # virsh dumpxml foo=1 --inactive|grep '<watchdog' -A2
<watchdog model='i6300esb' action='inject-nmi'>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
</watchdog>

Coldplug again:
1) # virsh attach-device foo=1 /tmp/watchdog.xml --config
error: Failed to attach device from /tmp/watchdog.xml
error: Requested operation is not valid: domain already has a watchdog

Coldunplug:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml --config
Device detached successfully

Coldunplug again:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml --config
error: Failed to detach device from /tmp/watchdogunplug.xml
error: operation failed: domain has no watchdog

Hi Michal

I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, it also reports
error:

# virsh attach-device foo=1 /tmp/watchdog.xml
error: Failed to attach device from /tmp/watchdog.xml
error: internal error: unable to execute QEMU command 'watchdog-set-action': The command watchdog-set-action has not been found

So it seems that qemu-kvm-rhev doesn't support set watchdog action on fly?

(In reply to Fangge Jin from comment #8)
> Hi Michal
>
> I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, it also reports error:
>
> # virsh attach-device foo=1 /tmp/watchdog.xml
> error: Failed to attach device from /tmp/watchdog.xml
> error: internal error: unable to execute QEMU command 'watchdog-set-action':
> The command watchdog-set-action has not been found
>
> So it seems that qemu-kvm-rhev doesn't support set watchdog action on fly?

Yeah, command was introduced in 2.11.0, so unless qemu-kvm-rhev rebases, there's not much libvirt can do. You can verify trying qemu upstream :-)

(In reply to Michal Privoznik from comment #9)
> (In reply to Fangge Jin from comment #8)
> > Hi Michal
> >
> > I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, it also reports error:
> >
> > # virsh attach-device foo=1 /tmp/watchdog.xml
> > error: Failed to attach device from /tmp/watchdog.xml
> > error: internal error: unable to execute QEMU command 'watchdog-set-action':
> > The command watchdog-set-action has not been found
> >
> > So it seems that qemu-kvm-rhev doesn't support set watchdog action on fly?
>
> Yeah, command was introduced in 2.11.0, so unless qemu-kvm-rhev rebases,
> there's not much libvirt can do. You can verify trying qemu upstream :-)

hi, Michal,

The latest rebase for qemu-kvm-rhev in RHEL7.5 is 2.10, see BZ 1470749. So, it seems this libvirt BZ can not be verified in RHEL7.5, we'd like to move this BZ to next release 7.6 with keyword Testonly, is it ok for you?

(In reply to Xuesong Zhang from comment #10)
> hi, Michal,
>
> The latest rebase for qemu-kvm-rhev in RHEL7.5 is 2.10, see BZ 1470749.
> So, it seems this libvirt BZ can not be verified in RHEL7.5, we'd like to move
> this BZ to next release 7.6 with keyword Testonly, is it ok for you?

Sure. No problem.

Verify this bug with:
libvirt-4.5.0-1.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7.x86_64

Hotplugging:
steps:
1) prepare watchdog xml
# cat /tmp/watchdog.xml
<watchdog model='i6300esb' action='inject-nmi'>
  <alias name='ua-7996c8dc-a4fa-4012-b76f-043d20144263'/>
</watchdog>

2) Hotplug a watchdog device:
# virsh attach-device rhel76 /tmp/watchdog.xml
Device attached successfully

3) Check watchdog in dumpxml:
# virsh dumpxml rhel76 | grep '<watchdog' -A3
<watchdog model='i6300esb' action='inject-nmi'>
  <alias name='ua-7996c8dc-a4fa-4012-b76f-043d20144263'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</watchdog>

4) Check watchdog inside guest:
# lspci | grep -i watchdog
00:09.0 System peripheral: Intel Corporation 6300ESB Watchdog Timer

5) Monitor guest event:
# virsh event rhel7.3 --all

6) Trigger watchdog, refer to: How to trigger watchdog
Wait for some time, guest will receive a NMI. Check the guest terminal.
[root@localhost ~]# [ 216.057090] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 216.058071] Do you have a strange power saving mode enabled?
[ 216.058968] Dazed and confused, but trying to continue

Message from syslogd@localhost at Jul 9 11:21:27 ...
kernel:Uhhuh. NMI received for unknown reason 30 on CPU 0.

Message from syslogd@localhost at Jul 9 11:21:27 ...
kernel:Do you have a strange power saving mode enabled?

Message from syslogd@localhost at Jul 9 11:21:27 ...
kernel:Dazed and confused, but trying to continue

7) Check output of virsh event
event 'watchdog' for domain rhel76: inject-nmi
events received: 1

Also tested other actions, including shutdown, dump, none, poweroff, reset, pause, all are working as expected.

Hotunplugging:
steps:
1) Prepare a running guest with "i6300esb" watchdog device.
Dump guest xml and save the watchdog part into a file:
# virsh dumpxml rhel76 |grep '<watchdog' -A3 > /tmp/watchdog.xml

2) check the watchdog xml
# cat /tmp/watchdog.xml
<watchdog model='i6300esb' action='inject-nmi'>
  <alias name='watchdog0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</watchdog>

3) Hot-unplug the watchdog:
# virsh detach-device rhel76 /tmp/watchdog.xml
Device detached successfully

4) # virsh dumpxml foo=1 |grep '<watchdog' -A3
Nothing output

Also tested other actions, including shutdown, dump, none, poweroff, reset, pause, all are working as expected.

As all testing results match with expected results, mark it as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113

For reference, there was a bug in these patches that led to Bug 1666559 - failure to hotplug i6300esb on a Q35 virtual machine.
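Side note on the verification steps: they save the watchdog snippet with grep '<watchdog' -A3, which relies on the element spanning a fixed number of lines (the coldplug check already had to switch to -A2). A minimal sketch of a less fragile extraction is below. The helper name extract_watchdog is hypothetical and not part of libvirt or virsh, and the sketch assumes the domain XML is pretty-printed with one element per line, as in the dumpxml output quoted in this report.

```shell
# Hypothetical helper (not part of libvirt/virsh): read domain XML on stdin
# and print only the <watchdog> element, whether it is written as a
# self-closing tag or as an element with children.
# Assumption: one XML element per line, as in the dumpxml output above.
extract_watchdog() {
    awk '
        /<watchdog/ { inside = 1 }   # opening (or self-closing) tag seen
        inside      { print }        # emit every line of the element
        # stop after the closing tag, or at once for <watchdog ... />
        inside && (/<\/watchdog>/ || /<watchdog[^>]*\/>/) { inside = 0 }
    '
}
```

It would be used in place of the grep in step 1 of the hot-unplug test, for example: virsh dumpxml rhel76 | extract_watchdog > /tmp/watchdog.xml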