Bug 1447169 - [RFE] Support hotplugging/unplugging of i6300esb watchdog
Summary: [RFE] Support hotplugging/unplugging of i6300esb watchdog
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Lili Zhu
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Keywords: FutureFeature, TestOnly, Upstream
Depends On:
Blocks:
Reported: 2017-05-02 01:42 UTC by Fangge Jin
Modified: 2019-01-18 02:46 UTC (History)

The i6300esb watchdog is now supported by *libvirt*

With this update, the *libvirt* API supports the i6300esb watchdog device. As a result, KVM virtual machines can use this device to automatically trigger a specified action, such as saving a core dump of the guest if the guest OS becomes unresponsive or terminates unexpectedly.
Clone Of:
Last Closed: 2018-10-30 09:49:43 UTC


Description Fangge Jin 2017-05-02 01:42:24 UTC
Description of problem:
QEMU supports hot-plugging and hot-unplugging of the i6300esb watchdog, but libvirt forbids it.
Libvirt has no reason to forbid it as long as QEMU supports it.

Version-Release number of selected component:
libvirt-3.2.0-3.el7.x86_64
qemu-kvm-rhev-2.9.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a VM with an i6300esb watchdog:
# virsh dumpxml rhel7.3
...
    <watchdog model='i6300esb' action='inject-nmi'/>
...

2. # virsh start rhel7.3

3. Detach the watchdog through libvirt, which fails:
# virsh detach-device rhel7.3 /tmp/watchdog.xml --live
error: Failed to detach device from /tmp/watchdog.xml
error: Operation not supported: live detach of device 'watchdog' is not supported

4. Detach and then re-attach the watchdog with QEMU monitor commands, which succeeds:
# virsh qemu-monitor-command rhel7.3 '{"execute": "device_del", "arguments": {"id": "watchdog0"}}'
{"return":{},"id":"libvirt-15"}

# virsh qemu-monitor-command rhel7.3 '{"execute": "device_add", "arguments": {"driver": "i6300esb", "id": "watchdog0"}}'
{"return":{},"id":"libvirt-16"}
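
The two monitor calls in step 4 can also be scripted. The following is a minimal sketch, assuming Python 3 and only the standard library. The helper names `qmp_command` and `virsh_monitor_argv` are hypothetical, and the sketch only builds the command lines without running them.

```python
import json


def qmp_command(execute, **arguments):
    """Return a QMP command as a JSON string."""
    cmd = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return json.dumps(cmd)


def virsh_monitor_argv(domain, qmp_json):
    """Build the argv for `virsh qemu-monitor-command` (not executed here)."""
    return ["virsh", "qemu-monitor-command", domain, qmp_json]


# Same payloads as in step 4 of the report.
unplug = qmp_command("device_del", id="watchdog0")
plug = qmp_command("device_add", driver="i6300esb", id="watchdog0")

print(virsh_monitor_argv("rhel7.3", unplug))
print(virsh_monitor_argv("rhel7.3", plug))
```

On a host with libvirt installed, each argv list could be handed to `subprocess.run`; note that commands sent this way bypass libvirt's own device tracking.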


Actual results:
Libvirt doesn't support hotplugging/unplugging i6300esb watchdog

Expected results:
Libvirt supports hotplugging/unplugging i6300esb watchdog

Comment 2 Michal Privoznik 2017-09-05 11:27:43 UTC
(In reply to Fangge Jin from comment #0)
> Description of problem:
> Qemu supports hotplugging/unplugging of i6300esb watchdog, but libvirt
> forbids it. 
> Libvirt has no reason to forbid it as long as qemu supports it.
> 
> Version-Release number of selected component:
> libvirt-3.2.0-3.el7.x86_64
> qemu-kvm-rhev-2.9.0-1.el7.x86_64
> 
> How reproducible:
> 100%
> 
> Steps to Reproduce:
> 1. Prepare a vm with i6300esb watchdog:
> # virsh dumpxml rhel7.3
> ...
>     <watchdog model='i6300esb' action='inject-nmi'/>

1: ^^

> ...
> 
> 2. # virsh start rhel7.3
> 
> 3. Detach watchdog by libvirt, fail:
> # virsh detach-device rhel7.3 /tmp/watchdog.xml --live
> error: Failed to detach device from /tmp/watchdog.xml
> error: Operation not supported: live detach of device 'watchdog' is not
> supported
> 
> 4. Detach watchdog by qemu monitor command, succeed:
> # virsh qemu-monitor-command rhel7.3 '{"execute": "device_del", "arguments":
> {"id": "watchdog0"}}'
> {"return":{},"id":"libvirt-15"}
> 
> # virsh qemu-monitor-command rhel7.3 '{"execute": "device_add", "arguments":
> {"driver": "i6300esb", "id": "watchdog0"}}'
> {"return":{},"id":"libvirt-16"}
> 

Almost. We also need to be able to set or change the watchdog action: if the domain was started without a watchdog, we are unable to honour the action given here [1]. I've proposed a QEMU patch for that here:

http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg00856.html
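
For context, a run-time action change would be sent to QEMU as a monitor command. Below is a minimal sketch of such a payload in Python, assuming the command is named `watchdog-set-action` (the name that appears in the error messages later in this bug) and takes a single `action` argument. The mapping of libvirt's `dump` action to QEMU's `pause` is an assumption (libvirt is assumed to take the dump itself once the guest is paused), and the helper name is hypothetical.

```python
import json

# libvirt action name -> action sent to QEMU (assumed mapping).
LIBVIRT_TO_QEMU_ACTION = {
    "reset": "reset",
    "shutdown": "shutdown",
    "poweroff": "poweroff",
    "pause": "pause",
    "none": "none",
    "inject-nmi": "inject-nmi",
    "dump": "pause",  # assumption: libvirt performs the dump after the pause
}


def watchdog_set_action(libvirt_action):
    """Build the QMP payload that changes the watchdog action."""
    try:
        qemu_action = LIBVIRT_TO_QEMU_ACTION[libvirt_action]
    except KeyError:
        raise ValueError(f"unknown watchdog action: {libvirt_action!r}") from None
    return json.dumps(
        {"execute": "watchdog-set-action", "arguments": {"action": qemu_action}}
    )


print(watchdog_set_action("inject-nmi"))
```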

Comment 3 Michal Privoznik 2017-09-05 11:46:33 UTC
Also, a patch for libvirt has been proposed upstream:

https://www.redhat.com/archives/libvir-list/2017-September/msg00078.html

Comment 4 Richard W.M. Jones 2017-09-06 09:14:01 UTC
Not that it matters, but the real hardware was built into the southbridge
and so not hotpluggable.

Do you know what happens (or should happen) if the guest kernel loads
the i6300esb driver with nowayout=1 and the hardware is unplugged?
My reading of the driver says that this will not break or crash,
although it probably won't work as the guest user expects.  You can
tell from the qemu side if the guest selected nowayout because
ESB_WDT_LOCK is written to the ESB_LOCK_REG register.

Anyway, this may not matter.

Comment 5 Michal Privoznik 2017-10-05 12:26:19 UTC
I've pushed the patches upstream:

commit 662140fa68ae099a426006ed5edb1d511921e2c2
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Tue Sep 5 11:08:36 2017 +0200
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemu: hot-unplug of watchdog
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1447169
    
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
    Reviewed-by: John Ferlan <jferlan@redhat.com>

commit 361c8dc179bdbc8946c2ff3de47d19d6a6eba641
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Fri Sep 1 13:39:15 2017 +0200
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemu: hot-plug of watchdog
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1447169
    
    Since domain can have at most one watchdog it simplifies things a
    bit. However, since we must be able to set the watchdog action as
    well, new monitor command needs to be used.
    
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
    Reviewed-by: John Ferlan <jferlan@redhat.com>

commit 8a54cc1d08a333283c9cfc3fd7788be2642ca71a
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Wed Sep 27 13:45:07 2017 +0200
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Thu Oct 5 14:23:20 2017 +0200

    qemuDomainDeviceDefValidate: Validate watchdog
    
    Currently we don't do it. Therefore we accept senseless
    combinations of models and buses they are attached to.
    Moreover, diag288 watchdog is exclusive to s390(x).
    
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
    Reviewed-by: John Ferlan <jferlan@redhat.com>

v3.8.0-39-g662140fa6
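
The kind of check described in the third commit can be illustrated with a small model. This is a sketch in Python, not libvirt's C code. The rule table (i6300esb on PCI, ib700 on ISA, diag288 without an address and only on s390/s390x) is an assumption inferred from the commit message, and every name in it is hypothetical.

```python
from typing import Optional

# Assumed model -> required address type; None means "no address allowed".
REQUIRED_BUS = {"i6300esb": "pci", "ib700": "isa", "diag288": None}
S390_ARCHES = {"s390", "s390x"}


def validate_watchdog(model: str, address_type: Optional[str], arch: str) -> None:
    """Raise ValueError for a model/bus/arch combination that makes no sense."""
    if model not in REQUIRED_BUS:
        raise ValueError(f"unknown watchdog model '{model}'")
    if model == "diag288" and arch not in S390_ARCHES:
        raise ValueError("diag288 watchdog is only available on s390(x)")
    required = REQUIRED_BUS[model]
    # An unset address is accepted here; only an explicit mismatch is rejected.
    if address_type is not None and address_type != required:
        raise ValueError(
            f"watchdog model '{model}' cannot use address type '{address_type}'"
        )


validate_watchdog("i6300esb", "pci", "x86_64")  # accepted
try:
    validate_watchdog("i6300esb", "isa", "x86_64")
except ValueError as err:
    print("rejected:", err)
```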

Comment 7 Fangge Jin 2017-11-07 05:24:34 UTC
Test with libvirt-3.9.0-1.el7.x86_64 and qemu-kvm-rhev-2.10.0-4.el7.x86_64

Hotplug (currently blocked by qemu-kvm-rhev):
1) # cat /tmp/watchdog.xml 
    <watchdog model='i6300esb' action='inject-nmi'/>

2) # virsh attach-device foo=1 /tmp/watchdog.xml 
error: Failed to attach device from /tmp/watchdog.xml
error: internal error: unable to execute QEMU command 'watchdog-set-action': The command watchdog-set-action has not been found


Hotunplug:
1) # virsh dumpxml foo=1 |grep '<watchdog' -A3
    <watchdog model='i6300esb' action='inject-nmi'>
      <alias name='watchdog0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
    </watchdog>

2) Save the watchdog xml snippet into a file

3) # virsh detach-device foo=1 /tmp/watchdogunplug.xml 
Device detached successfully

4) # virsh dumpxml foo=1 |grep '<watchdog' -A3


Hotunplug again:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml
error: Failed to detach device from /tmp/watchdogunplug.xml
error: Requested operation is not valid: watchdog device not present in domain configuration


Coldplug:
1) # virsh attach-device foo=1 /tmp/watchdog.xml --config
Device attached successfully

2) # virsh dumpxml foo=1 --inactive|grep '<watchdog' -A2
    <watchdog model='i6300esb' action='inject-nmi'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
    </watchdog>


Coldplug again:
1) # virsh attach-device foo=1 /tmp/watchdog.xml --config
error: Failed to attach device from /tmp/watchdog.xml
error: Requested operation is not valid: domain already has a watchdog


Coldunplug:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml --config
Device detached successfully


Coldunplug again:
# virsh detach-device foo=1 /tmp/watchdogunplug.xml --config
error: Failed to detach device from /tmp/watchdogunplug.xml
error: operation failed: domain has no watchdog
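
Step 2 of the hot-unplug test saves the watchdog snippet by hand from `grep -A3` output, which depends on the element spanning exactly four lines. A more robust extraction could be sketched as follows, assuming Python 3 with the standard library. The function name `extract_watchdog` is hypothetical, and `SAMPLE` is a shortened stand-in for real `virsh dumpxml` output.

```python
import xml.etree.ElementTree as ET


def extract_watchdog(domain_xml):
    """Return the <watchdog> element of a domain XML as a string, or None."""
    root = ET.fromstring(domain_xml)
    watchdog = root.find("./devices/watchdog")
    if watchdog is None:
        return None
    return ET.tostring(watchdog, encoding="unicode").strip()


# Shortened stand-in for `virsh dumpxml foo=1` output.
SAMPLE = """<domain type='kvm'>
  <name>foo=1</name>
  <devices>
    <watchdog model='i6300esb' action='inject-nmi'>
      <alias name='watchdog0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
    </watchdog>
  </devices>
</domain>"""

print(extract_watchdog(SAMPLE))
```

The returned string can be written to a file such as /tmp/watchdogunplug.xml and passed to `virsh detach-device`.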

Comment 8 Fangge Jin 2017-11-24 10:21:27 UTC
Hi Michal

I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, and it reports the same error:

# virsh attach-device foo=1 /tmp/watchdog.xml 
error: Failed to attach device from /tmp/watchdog.xml
error: internal error: unable to execute QEMU command 'watchdog-set-action': The command watchdog-set-action has not been found

So it seems that qemu-kvm-rhev doesn't support setting the watchdog action on the fly?

Comment 9 Michal Privoznik 2017-11-24 15:15:22 UTC
(In reply to Fangge Jin from comment #8)
> Hi Michal
> 
> I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, it also reports error:
> 
> # virsh attach-device foo=1 /tmp/watchdog.xml 
> error: Failed to attach device from /tmp/watchdog.xml
> error: internal error: unable to execute QEMU command 'watchdog-set-action':
> The command watchdog-set-action has not been found
> 
> So it seems that qemu-kvm-rhev doesn't support set watchdog action on fly?

Yeah, the command was introduced in QEMU 2.11.0, so unless qemu-kvm-rhev rebases, there's not much libvirt can do. You can verify by trying upstream QEMU :-)
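
One way to check whether a given QEMU build has the command is to ask it for its list of QMP commands, for example by sending `query-commands` through `virsh qemu-monitor-command`, and look for `watchdog-set-action` in the reply. The sketch below only parses such a reply. The two sample replies are shortened and made up for illustration, and the function name is hypothetical.

```python
import json


def supports_command(query_commands_reply, name):
    """Return True if `name` is listed in a QMP query-commands reply."""
    reply = json.loads(query_commands_reply)
    return any(cmd.get("name") == name for cmd in reply.get("return", []))


# Illustrative, shortened replies; a real reply lists many more commands.
OLD_QEMU = '{"return": [{"name": "device_add"}, {"name": "device_del"}]}'
NEW_QEMU = (
    '{"return": [{"name": "device_add"}, {"name": "device_del"},'
    ' {"name": "watchdog-set-action"}]}'
)

print(supports_command(OLD_QEMU, "watchdog-set-action"))
print(supports_command(NEW_QEMU, "watchdog-set-action"))
```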

Comment 10 Xuesong Zhang 2017-11-27 08:19:50 UTC
(In reply to Michal Privoznik from comment #9)
> (In reply to Fangge Jin from comment #8)
> > Hi Michal
> > 
> > I tried with qemu-kvm-rhev-2.10.0-6.el7.x86_64 today, it also reports error:
> > 
> > # virsh attach-device foo=1 /tmp/watchdog.xml 
> > error: Failed to attach device from /tmp/watchdog.xml
> > error: internal error: unable to execute QEMU command 'watchdog-set-action':
> > The command watchdog-set-action has not been found
> > 
> > So it seems that qemu-kvm-rhev doesn't support set watchdog action on fly?
> 
> Yeah, command was introduced in 2.11.0, so unless qemu-kvm-rhev rebases,
> there's not much libvirt can do. You can verify trying qemu upstream :-)

hi, Michal,

The latest rebase for qemu-kvm-rhev in RHEL 7.5 is 2.10 (see BZ 1470749), so it seems this libvirt BZ cannot be verified in RHEL 7.5. We'd like to move this BZ to the next release, 7.6, with the TestOnly keyword. Is that OK with you?

Comment 11 Michal Privoznik 2017-11-27 08:22:37 UTC
(In reply to Xuesong Zhang from comment #10)
> hi, Michal,
> 
> The latest rebase for qemu-kvm-rhev in RHEL7.5 is 2.10, see BZ 1470749. So,
> it seems this libvirt BZ can not be verified in RHEL7.5, we'd like to move
> this BZ to next release 7.6 with keyword Testonly, is it ok for you?

Sure. No problem.

Comment 13 Lili Zhu 2018-07-09 04:57:39 UTC
Verify this bug with:
libvirt-4.5.0-1.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7.x86_64

Hotplugging:
steps:
1) Prepare the watchdog XML:
# cat /tmp/watchdog.xml
    <watchdog model='i6300esb' action='inject-nmi'>
      <alias name='ua-7996c8dc-a4fa-4012-b76f-043d20144263'/>
    </watchdog>

2) Hotplug a watchdog device:
# virsh attach-device rhel76 /tmp/watchdog.xml
Device attached successfully

3) Check the watchdog in dumpxml:
# virsh dumpxml rhel76 | grep '<watchdog' -A3
<watchdog model='i6300esb' action='inject-nmi'>
  <alias name='ua-7996c8dc-a4fa-4012-b76f-043d20144263'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</watchdog>

4) Check the watchdog inside the guest:
# lspci | grep -i watchdog
00:09.0 System peripheral: Intel Corporation 6300ESB Watchdog Timer

5) Monitor guest events:
# virsh event rhel76 --all

6) Trigger the watchdog (refer to: How to trigger watchdog).
Wait for some time; the guest will receive an NMI. Check the guest terminal:
[root@localhost ~]# [  216.057090] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[  216.058071] Do you have a strange power saving mode enabled?
[  216.058968] Dazed and confused, but trying to continue

Message from syslogd@localhost at Jul  9 11:21:27 ...
 kernel:Uhhuh. NMI received for unknown reason 30 on CPU 0.

Message from syslogd@localhost at Jul  9 11:21:27 ...
 kernel:Do you have a strange power saving mode enabled?

Message from syslogd@localhost at Jul  9 11:21:27 ...
 kernel:Dazed and confused, but trying to continue

7) Check the output of virsh event:
event 'watchdog' for domain rhel76: inject-nmi
events received: 1

Also tested the other actions (shutdown, dump, none, poweroff, reset, pause); all work as expected.


Hotunplugging:
steps:
1) Prepare a running guest with "i6300esb" watchdog device.
Dump guest xml and save the watchdog part into a file:
# virsh dumpxml rhel76 |grep '<watchdog' -A3 > /tmp/watchdog.xml

2) check the watchdog xml
# cat /tmp/watchdog.xml 
    <watchdog model='i6300esb' action='inject-nmi'>
      <alias name='watchdog0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </watchdog>

3) Hot-unplug the watchdog:
# virsh detach-device rhel76 /tmp/watchdog.xml
Device detached successfully

4) # virsh dumpxml rhel76 |grep '<watchdog' -A3
(no output)

Also tested the other actions (shutdown, dump, none, poweroff, reset, pause); all work as expected.

As all test results match the expected results, marking this bug as verified.
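
The per-action test files used in the hotplug steps above could be generated rather than written by hand. The following is a sketch under stated assumptions: Python 3 with the standard library, a hypothetical helper `watchdog_xml`, and a user alias built from a random UUID with the `ua-` prefix seen in step 1 of the hotplug test.

```python
import uuid
import xml.etree.ElementTree as ET

# Actions exercised in the verification above.
ACTIONS = ["inject-nmi", "shutdown", "dump", "none", "poweroff", "reset", "pause"]


def watchdog_xml(action, model="i6300esb", alias=None):
    """Return a <watchdog> device snippet with a user-defined alias."""
    elem = ET.Element("watchdog", {"model": model, "action": action})
    if alias is None:
        # User aliases carry the "ua-" prefix, as in the report.
        alias = f"ua-{uuid.uuid4()}"
    ET.SubElement(elem, "alias", {"name": alias})
    return ET.tostring(elem, encoding="unicode")


for action in ACTIONS:
    print(watchdog_xml(action))
```

Each printed snippet can be saved to a file and passed to `virsh attach-device`.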

Comment 15 errata-xmlrpc 2018-10-30 09:49:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113

Comment 16 Laine Stump 2019-01-18 02:46:02 UTC
For reference, there was a bug in these patches that led to Bug 1666559 - failure to hotplug i6300esb on a Q35 virtual machine.
