Bug 637180

Summary: watchdog timer isn't reset when qemu resets
Product: Red Hat Enterprise Linux 6
Reporter: Richard W.M. Jones <rjones>
Component: qemu-kvm
Assignee: Richard W.M. Jones <rjones>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: low
Version: 6.1
CC: chayang, dallan, khong, lihuang, michen, mkenneth, shu, virt-maint
Target Milestone: rc
Target Release: 6.1
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-0.12.1.2-2.134.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-05-19 11:34:42 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 580954, 634607

Description Richard W.M. Jones 2010-09-24 14:36:01 UTC
Description of problem:

The IB700 and I6300ESB watchdog timers don't include a handler
to reset them when the VM / qemu is reset.

This could lead to a failure case as follows:

(a) guest boots, watchdog is enabled

(b) guest does a reset, e.g.:
  echo 'b' > /proc/sysrq-trigger
(note that an ordinary /sbin/reboot wouldn't hit this case
since that would properly disable the watchdog)

(c) the reboot takes longer than the remaining time on the
watchdog

(d) the watchdog therefore fires during the reboot

(e) probably the VM would just reboot again at this point which
  is pretty benign, but it could depend on the action that the
  user had selected for the watchdog
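
The failure case (a)-(e) can be sketched as a toy timeline. This is a
simplified model for illustration only, not QEMU code: the class and
function names are hypothetical, and the timeout and duration values
are made-up numbers chosen so that the reboot outlasts the remaining
watchdog time.

```python
from dataclasses import dataclass


@dataclass
class ToyWatchdog:
    """Toy model of an emulated watchdog (hypothetical, not QEMU code)."""
    timeout: float               # seconds allowed between heartbeats
    resets_with_machine: bool    # True models a device with a reset handler
    enabled: bool = False
    remaining: float = 0.0

    def enable(self) -> None:
        self.enabled = True
        self.remaining = self.timeout

    def machine_reset(self) -> None:
        # Without a reset handler, the countdown simply keeps running.
        if self.resets_with_machine:
            self.enabled = False
            self.remaining = 0.0

    def advance(self, seconds: float) -> bool:
        """Advance time; return True if the watchdog fires."""
        if not self.enabled:
            return False
        self.remaining -= seconds
        if self.remaining <= 0:
            self.enabled = False
            return True
        return False


def fires_during_reboot(resets_with_machine: bool,
                        timeout: float = 30.0,
                        time_since_heartbeat: float = 10.0,
                        reboot_duration: float = 25.0) -> bool:
    wd = ToyWatchdog(timeout=timeout, resets_with_machine=resets_with_machine)
    wd.enable()                         # (a) guest boots, watchdog enabled
    wd.advance(time_since_heartbeat)    # time passes since the last heartbeat
    wd.machine_reset()                  # (b) sysrq 'b' style hard reset
    return wd.advance(reboot_duration)  # (c)/(d) reboot outlasts the timer
```

With the illustrative defaults, fires_during_reboot(False) models the
reported behaviour (the watchdog fires mid-reboot), while
fires_during_reboot(True) models the expected behaviour.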

Version-Release number of selected component (if applicable):

All versions of qemu including upstream.

How reproducible:

Always.

Steps to Reproduce:

For the purpose of this demonstration, I'm going to directly
invoke qemu-kvm.  You need a guest with the watchdog daemon
installed and configured.  The guest should also have a grub
timeout so you can interrupt the boot.

(1) Run:

qemu-kvm \
    -enable-kvm \
    -m 1024 -vnc :11 \
    -drive file=watchdog-guest.img \
    -watchdog i6300esb \
    -watchdog-action reset

(2) Wait until this boots, check watchdog daemon is running.

(3) Inside the guest do:

sync
echo 1 > /proc/sys/kernel/sysrq
echo b > /proc/sysrq-trigger

Guest will immediately reboot.  Same qemu process will
still be running.

(4) Enter grub as the guest boots (i.e. hit Esc), and wait for
a few seconds.

Actual results:

The watchdog will fire, causing the guest to reboot again while
you are waiting in grub.

Expected results:

Watchdog should have been reset by the action of resetting
qemu.

Additional info:

Comment 1 Richard W.M. Jones 2010-09-24 16:28:24 UTC
Patch posted upstream:
http://lists.nongnu.org/archive/html/qemu-devel/2010-09/msg01754.html

Comment 2 Richard W.M. Jones 2011-01-04 11:48:15 UTC
Fixed (eventually) upstream in the following commits:

http://www.qemu.com/qemu.git/commit/?id=36888c6335422f07bbc50bf3443a39f24b90c7c6
Watchdog: disable watchdog timer when hard-rebooting a guest.

http://www.qemu.com/qemu.git/commit/?id=fa82e9c300df6f7b8bd44a26ac752c4ea5da02c1
wdt_i6300esb: register a reset function

(Note that both patches are needed)
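
To illustrate why two separate changes can both be required, here is a
toy model of the reset-handler pattern. It reflects one reading of the
two commit titles (a reset function that disables the timer, plus
registering that function so it runs on machine reset); it is not a
summary of the actual diffs, and every name in it is hypothetical.

```python
from typing import Callable, List


class ToyMachine:
    """Toy machine that calls registered handlers on reset (hypothetical)."""

    def __init__(self) -> None:
        self._reset_handlers: List[Callable[[], None]] = []

    def register_reset(self, handler: Callable[[], None]) -> None:
        self._reset_handlers.append(handler)

    def system_reset(self) -> None:
        for handler in self._reset_handlers:
            handler()


class ToyWatchdogDevice:
    """Toy watchdog device (hypothetical, not the i6300esb model)."""

    def __init__(self, reset_disables_timer: bool) -> None:
        self.reset_disables_timer = reset_disables_timer
        self.timer_armed = False

    def guest_enables(self) -> None:
        self.timer_armed = True

    def reset(self) -> None:
        # Assumed piece 1: the reset function must actually stop the timer.
        if self.reset_disables_timer:
            self.timer_armed = False


def timer_survives_reset(reset_disables_timer: bool,
                         handler_registered: bool) -> bool:
    machine = ToyMachine()
    dev = ToyWatchdogDevice(reset_disables_timer)
    # Assumed piece 2: the reset function must be registered with the machine.
    if handler_registered:
        machine.register_reset(dev.reset)
    dev.guest_enables()
    machine.system_reset()
    return dev.timer_armed
```

In this model the timer is only disarmed when both conditions hold;
with either one missing, the armed timer survives the reset.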

Comment 9 Shaolong Hu 2011-02-17 09:52:40 UTC
Reproduced on qemu-kvm-0.12.1.2-2.131.el6 using the following steps.

Reproduce Procedure:
---------------------
1. boot guest with "-watchdog i6300esb -watchdog-action reset"
2. in the guest, enable watchdog:
   #cat /dev/watchdog
3. reboot the guest using the following commands before the watchdog heartbeat expires:
   #sync
   #echo 1 > /proc/sys/kernel/sysrq
   #echo b > /proc/sysrq-trigger
4. enter grub interface, press ESC to interrupt the timer, and wait.

Actual results:
----------------
After step 4, within a few seconds, the guest resets.


Verified this bug on qemu-kvm-0.12.1.2-2.146.el6 using the same steps as above.

Actual results:
----------------
After step 4, the guest stays in the grub interface without resetting.

Conclusion:
-------------
According to above results, this bug has been resolved.
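
For reference, step 2 of the procedure depends on the guest-side
watchdog device being armed when it is opened. Below is a minimal
sketch of a heartbeat loop, assuming the usual Linux /dev/watchdog
conventions (any write counts as a heartbeat; writing 'V' just before
close asks the driver to disarm, on drivers that support this and are
not configured to forbid it). The function name and its parameters are
hypothetical, and this is not the watchdog daemon's actual code.

```python
import os
import time


def pet_watchdog(path: str = "/dev/watchdog",
                 beats: int = 3,
                 interval: float = 1.0,
                 disarm_on_exit: bool = True) -> None:
    """Open the watchdog device, send heartbeats, optionally disarm."""
    fd = os.open(path, os.O_WRONLY)   # opening the device arms the timer
    try:
        for _ in range(beats):
            os.write(fd, b"\0")       # any write is treated as a heartbeat
            time.sleep(interval)
    finally:
        if disarm_on_exit:
            os.write(fd, b"V")        # request disarm before closing
        os.close(fd)
```

Calling it with disarm_on_exit=False leaves the timer armed after the
file is closed, which is the state the reproduction steps rely on
before the sysrq reset.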

Comment 10 Richard W.M. Jones 2011-03-02 10:33:10 UTC
Should this be moved to VERIFIED now?

Comment 11 Miya Chen 2011-03-02 10:41:01 UTC
Based on comment#9, change status to verified.

Comment 12 errata-xmlrpc 2011-05-19 11:34:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html

Comment 13 errata-xmlrpc 2011-05-19 12:49:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0534.html