Bug 1203914

Summary: qemu's i6300esb watchdog implementation will trigger immediately if timeout is set sufficiently large
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm-rhev
Version: 7.2
Hardware: All
OS: Unspecified
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Target Milestone: rc
Target Release: 7.2
Reporter: David Gibson <dgibson>
Assignee: David Gibson <dgibson>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dgibson, juzhang, knoel, michen, mrezanin, rjones, virt-maint, xuhan
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-12-04 16:32:07 UTC
Bug Blocks: 1247893

Description David Gibson 2015-03-19 23:59:10 UTC
Description of problem:

An integer overflow in qemu's implementation of the i6300esb watchdog timer means that if the guest programs a sufficiently large timeout time (but still within the range the device is supposed to support), the watchdog will trigger immediately, potentially killing the guest.
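
The overflow can be illustrated with a minimal sketch. Everything numeric here is an assumption for illustration and does not come from this report: that the guest driver writes the heartbeat shifted left by 9 as the preload value, that the device scales the preload by 2^15 to get 33 MHz clock ticks, and that the emulation multiplies by 10^9 ticks per second in signed 64-bit arithmetic before dividing. The function names are hypothetical and are not qemu's.

```python
# Sketch only: constants and names are illustrative assumptions,
# not taken from the qemu source or from this bug report.
PCI_CLOCK_HZ = 33000000      # assumed device clock
TICKS_PER_SEC = 10 ** 9      # assumed emulator timer resolution (ns)
SCALE_SHIFT = 15             # assumed prescaler for the ~1 kHz mode


def wrap_int64(value):
    """Wrap an unbounded Python int the way C int64_t arithmetic would."""
    value &= (1 << 64) - 1
    if value >= (1 << 63):
        value -= 1 << 64
    return value


def preload_from_heartbeat(heartbeat_sec):
    """Assumed guest-side mapping from heartbeat seconds to preload value."""
    return heartbeat_sec << 9


def timeout_ns_overflowing(preload):
    """Multiply first in 64-bit signed arithmetic, then divide."""
    product = wrap_int64(TICKS_PER_SEC * (preload << SCALE_SHIFT))
    # C integer division truncates toward zero.
    quotient = abs(product) // PCI_CLOCK_HZ
    return -quotient if product < 0 else quotient


def timeout_ns_exact(preload):
    """Same computation with an intermediate wide enough not to overflow."""
    return TICKS_PER_SEC * (preload << SCALE_SHIFT) // PCI_CLOCK_HZ


if __name__ == "__main__":
    for heartbeat in (30, 2046):
        preload = preload_from_heartbeat(heartbeat)
        print(heartbeat,
              timeout_ns_overflowing(preload),
              timeout_ns_exact(preload))
```

Under these assumptions a heartbeat of 2046 produces a negative timeout in the overflowing version, which a timer would treat as already expired, while a small heartbeat such as 30 gives the same result either way.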

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-2.2.0-5.el7.ppc64

How reproducible:

100%, with correct configuration

Steps to Reproduce:
1. Create a qemu guest including the i6300esb watchdog device.  For example the following qemu command line is suitable:

qemu-system-x86_64 \
    -machine pc \
    -enable-kvm \
    -m 2048 \
    -no-user-config \
    -nodefaults \
    -vga std \
    -chardev stdio,id=charmonitor,mux=on,signal=off \
    -mon chardev=charmonitor,id=monitor \
    -rtc base=utc \
    -boot strict=on \
    -drive file=DISKIMAGE.qcow2,if=none,id=drive0,format=qcow2 \
    -device virtio-blk-pci,drive=drive0,id=blk0,bootindex=1 \
    -netdev user,id=net0 \
    -device virtio-net-pci,netdev=net0 \
    -device i6300esb,id=watchdog0 -watchdog-action pause

2. Install a Linux guest under this qemu
3. Log into the guest, unload the i6300esb driver:
    # rmmod i6300esb
4. Reload the i6300esb driver with an altered heartbeat module parameter:
    # modprobe i6300esb heartbeat=2046
5. Open the watchdog device, for example with:
    # python
    >>> open("/dev/watchdog")

Actual results:

As soon as the watchdog device is opened, the watchdog immediately triggers, pausing the guest (with the example qemu command line above).

Expected results:

Watchdog does not trigger for ~2046 seconds, as specified by the heartbeat parameter.

Additional info:

This bug probably affects qemu-kvm (not RHEV) and RHEL6 as well, though I haven't tested so far.

Comment 1 David Gibson 2015-03-20 03:47:09 UTC
I've posted an upstream fix for this (and bug 1198936).

See http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04372.html

Comment 2 Shaolong Hu 2015-03-20 08:01:11 UTC
Hi David,

Should we fix this in RHEL6?

Bests,

Comment 5 David Gibson 2015-05-06 02:14:44 UTC
Patch was merged upstream and incorporated downstream in the qemu-2.3.0 rebase.

Comment 6 Xu Han 2015-06-23 08:01:36 UTC
Tested this issue with qemu-kvm-rhev-2.3.0-2 on both x86_64 and ppc64le hosts.

The watchdog timer did not fire immediately, but it did not fire on time either.

Details:
# modprobe -r i6300esb; modprobe i6300esb heartbeat=2046
# dmesg | grep -i i6300
...
[   35.981953] i6300esb: initialized (0xffffc900003ba000). heartbeat=2046 sec (nowayout=0)

# cat wd.py
import time

wd = open('/dev/watchdog', 'rw')
wd.close()

for i in range(1, 2049):
    time.sleep(1)
    print i

# python wd.py
1
2
...
257
258   <- watchdog fired (VM paused)
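
As an observation rather than a diagnosis from this report: firing at about 258 seconds would be consistent with the scaled tick count being truncated to 32 bits somewhere in the timeout computation. The sketch below checks only that arithmetic. The shift amounts, the 33 MHz clock, the two-stage expiry, and the truncation itself are all assumptions, and the function name is hypothetical.

```python
# Sketch only: every constant here is an assumption for illustration.
PCI_CLOCK_HZ = 33000000   # assumed device clock
STAGES = 2                # assumed: the watchdog acts after two timer stages


def stage_seconds(heartbeat_sec, truncate_bits=None):
    """Seconds per timer stage, optionally truncating the tick count."""
    ticks = (heartbeat_sec << 9) << 15
    if truncate_bits is not None:
        ticks &= (1 << truncate_bits) - 1
    return ticks / float(PCI_CLOCK_HZ)


full_seconds = STAGES * stage_seconds(2046)
truncated_seconds = STAGES * stage_seconds(2046, truncate_bits=32)

if __name__ == "__main__":
    print(full_seconds, truncated_seconds)
```

With these assumptions the untruncated total is a little over 2046 seconds, and the 32-bit-truncated total falls between 258 and 259 seconds, matching the point at which the VM paused above.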

Comment 8 Shaolong Hu 2015-07-23 09:20:19 UTC
Hi David,

Do you think comment 6 is a problem?

Comment 9 David Gibson 2015-07-24 01:04:09 UTC
Comment 6 looks like a real bug, but a different and less serious one than this one.

I think we can still verify this, and file comment 6 as a new bug.

Comment 10 Shaolong Hu 2015-07-29 08:00:06 UTC
As per comment 9, setting this to verified and filing a new bug:

Bug 1247893 - qemu's i6300esb watchdog does not fire on time with large heartbeat like 2046

Comment 12 errata-xmlrpc 2015-12-04 16:32:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html