Bug 1198936

Summary: wdt_i6300esb immediately fires on big endian (ppc64)
Product: Red Hat Enterprise Linux 7 Reporter: Xu Han <xuhan>
Component: qemu-kvm-rhev Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2 CC: hannsj_uhl, hhuang, knoel, michal.skrivanek, michen, mrezanin, ngu, qzhang, rjones, shuyu, virt-maint, xuhan, ypu
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-12-04 16:31:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1152690, 1154205    

Description Xu Han 2015-03-05 06:55:45 UTC
Description of problem:
The timer expires immediately just by opening the '/dev/watchdog' interface. Both BE and LE guests hit this issue.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.2.0-5.el7.ppc64

How reproducible:
100%

Steps to Reproduce:
1. Boot guest with i6300esb watchdog timer.
/usr/libexec/qemu-kvm ... \
    -device i6300esb,id=watchdog0 -watchdog-action pause

# dmesg | grep -i i6300esb
[    3.499000] i6300esb: Intel 6300ESB WatchDog Timer Driver v0.05
[    3.499044] i6300ESB timer 0000:00:00.0: enabling device (0100 -> 0102)
[    3.500952] i6300esb: initialized (0xd000080080080000). heartbeat=30 sec (nowayout=0)

2. Open watchdog interface.
# python
Python 2.7.5 (default, Dec 19 2014, 13:08:28) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> wd = open("/dev/watchdog", "w")


Actual results:
The timer expires immediately.
(qemu) info status 
VM status: paused (watchdog)

Expected results:
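The watchdog does not fire on open; it expires only after the configured heartbeat (30 seconds here) passes without a keepalive.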

Additional info:
CMDline:
/usr/libexec/qemu-kvm \
    -name watchdog-test-xuhan \
    -machine pseries,accel=kvm,usb=off \
    -m 4096 \
    -cpu POWER8 \
    -smp 4,sockets=2,cores=2,threads=1 \
    -no-user-config \
    -nodefaults \
    -chardev socket,id=charmonitor,path=monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=utc \
    -boot strict=on \
    -device pci-ohci,id=usb,bus=pci.0,addr=0x1 \
    -device spapr-vscsi,id=scsi0,reg=0x2000 \
    -drive file=rhel72.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2 \
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5d:c7:9e,bus=pci.0,addr=0x2 \
    -chardev socket,id=charserial0,path=serial,server,nowait \
    -device spapr-vty,chardev=charserial0,reg=0x30000000 \
    -device usb-kbd,id=input0 \
    -device usb-mouse,id=input1 \
    -device i6300esb,id=watchdog0 -watchdog-action ${action} \
    -vnc 0.0.0.0:50 \
    -k en-us \
    -device VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x3 \
    -monitor stdio

Comment 2 David Gibson 2015-03-18 06:13:58 UTC
I've finally had a chance to look at this.

It looks as if the i6300esb code is setting its timeout to a negative value, due to an integer overflow.

What I haven't understood yet is how it works on x86, which looks as if it should be subject to exactly the same problem.
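As a generic illustration of how an integer overflow can produce a negative timeout, the sketch below emulates signed 64-bit wraparound. The report does not give the arithmetic involved, so the tick count and scale factor are hypothetical and are not taken from the i6300esb code.

```python
def wrap_int64(value):
    """Reduce an arbitrary Python int to the range of a signed 64-bit integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


# Hypothetical numbers, chosen only so that the product reaches 2**63.
ticks = 1 << 35   # hypothetical preload count
scale = 1 << 28   # hypothetical ticks-to-time factor

timeout = wrap_int64(ticks * scale)
print(timeout)    # negative, although both inputs are positive
```

A timer armed with a negative (already elapsed) deadline would be expected to fire at once, which matches the symptom reported here.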

Comment 3 Miya Chen 2015-03-19 09:53:31 UTC
Hi Xu, when you have time, could you please try this case on x86 to see whether it also happens there? Thanks.

Comment 4 David Gibson 2015-03-19 23:50:11 UTC
Xu, never mind.

I've found a different, ppc64 only bug that caused the problem here.

A side effect of that bug was to make the first bug I found easier to trigger. The first bug will also appear on x86, but not with the default configuration of the watchdog driver on the guest side. I will file a separate BZ for it.

Comment 5 David Gibson 2015-03-20 00:06:10 UTC
Ok, overflow bug filed as 1203914.

The ppc64 specific bug reported here (technically it will happen on any platform considered by qemu to be big-endian by default) is much simpler.  The IO ports on the i6300esb device are marked in qemu as being NATIVE_ENDIAN, when they should (as a PCI device) be LITTLE_ENDIAN.
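The effect of the wrong endianness marking can be sketched as follows. This is an illustration of a byte-order mismatch in general, not code from qemu: a 32-bit value laid out little-endian (as for a PCI device) is reinterpreted with big-endian byte order, which is what a NATIVE_ENDIAN region would do on a big-endian target. The helper name and the register value are hypothetical.

```python
import struct


def read_as_big_endian(value):
    """Return how a 32-bit value stored little-endian reads back as big-endian."""
    raw = struct.pack("<I", value)       # bytes as a little-endian access lays them out
    return struct.unpack(">I", raw)[0]   # same bytes interpreted as big-endian


# Hypothetical register value, chosen only to show the byte swap.
written = 0x00000080
seen = read_as_big_endian(written)
print(hex(written), "->", hex(seen))
```

With these inputs the device side sees 0x80000000 instead of 0x80, so any command or count the guest driver writes arrives byte-swapped.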

Comment 6 David Gibson 2015-03-20 03:47:30 UTC
I've posted an upstream fix for this (and bug 1203914).  See:

http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04372.html

Comment 8 David Gibson 2015-05-06 02:14:14 UTC
The fix has now been merged upstream, and was incorporated downstream by the qemu-2.3.0 rebase.

Comment 10 Shuang Yu 2015-08-12 09:14:40 UTC
Reproduced this bug with BE and LE guests on the "qemu-kvm-rhev-2.1.2-23.el7.ppc64" host; both hit the same issue.

Verified this bug with BE and LE guests on the "qemu-kvm-rhev-2.3.0-12.el7.ppc64" host; the issue no longer occurs.

Verified this bug with BE and LE guests on the "qemu-kvm-rhev-2.3.0-13.el7.ppc64le" host; the issue no longer occurs.

Comment 11 Shuang Yu 2015-08-13 09:35:30 UTC
Steps as follows:

1. Boot guest with i6300esb watchdog timer.
/usr/libexec/qemu-kvm ... \
    -device i6300esb,id=watchdog0 -watchdog-action pause

2. In the guest:
[root@dhcp106-227 ~]# dmesg |grep -i i6300esb
[    6.749806] i6300esb: Intel 6300ESB WatchDog Timer Driver v0.05
[    6.749851] i6300ESB timer 0000:00:00.0: enabling device (0100 -> 0102)
[    6.750588] i6300esb: initialized (0xd000080080080000). heartbeat=30 sec (nowayout=0)

3. Open watchdog interface.
[root@dhcp106-227 ~]# python
Python 2.7.5 (default, Jul  8 2015, 05:06:00) 
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> wd=open("/dev/watchdog","w")

Reproduction result:
The guest paused immediately.
(qemu) info status 
VM status: paused (watchdog)
 
Verification result:
After about 30 seconds, the guest paused.
(qemu) info status
VM status: running
(qemu) info status
VM status: running
(qemu) info status
VM status: running
(qemu) info status
VM status: running
(qemu) info status
VM status: paused (watchdog)

Additional info:
CMDline:
/usr/libexec/qemu-kvm \
    -name bug1198936-be-2.3-host-le \
    -machine pseries,accel=kvm,usb=off \
    -m 4096 \
    -cpu POWER8 \
    -smp 4,sockets=2,cores=2,threads=1 \
    -no-user-config \
    -nodefaults \
    -chardev socket,id=charmonitor,path=monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=utc \
    -boot strict=on \
    -device pci-ohci,id=usb,bus=pci.0,addr=0x1 \
    -device spapr-vscsi,id=scsi0,reg=0x2000 \
    -drive file=RHEL-7.2-20150806.1-Server-ppc64.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2 \
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
    -drive file=RHEL-7.2-20150806.1-Server-ppc64-dvd1.iso,if=none,id=drive-scsi0-0-1-0,readonly=on,format=raw \
    -device scsi-cd,bus=scsi0.0,drive=drive-scsi0-0-1-0,bootindex=2,id=scsi0-0-1-0 \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5d:c7:8e,bus=pci.0,addr=0x2 \
    -chardev socket,id=charserial0,path=serial,server,nowait \
    -device spapr-vty,chardev=charserial0,reg=0x30000000 \
    -device usb-kbd,id=input0 \
    -device usb-mouse,id=input1 \
    -device usb-tablet,id=tablet1 \
    -vnc :10 \
    -k en-us \
    -device VGA,id=video0,vgamem_mb=16,bus=pci.0,addr=0x3 \
    -monitor stdio \
    -device i6300esb,id=watchdog0 -watchdog-action pause

Comment 13 errata-xmlrpc 2015-12-04 16:31:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html