Bug 1247893

Summary: qemu's i6300esb watchdog does not fire on time with large heartbeat like 2046
Product: Red Hat Enterprise Linux 7 Reporter: Shaolong Hu <shu>
Component: qemu-kvm-rhev Assignee: Laurent Vivier <lvivier>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.2 CC: dgibson, juzhang, knoel, lvivier, michen, mrezanin, ngu, rjones, virt-bugs, virt-maint, xuhan
Target Milestone: rc   
Target Release: 7.2   
Hardware: All   
OS: Unspecified   
Fixed In Version: qemu-kvm-rhev-2.3.0-17.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1203914 Environment:
Last Closed: 2015-12-04 16:52:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1203914    
Bug Blocks:    

Comment 2 Shaolong Hu 2015-07-29 07:59:00 UTC
This bug tracks the problem that qemu's i6300esb watchdog does not fire on time with a large heartbeat; the original problem has been solved in bug 1203914.

Comment 3 David Gibson 2015-07-30 00:49:03 UTC
Please don't use clone in this situation.  Cloning bugs is for when the same bug affects multiple products / versions.  Using a clone in this case means the initial bug report is filled with details about the wrong bug.

Comment 4 Gu Nini 2015-07-31 09:24:50 UTC
I have run some tests with the following software versions:

Host kernel: 3.10.0-300.el7.ppc64le
Guest kernel: 3.10.0-295.el7.ppc64/3.10.0-295.el7.ppc64le
qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-13.el7.ppc64le

The test data is as follows:

heartbeat_set   real_time_used_before_watchdog_fire
30              30
60              60
128             129
140             142
180             182
240             243
255             258
256             0
257             1
258             2
259             3
260             4
511             258
512             0
1024            0
1025            1
2045            256
2046            259

So the heartbeat of the i6300esb watchdog appears to wrap around every 256: the time before the watchdog fires is roughly the configured heartbeat modulo 256.
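As a rough cross-check of that reading, the measurements above can be compared with the heartbeat taken modulo 256. This is only a sketch: the names MEASUREMENTS, wrapped and deviations are made up for illustration, and both columns are assumed to be in seconds.

```python
# (heartbeat_set, real_time_used_before_watchdog_fire), copied from the
# table in this comment. Both columns are assumed to be in seconds.
MEASUREMENTS = [
    (30, 30), (60, 60), (128, 129), (140, 142), (180, 182), (240, 243),
    (255, 258), (256, 0), (257, 1), (258, 2), (259, 3), (260, 4),
    (511, 258), (512, 0), (1024, 0), (1025, 1), (2045, 256), (2046, 259),
]


def wrapped(heartbeat, modulus=256):
    """Heartbeat after wrapping at the suspected cycle length of 256."""
    return heartbeat % modulus


def deviations(measurements=MEASUREMENTS):
    """Observed time minus the wrapped heartbeat, for every measurement."""
    return [(hb, observed - wrapped(hb)) for hb, observed in measurements]


if __name__ == "__main__":
    for hb, delta in deviations():
        print(f"heartbeat={hb:5d} wrapped={wrapped(hb):4d} deviation={delta:+d}")
```

Every observed time stays within a few seconds of the wrapped heartbeat, which fits a wrap at 256 plus a small drift that grows with the timeout.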

Comment 5 Laurent Vivier 2015-08-03 14:04:18 UTC
The timer value is not computed correctly: the function computes the number of QEMU ticks to wait, whereas the timer function uses nanoseconds. The two values are generally the same (assuming the QEMU clock runs at 1 GHz), but it is easier, and avoids overflow, to multiply by 30 instead of by 1000000000/33000000. I've sent a patch upstream.
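The arithmetic behind this can be sketched as follows. This is not the QEMU code: the function names and the cycle count of 2**35 are hypothetical, chosen only to show how large the intermediate product becomes when multiplying by 1000000000 before dividing by 33000000. Python integers do not overflow, so the 64-bit limit is checked explicitly.

```python
NS_PER_SECOND = 1_000_000_000
PCI_CLOCK_HZ = 33_000_000
INT64_MAX = 2**63 - 1


def exact_ns(cycles):
    """Cycles to nanoseconds using the full 1000000000/33000000 ratio."""
    return cycles * NS_PER_SECOND // PCI_CLOCK_HZ


def approx_ns(cycles):
    """Cycles to nanoseconds using a fixed 30 ns per cycle."""
    return cycles * 30


def fits_in_int64(value):
    """True if value is representable as a signed 64-bit integer."""
    return -INT64_MAX - 1 <= value <= INT64_MAX


# Hypothetical cycle count, not taken from the bug report.
HYPOTHETICAL_CYCLES = 2**35

if __name__ == "__main__":
    print("exact  :", exact_ns(PCI_CLOCK_HZ), "ns per 33000000 cycles")
    print("approx :", approx_ns(PCI_CLOCK_HZ), "ns per 33000000 cycles")
    print("cycles * 10^9 fits in int64:",
          fits_in_int64(HYPOTHETICAL_CYCLES * NS_PER_SECOND))
    print("cycles * 30 fits in int64:",
          fits_in_int64(HYPOTHETICAL_CYCLES * 30))
```

Multiplying by 30 shortens the timeout by about 1% compared with the exact ratio, but keeps the intermediate value far below the 64-bit limit.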

Comment 6 David Gibson 2015-08-12 00:22:32 UTC
Since we have a fix ready to go, and we should be able to get it into qemu-kvm-rhev for the 7.2 timeframe, moving back to rhel 7.2.

Comment 7 Miroslav Rezanina 2015-08-12 09:03:56 UTC
Fix included in qemu-kvm-rhev-2.3.0-17.el7

Comment 9 Shaolong Hu 2015-08-20 09:14:53 UTC
Verified on qemu-kvm-rhev-2.3.0-17.el7.x86_64:

Boot the guest with "-device i6300esb,id=watchdog0 -watchdog-action pause".

Run "modprobe i6300esb heartbeat=2045" in the guest.

Run "cat /dev/watchdog" in the guest; after about 34 minutes the guest is paused.

Comment 12 errata-xmlrpc 2015-12-04 16:52:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.