Bug 1247893
Summary: | qemu's i6300esb watchdog does not fire on time with large heartbeat like 2046 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Shaolong Hu <shu> |
Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.2 | CC: | dgibson, juzhang, knoel, lvivier, michen, mrezanin, ngu, rjones, virt-bugs, virt-maint, xuhan |
Target Milestone: | rc | ||
Target Release: | 7.2 | ||
Hardware: | All | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-rhev-2.3.0-17.el7 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | 1203914 | Environment: | |
Last Closed: | 2015-12-04 16:52:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | 1203914 | ||
Bug Blocks: |
Comment 2
Shaolong Hu
2015-07-29 07:59:00 UTC
Please don't use clone in this situation. Cloning bugs is for when the same bug affects multiple products or versions. Using a clone in this case means the initial bug report is filled with details about the wrong bug.

I have run some tests on the following software versions:
Host kernel: 3.10.0-300.el7.ppc64le
Guest kernel: 3.10.0-295.el7.ppc64/3.10.0-295.el7.ppc64le
Qemu-kvm-rhev: qemu-kvm-rhev-2.3.0-13.el7.ppc64le

Test data:

heartbeat_set | real_time_used_before_watchdog_fire
---|---
30 | 30
60 | 60
128 | 129
140 | 142
180 | 182
240 | 243
255 | 258
256 | 0
257 | 1
258 | 2
259 | 3
260 | 4
511 | 258
512 | 0
1024 | 0
1025 | 1
2045 | 256
2046 | 259

So the heartbeat of the i6300esb watchdog seems to treat 256 as a count cycle unit: the measured time wraps back to 0 at each multiple of 256.

The timer value is not correctly computed: the function computes the number of QEMU ticks to wait, whereas the timer function uses nanoseconds. The two values are generally the same (assuming the QEMU clock runs at 1 GHz), but it is easier, and avoids overflow, to multiply by 30 instead of by 1000000000/33000000. I've sent a patch upstream.

Since we have a fix ready to go, and we should be able to get it into qemu-kvm-rhev in the 7.2 timeframe, moving back to RHEL 7.2.

Fix included in qemu-kvm-rhev-2.3.0-17.el7

Verified on qemu-kvm-rhev-2.3.0-17.el7.x86_64:
Boot the guest with "-device i6300esb,id=watchdog0 -watchdog-action pause".
Run "modprobe i6300esb heartbeat=2045" in the guest.
Run "cat /dev/watchdog" in the guest; after about 34 minutes the guest got paused.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html
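
For readers who want to see why the analysis above prefers multiplying by 30, here is a rough sketch of the two conversions from 33 MHz PCI clock ticks to nanoseconds. It is not the QEMU code: the function names, the 64-bit signed limit, and the example tick count (a 20-bit preload shifted left by 15) are assumptions made only for illustration. Python integers do not overflow, so the sketch just reports whether the intermediate product would exceed a signed 64-bit range.

```python
# Illustrative sketch only; names and example values are assumptions,
# not taken from the QEMU source.
PCI_CLOCK_HZ = 33_000_000      # divisor quoted in the comment above
NS_PER_SEC = 1_000_000_000     # multiplier quoted in the comment above
INT64_MAX = 2**63 - 1          # assumed width of the timeout variable


def ns_scaled(pci_ticks):
    """Multiply by 1000000000 first, then divide by 33000000.

    Returns (nanoseconds, would_overflow_int64).
    """
    intermediate = pci_ticks * NS_PER_SEC
    return intermediate // PCI_CLOCK_HZ, intermediate > INT64_MAX


def ns_times_30(pci_ticks):
    """Treat one PCI tick as 30 ns, as the fix proposes.

    Returns (nanoseconds, would_overflow_int64).
    """
    intermediate = pci_ticks * 30
    return intermediate, intermediate > INT64_MAX


if __name__ == "__main__":
    # Hypothetical large tick count: a 20-bit preload shifted left by 15.
    big = (2**20 - 1) << 15
    for ticks in (PCI_CLOCK_HZ, big):
        print(ticks, ns_scaled(ticks), ns_times_30(ticks))
```

With these assumed values, the multiply-first form needs an intermediate product larger than a signed 64-bit integer can hold for the large tick count, while the multiply-by-30 form stays far below that limit. The price is a small rounding difference, since 1000000000/33000000 is about 30.3 rather than exactly 30.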