Created attachment 401370 [details]
Skip the timer_irq_works check when on VMware.

Description of problem:
We are hitting the IO-APIC + timer bug (a.k.a. "pester mingo" on mainline) with the RHEL 5.4 32-bit kernel at bootup. The problem is specific to virtualization: in some cases the hypervisor can be de-scheduled while the kernel is in the timer_irq_works call, and as a result the TSC and jiffies values can go out of sync. This is more prominent on VMware because we enable the LazyTimerEmulation mode for 32-bit kernels on the VMware platform for the 5.4 kernel (as part of PR 463573).

Since this problem is VMware specific, I have added a condition to skip the timer_irq_works call when running on the VMware platform. Please note that this problem is not prominent on the mainline kernel: the PIT is not used to drive timekeeping interrupts on those kernels, so we don't have to enter the LazyTimerEmulation mode for correct timekeeping there.

I am attaching a patch which skips the check when on VMware.
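For readers who want to see the shape of the change, here is a minimal userspace sketch of the idea, not the attached patch itself. It assumes the check amounts to "wait roughly ten ticks and expect more than four jiffies to have elapsed". The names boot_env, on_vmware, ticks_seen, timer_irq_works_sim and timer_check are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical model of what the guest observes at boot.
 * None of these names come from the kernel or from the attached patch.
 */
struct boot_env {
    bool on_vmware;           /* assumed platform-detection result */
    unsigned long ticks_seen; /* jiffies that advanced during a ~10 tick delay */
};

/*
 * Models the timer sanity check: after waiting about ten ticks,
 * expect that more than four timer interrupts were delivered.
 * If the guest was starved of ticks during the window, this fails.
 */
int timer_irq_works_sim(const struct boot_env *env)
{
    return env->ticks_seen > 4;
}

/*
 * Models the proposed behaviour: on VMware the check is skipped and
 * the timer is assumed to work; elsewhere the check runs as usual.
 */
int timer_check(const struct boot_env *env)
{
    if (env->on_vmware)
        return 1; /* skip the check entirely */
    return timer_irq_works_sim(env);
}
```

In this model a VMware guest that saw no ticks during the delay window still passes timer_check, while the same observation on any other platform fails it, which is the behaviour the description above asks for.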
The patch is fine, it is definitely the easiest fix.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Is this patch committed for any update release yet? Please let me know if you are waiting on any info from me. Thanks.
in kernel-2.6.18-208.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Thanks for including the fix.
Hi Alok, Could you please provide some guidance on how to reproduce the bug (setup notes or a known system/environment), or will you test this bug yourself? Thanks a lot!
Hi Zhang, Apologies for the delay. We were hitting this bug while running boot/halt tests. I don't remember the frequency now, but it was not 100%. In any case, I got hold of kernel-2.6.18-236.el5.src.rpm and verified that the fix is included. You can close this as verified now. Thanks.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0017.html