Bug 1145751
Summary: | kvm_clock lacks protection against tsc going backwards
---|---
Product: | Red Hat Enterprise Linux 6
Component: | kernel
kernel sub component: | Other
Version: | 6.5
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Reporter: | Roman Kagan <rvkagan>
Assignee: | Prarit Bhargava <prarit>
QA Contact: | Cui Chun <ccui>
CC: | ccui, vvs
Target Milestone: | rc
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | kernel-2.6.32-532.el6
Doc Type: | Bug Fix
Type: | Bug
Clones: | 1148398
Bug Blocks: | 1148398, 1300182
Last Closed: | 2015-07-22 08:21:07 UTC

Description
Roman Kagan
2014-09-23 16:08:19 UTC
Relevant commits are:

commit 09ec54429c6d10f87d1f084de53ae2c1c3a81108
Author: Thomas Gleixner <tglx>
Date: Wed Jul 16 21:05:12 2014 +0000

    clocksource: Move cycle_last validation to core code

    The only user of the cycle_last validation is the x86 TSC. In order to
    provide NMI safe accessor functions for clock monotonic and
    monotonic_raw we need to do that in the core.

    We can't do the TSC specific

        if (now < cycle_last)
            now = cycle_last;

    for the other wrapping around clocksources, but TSC has
    CLOCKSOURCE_MASK(64) which actually does not mask out anything so if
    now is less than cycle_last the subtraction will give a negative
    result. So we can check for that in clocksource_delta() and return 0
    for that case.

    Implement and enable it for x86

    Signed-off-by: Thomas Gleixner <tglx>
    Signed-off-by: John Stultz <john.stultz>

commit 3a97837784acbf9fed699fc04d1799b0eb742fdf
Author: Thomas Gleixner <tglx>
Date: Wed Jul 16 21:05:10 2014 +0000

    clocksource: Make delta calculation a function

    We want to move the TSC sanity check into core code to make NMI safe
    accessors to clock monotonic[_raw] possible. For this we need to
    sanity check the delta calculation. Create a helper function and
    convert all sites to use it.

    [ Build fix from jstultz ]

    Signed-off-by: Thomas Gleixner <tglx>
    Signed-off-by: John Stultz <john.stultz>

For the record, the problem was observed on systems with AMD erratum #759 (see http://support.amd.com/TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf, p. 90):

759 One Core May Observe a Time Stamp Counter Skew
==================================================

Description
-----------

During a P-state change or following a C-state change, the processor core may synchronize an internal copy of the time stamp counter (TSC) incorrectly. The processor may then observe TSC values (e.g., RDTSC, RDTSCP and RDMSR 0000_0010h instructions) or MPERF (MSR0000_00E7) values that do not account for the time spent performing this last P-state or C-state change. This error is normally temporary in nature, in that it may be resolved after the next P-state or C-state change.

Potential Effect on System
--------------------------

System software or software with multiple threads may observe that one thread or processor core provides TSC values that are behind all of the other threads or processor cores. While a single thread operating on a single core can not observe successively stored TSC values that incorrectly decrement, it is possible that a single thread may be dispatched on one core, where the software observes a TSC, and is then dispatched by the operating system on another core that has encountered the conditions of the erratum. In this sequence of events, the thread may observe a TSC that appears to decrement. In addition, software may calculate a higher effective frequency (APERF, MSR0000_00E8, divided by MPERF).

Suggested Workaround
--------------------

Contact your AMD representative for information on a BIOS update.

Fix Planned
-----------

Yes

According to https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1200533, the latest amd-ucode has a fix for this erratum; however, the bug was seen on systems with that revision of ucode, too.

[I didn't mean the bug to be private; I'd appreciate if it could be made public]

(In reply to Roman Kagan from comment #4)
> [I didn't mean the bug to be private; I'd appreciate if it could be made
> public]

np.

P.

Created attachment 941540 [details]
RHEL PATCH 1/6
Created attachment 941541 [details]
RHEL PATCH 2/6
Created attachment 941542 [details]
RHEL PATCH 3/6
Created attachment 941543 [details]
RHEL PATCH 4/6
Created attachment 941544 [details]
RHEL PATCH 5/6
Created attachment 941545 [details]
RHEL PATCH 6/6
Sorry everyone, I made the changes for RHEL7 first and accidentally used this BZ. I'm going to clone this to RHEL7 and POST for RHEL7 from there.

P.

Created attachment 943020 [details]
RHEL PATCH 1/2
Created attachment 943021 [details]
RHEL PATCH 2/2
Created attachment 943022 [details]
RHEL PATCH 3/2
Created attachment 943023 [details]
RHEL PATCH 4/2
Created attachment 943024 [details]
RHEL PATCH 5/2
Created attachment 943025 [details]
RHEL PATCH 6/2
Sorry everyone, I mucked up this BZ pretty badly and am cleaning it up. I'll push 6.7 patches shortly.

P.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

Created attachment 975178 [details]
RHEL PATCH 1/2
Created attachment 975179 [details]
RHEL PATCH 2/2
Patch(es) available on kernel-2.6.32-532.el6

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html
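
For readers following the commit messages quoted in the description, the check they describe can be sketched in plain C roughly as follows. This is a minimal userspace illustration, not the RHEL patch itself: the names `clocksource_delta_sketch` and `accumulate_cycles` are hypothetical, `cycle_t` is a local stand-in typedef, and the code assumes a clocksource with a full 64-bit mask (as the commit message says of the TSC), so that a negative signed difference can only mean the counter went backwards.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's cycle counter type (assumed 64-bit). */
typedef uint64_t cycle_t;

/*
 * Delta between two counter readings, clamped to 0 when the counter
 * appears to have gone backwards. With a full 64-bit mask the masking
 * step changes nothing, so "now < last" shows up as a negative value
 * once the unsigned difference is reinterpreted as signed.
 */
static inline cycle_t clocksource_delta_sketch(cycle_t now, cycle_t last,
                                               cycle_t mask)
{
    cycle_t ret = (now - last) & mask;

    return (int64_t)ret > 0 ? ret : 0;
}

/*
 * Hypothetical helper: sum the elapsed cycles over a series of readings,
 * as a thread migrating between cores might collect them. A reading that
 * lags behind the last accepted one contributes 0 and does not move the
 * reference point backwards.
 */
static inline cycle_t accumulate_cycles(const cycle_t *readings, size_t n)
{
    cycle_t total = 0;
    cycle_t last;
    size_t i;

    if (n == 0)
        return 0;

    last = readings[0];
    for (i = 1; i < n; i++) {
        cycle_t delta = clocksource_delta_sketch(readings[i], last,
                                                 UINT64_MAX);

        total += delta;
        if (delta)
            last = readings[i];
    }
    return total;
}
```

With readings such as 1000, 1500, 1400, 1600 (the third taken on a core whose TSC lags, as in the erratum scenario), the lagging sample yields a delta of 0 instead of a huge unsigned value, and time continues from the last good reading.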