Bug 1183773 - clock_event_device:min_delta_ns can overflow and can never go down
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Prarit Bhargava
QA Contact: Cui Chun
Depends On:
Blocks: 1300182
Reported: 2015-01-19 13:39 EST by Roman Kagan
Modified: 2016-01-20 03:09 EST

See Also:
Fixed In Version: kernel-2.6.32-542.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-22 04:38:04 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
RHEL PATCH 1/2 (4.74 KB, patch), 2015-01-26 07:23 EST, Prarit Bhargava
RHEL PATCH 2/2 (12.58 KB, patch), 2015-01-26 07:23 EST, Prarit Bhargava


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2015:1272
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: kernel security, bug fix, and enhancement update
Last Updated: 2015-07-22 07:56:25 EDT

Description Roman Kagan 2015-01-19 13:39:33 EST
Description of problem:

As can be seen in kernel/time/tick-oneshot.c:tick_dev_program_event(), clock_event_device:min_delta_ns, which represents the granularity of the clockevent timer increments, can grow until it overflows and is never reduced.

One observable consequence is that, if min_delta_ns ever overflows, the loop in this function never terminates, because

      expires = ktime_add_ns(now, dev->min_delta_ns);

yields an expires value that is either negative or less than now. In either case clockevents_program_event() returns an error, which causes the loop to start over.
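
To make the failure mode concrete, here is a minimal userspace sketch of the arithmetic, not the kernel code itself. It assumes the pre-fix growth step is "5000 ns if zero, otherwise grow by half", and it models ktime values as signed 64-bit nanoseconds. The function names (grow_min_delta, expiry_is_programmable, steps_until_unprogrammable) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of the pre-fix growth step: start at 5000 ns, then x1.5. */
static uint64_t grow_min_delta(uint64_t min_delta_ns)
{
    if (min_delta_ns == 0)
        return 5000;
    /* Unsigned arithmetic: wraps modulo 2^64, nothing ever shrinks it. */
    return min_delta_ns + (min_delta_ns >> 1);
}

/*
 * Model of "expires = ktime_add_ns(now, min_delta_ns)" followed by the two
 * checks described above. The sum is computed as unsigned so the wraparound
 * is well defined: a sum with bit 63 set corresponds to a negative signed
 * expiry, and a sum that wraps past 2^64 lands at or before now.
 */
static bool expiry_is_programmable(int64_t now_ns, uint64_t min_delta_ns)
{
    assert(now_ns >= 0);

    uint64_t sum = (uint64_t)now_ns + min_delta_ns;
    bool negative = sum > (uint64_t)INT64_MAX;
    bool not_in_future = sum <= (uint64_t)now_ns;

    return !negative && !not_in_future;
}

/* Count growth steps until every retry in the loop is bound to fail. */
static int steps_until_unprogrammable(int64_t now_ns, uint64_t start_ns)
{
    int steps = 0;
    uint64_t delta = start_ns;

    while (expiry_is_programmable(now_ns, delta)) {
        delta = grow_min_delta(delta);
        steps++;
    }
    return steps;
}
```

Starting from 5000 ns, steps_until_unprogrammable() returns after fewer than a hundred growth steps. Once that point is reached, growing the value further cannot make the expiry valid again, which matches the endless retry loop described above.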


Version-Release number of selected component (if applicable):

2.6.32-504.3.3.el6.x86_64


How reproducible:

The endless loop on one of the CPUs was seen once in a virtual machine in Parallels Server.  The exact details are still being investigated.


Additional info:

These problems have been addressed by the following commits in the mainline kernel:

commit 80a05b9ffa7dc13f6693902dd8999a2b61a3a0d7
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Fri Mar 12 17:34:14 2010 +0100

    clockevents: Sanitize min_delta_ns adjustment and prevent overflows
    
    The current logic which handles clock events programming failures can
    increase min_delta_ns unlimited and even can cause overflows.
    
    Sanitize it by:
     - prevent zero increase when min_delta_ns == 1
     - limiting min_delta_ns to a jiffie
     - bail out if the jiffie limit is hit
     - add retries stats for /proc/timer_list so we can gather data
    
    Reported-by: Uwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
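
The sanitizing steps listed in this commit message can be sketched as a small userspace model. This is an illustration under stated assumptions, not the code from the commit: the struct, the function name model_increase_min_delta, the 5000 ns floor, and HZ = 1000 (so one jiffy is 1,000,000 ns) are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: HZ = 1000, so the jiffy limit is 1,000,000 ns. */
#define MODEL_HZ              1000ULL
#define MODEL_NSEC_PER_SEC    1000000000ULL
#define MODEL_MIN_DELTA_LIMIT (MODEL_NSEC_PER_SEC / MODEL_HZ)

/* Hypothetical stand-in for the relevant clock_event_device fields. */
struct model_clock_event_device {
    uint64_t min_delta_ns;
    unsigned long retries;   /* stat of the kind exposed via /proc/timer_list */
};

/* Returns 0 after a bounded increase, -ETIME once the jiffy limit is hit. */
static int model_increase_min_delta(struct model_clock_event_device *dev)
{
    assert(dev != NULL);

    /* Bail out if the jiffy limit has already been reached. */
    if (dev->min_delta_ns >= MODEL_MIN_DELTA_LIMIT)
        return -ETIME;

    /*
     * Prevent a zero increase: for min_delta_ns == 1 the term
     * (min_delta_ns >> 1) is 0, so small values jump to a floor instead.
     */
    if (dev->min_delta_ns < 5000)
        dev->min_delta_ns = 5000;
    else
        dev->min_delta_ns += dev->min_delta_ns >> 1;

    /* Limit min_delta_ns to one jiffy so it can never overflow. */
    if (dev->min_delta_ns > MODEL_MIN_DELTA_LIMIT)
        dev->min_delta_ns = MODEL_MIN_DELTA_LIMIT;

    dev->retries++;
    return 0;
}
```

A caller that keeps retrying would stop as soon as this function returns -ETIME, so the retry loop is bounded by the number of steps needed to reach one jiffy.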


commit d1748302f70be7469809809283fe164156a34231
Author: Martin Schwidefsky <schwidefsky@de.ibm.com>
Date:   Tue Aug 23 15:29:42 2011 +0200

    clockevents: Make minimum delay adjustments configurable
    
    The automatic increase of the min_delta_ns of a clockevents device
    should be done in the clockevents code as the minimum delay is an
    attribute of the clockevents device.
    
    In addition not all architectures want the automatic adjustment, on a
    massively virtualized system it can happen that the programming of a
    clock event fails several times in a row because the virtual cpu has
    been rescheduled quickly enough. In that case the minimum delay will
    erroneously be increased with no way back. The new config symbol
    GENERIC_CLOCKEVENTS_MIN_ADJUST is used to enable the automatic
    adjustment. The config option is selected only for x86.
    
    Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    Cc: john stultz <johnstul@us.ibm.com>
    Link: http://lkml.kernel.org/r/20110823133142.494157493@de.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
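
The effect of making the adjustment optional can be sketched as follows. This is a simplified userspace model, not the code from the commit: the compile-time symbol GENERIC_CLOCKEVENTS_MIN_ADJUST is represented by a runtime flag (auto_adjust) so that both behaviours can be exercised in one program, the retry count before each increase is omitted, and the device callbacks and the one-jiffy limit at HZ = 1000 are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one jiffy at HZ = 1000. */
#define MODEL_MIN_DELTA_LIMIT 1000000ULL

/* Hypothetical stand-in for a device's set_next_event() callback. */
typedef int (*model_set_next_event_fn)(uint64_t delta_ns);

/* Example device that cannot be programmed with less than 20 us. */
static int model_slow_device(uint64_t delta_ns)
{
    return delta_ns < 20000 ? -ETIME : 0;
}

/* Example device on which programming always fails. */
static int model_broken_device(uint64_t delta_ns)
{
    (void)delta_ns;
    return -ETIME;
}

/*
 * Program the device with its minimum delay. With auto_adjust set (standing
 * in for the config symbol being enabled), failures grow min_delta_ns up to
 * the jiffy limit. Without it, the failure is reported and min_delta_ns is
 * left untouched, so a briefly descheduled virtual CPU cannot inflate it.
 */
static int model_program_min_delta(uint64_t *min_delta_ns,
                                   model_set_next_event_fn set_next_event,
                                   bool auto_adjust)
{
    assert(min_delta_ns != NULL && set_next_event != NULL);

    for (;;) {
        if (set_next_event(*min_delta_ns) == 0)
            return 0;
        if (!auto_adjust)
            return -ETIME;
        if (*min_delta_ns >= MODEL_MIN_DELTA_LIMIT)
            return -ETIME;   /* bounded: give up at the limit */

        if (*min_delta_ns < 5000)
            *min_delta_ns = 5000;
        else
            *min_delta_ns += *min_delta_ns >> 1;
        if (*min_delta_ns > MODEL_MIN_DELTA_LIMIT)
            *min_delta_ns = MODEL_MIN_DELTA_LIMIT;
    }
}
```

With auto_adjust disabled, a failed attempt leaves min_delta_ns exactly as it was. With it enabled, the value grows only as far as the limit, so even a device that never accepts the event makes the function return instead of looping forever.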
Comment 1 Roman Kagan 2015-01-19 13:43:02 EST
Could you please make this bug public? Thanks!
Comment 3 Prarit Bhargava 2015-01-21 08:05:40 EST
Hi Roman, I agree that these changes need to be made.  Do you have a reproducer that leads to a bug?

P.
Comment 4 Roman Kagan 2015-01-21 08:37:46 EST
Unfortunately, no.

We've seen several (very few) sporadic reproductions during routine automated testing of Parallels Server, but have been unable yet to identify the exact scenario of how min_delta_ns can grow up to those pathological values initially.
Comment 5 Prarit Bhargava 2015-01-23 07:41:40 EST
(In reply to Roman Kagan from comment #4)
> Unfortunately, no.
> 
> We've seen several (very few) sporadic reproductions during routine
> automated testing of Parallels Server, but have been unable yet to identify
> the exact scenario of how min_delta_ns can grow up to those pathological
> values initially.

Okay, I've run this in our testing suite and don't see any issues so I'm going to do some additional testing over the weekend.

P.
Comment 6 Prarit Bhargava 2015-01-26 07:23:29 EST
Created attachment 984199
RHEL PATCH 1/2
Comment 7 Prarit Bhargava 2015-01-26 07:23:30 EST
Created attachment 984200
RHEL PATCH 2/2
Comment 9 RHEL Product and Program Management 2015-01-26 07:29:54 EST
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 10 Rafael Aquini 2015-03-07 00:37:37 EST
Patch(es) available on kernel-2.6.32-542.el6
Comment 16 errata-xmlrpc 2015-07-22 04:38:04 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html
