Bug 1745646

Summary: [RT] sched/fair: Robustify CFS-bandwidth timer locking
Product: Red Hat Enterprise Linux 8 Reporter: Clark Williams <williams>
Component: kernel-rt Assignee: Clark Williams <williams>
kernel-rt sub component: Scheduler QA Contact: Qiao Zhao <qzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: bhu, jlelli, lgoncalv, mstowell, qzhao, tieli
Version: 8.1   
Target Milestone: rc   
Target Release: 8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-rt-4.18.0-80.5.rt16.1.el8 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-11-05 20:38:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1696402    

Description Clark Williams 2019-08-26 14:54:02 UTC
This commit fixes a system lockup that occurs without any system log output and affects all Red Hat RT kernels.

From the commit log:

    Traditionally hrtimer callbacks were run with IRQs disabled, but with
    the introduction of HRTIMER_MODE_SOFT it is possible they run from
    SoftIRQ context, which does _NOT_ have IRQs disabled.
    
    Allow for the CFS bandwidth timers (period_timer and slack_timer) to
    be ran from SoftIRQ context; this entails removing the assumption that
    IRQs are already disabled from the locking.
    
    While mainline doesn't strictly need this, -RT forces all timers not
    explicitly marked with MODE_HARD into MODE_SOFT and trips over this.
    And marking these timers as MODE_HARD doesn't make sense as they're
    not required for RT operation and can potentially be quite expensive.
    
    Reported-by: Tom Putzeys <tom.putzeys.com>
    Tested-by: Mike Galbraith <efault>
    Signed-off-by: Peter Zijlstra (Intel) <peterz>
    Cc: Linus Torvalds <torvalds>
    Cc: Peter Zijlstra <peterz>
    Cc: Sebastian Andrzej Siewior <bigeasy>
    Cc: Thomas Gleixner <tglx>
    Link: https://lkml.kernel.org/r/20190107125231.GE14122@hirez.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar <mingo>
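
To illustrate the locking assumption the commit log describes, here is a minimal userspace sketch. It is not the kernel code and not the actual patch, whose diff is not shown in this report. All names in it (irqs_enabled, lock_held, timer_cb_plain_lock, timer_cb_irqsave, and the context helpers) are hypothetical. A boolean stands in for the CPU's IRQ-enable state. The "plain" callback is only safe when its caller has already disabled IRQs, while the "irqsave" style callback saves the current state, disables IRQs itself, and restores the saved state afterwards, so it behaves the same in either context.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU's IRQ-enable state,
 * another for a lock that interrupt-context code may also take. */
static bool irqs_enabled = true;
static bool lock_held = false;

/* Helpers that set or report the simulated calling context. */
static void enter_hardirq_context(void) { irqs_enabled = false; }
static void enter_softirq_context(void) { irqs_enabled = true; }
static bool irq_state(void) { return irqs_enabled; }

/* Old pattern: take the lock and assume the caller disabled IRQs.
 * Returns false when that assumption is violated, i.e. an interrupt
 * could still fire on this CPU while the lock is held. */
static bool timer_cb_plain_lock(void)
{
    assert(!lock_held);
    lock_held = true;
    bool safe = !irqs_enabled;
    /* ... period/slack timer work would happen here ... */
    lock_held = false;
    return safe;
}

/* New pattern: save the IRQ state, disable IRQs, take the lock,
 * then restore the saved state on unlock. Safe in both contexts. */
static bool timer_cb_irqsave(void)
{
    bool flags = irqs_enabled;   /* save current state */
    irqs_enabled = false;        /* disable IRQs for the critical section */
    assert(!lock_held);
    lock_held = true;
    bool safe = !irqs_enabled;   /* always true at this point */
    /* ... period/slack timer work would happen here ... */
    lock_held = false;
    irqs_enabled = flags;        /* restore exactly what the caller had */
    return safe;
}
```

In this model, calling enter_softirq_context() before timer_cb_plain_lock() reproduces the unsafe case that -RT trips over, whereas timer_cb_irqsave() stays safe and leaves the simulated IRQ state as it found it.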

Comment 1 Clark Williams 2019-08-26 14:56:40 UTC
Upstream commit from PREEMPT_RT tree:  c0ad4aa4d8

Comment 2 Clark Williams 2019-08-26 17:33:22 UTC
The code from commit c0ad4aa4d8 got pulled in with other 5.2-rt backports, so this is mainly relevant for pulling back to zstreams

Comment 4 Clark Williams 2019-08-26 17:45:12 UTC
(In reply to Clark Williams from comment #2)
> The code from commit c0ad4aa4d8 got pulled in with other 5.2-rt backports,
> so this is mainly relevant for pulling back to zstreams

kernel-rt-4.18.0-80.5.rt16.1.el8 (2019-03-27) and newer contain this commit

Comment 9 errata-xmlrpc 2019-11-05 20:38:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3309