Bug 1699438

Summary: add_timer_on on remote CPUs not firing
Product: Red Hat Enterprise Linux 7
Reporter: Marcelo Tosatti <mtosatti>
Component: kernel-rt
Assignee: Luis Claudio R. Goncalves <lgoncalv>
kernel-rt sub component: Other
QA Contact: Qiao Zhao <qzhao>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: bhu, fiezzi, jlelli, mstowell, qzhao, williams
Version: 7.8
Target Milestone: rc
Target Release: 7.8
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: kernel-rt-3.10.0-1063.rt56.1023.el7
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-31 19:48:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1715542

Description Marcelo Tosatti 2019-04-12 17:25:24 UTC
The following patch 

commit 11cd96866fe5b5d02f53fe2fcf7e3d554a4fe5da
Author: Daniel Bristot de Oliveira <bristot>
Date:   Tue May 26 12:07:54 2015 -0400

    tick: sched: Remove hrtimer_active() checks

breaks add_timer_on on remote CPUs, because it stops
tick_nohz_activate from calling timers_update_migration,
which in turn leaves base->nohz_active set to false.

The fix is to revert this hunk:

    hrtimer_forward(&ts->sched_timer, now, tick_period);
    hrtimer_start_expires(&ts->sched_timer,
                          HRTIMER_MODE_ABS_PINNED);
-   tick_nohz_activate(ts, NOHZ_MODE_HIGHRES);
+
+#ifdef CONFIG_NO_HZ_COMMON
+   if (tick_nohz_enabled) {
+           ts->nohz_mode = NOHZ_MODE_HIGHRES;
+           tick_nohz_active = 1;
+   }
+#endif

Sending patch to rt list shortly.
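
For readers following the reasoning above, here is a minimal user-space sketch of the failure mode. It is not kernel code. Every toy_* name is hypothetical, nohz_mode is left out, and the sketch assumes that add_timer_on only kicks the target CPU when that CPU's base has nohz_active set; the report implies this but does not show the code. The "open-coded" function mirrors the added lines in the hunk, and the "helper" function mirrors the restored tick_nohz_activate call.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy stand-in for a per-CPU timer base (hypothetical, not the kernel struct). */
struct toy_timer_base {
    bool nohz_active;   /* the flag the report says stays false */
    int pending;        /* timers queued on this CPU */
    int kicks;          /* times this CPU was woken to re-evaluate its timers */
};

struct toy_state {
    struct toy_timer_base base[TOY_NR_CPUS];
    bool tick_nohz_enabled;
    bool tick_nohz_active;
};

/* Models timers_update_migration: propagate the NOHZ state to every base. */
static void toy_timers_update_migration(struct toy_state *s)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        s->base[cpu].nohz_active = true;
}

/* Models the helper call that the revert restores. */
static void toy_activate_via_helper(struct toy_state *s)
{
    if (!s->tick_nohz_enabled)
        return;
    if (!s->tick_nohz_active) {
        s->tick_nohz_active = true;
        toy_timers_update_migration(s);   /* one update is enough */
    }
}

/* Models the open-coded lines from the hunk: the global flag is set,
 * but the per-CPU bases are never updated. */
static void toy_activate_open_coded(struct toy_state *s)
{
    if (s->tick_nohz_enabled)
        s->tick_nohz_active = true;
}

/* Models add_timer_on for a remote CPU. Returns true if the target was kicked.
 * Assumption: the kick is skipped when base->nohz_active is false. */
static bool toy_add_timer_on(struct toy_state *s, int cpu)
{
    struct toy_timer_base *base = &s->base[cpu];

    base->pending++;
    if (!base->nohz_active)
        return false;   /* an idle remote CPU is never told about the timer */
    base->kicks++;
    return true;
}
```

In this model, activating through toy_activate_open_coded leaves every base with nohz_active == false, so toy_add_timer_on queues the timer without kicking the remote CPU. Activating through toy_activate_via_helper sets the per-CPU flag, and the same call kicks the target.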

Comment 1 Juri Lelli 2019-04-15 07:16:23 UTC
Hi Marcelo,

I couldn't find the patch you mention in the 8.x trees;
only 7.7 seems to have it (I didn't check other 7.x releases).

8.x, however, has the corresponding upstream patch
afc08b15cc2a ("tick: sched: Remove hrtimer_active() checks", 2015-04-14).

What am I missing?

Thanks!

Comment 4 Marcelo Tosatti 2019-08-27 12:58:01 UTC
(In reply to Juri Lelli from comment #1)
> Hi Marcelo,
> 
> I couldn't find the patch you mention in the 8.x trees,
> only 7.7 (I didn't check others 7.x) seems to have it.
> 
> 8.x however have the upstream corresponding patch
> afc08b15cc2a ("tick: sched: Remove hrtimer_active() checks", 2015-04-14).
> 
> What am I missing?
> 
> Thanks!

This issue is a mismerge (only present in the RHEL-RT 7.x tree).

Comment 5 Marcelo Tosatti 2019-08-27 13:01:57 UTC
Fix posted and integrated into the RHEL-7 RT kernel tree:

commit 2ac7a754416a079fb54188ca975c68f117fe0cdc
Author: Marcelo Tosatti <mtosatti>
Date:   Fri Jun 28 09:14:43 2019 -0300

    Revert "tick: sched: Remove hrtimer_active() checks"
    
    The following patch
    
    commit 11cd96866fe5b5d02f53fe2fcf7e3d554a4fe5da
    Author: Daniel Bristot de Oliveira <bristot>
    Date:   Tue May 26 12:07:54 2015 -0400
    
        tick: sched: Remove hrtimer_active() checks
    
    breaks add_timer_on on remote CPUs, because it stops
    tick_nohz_activate from calling timers_update_migration,
    which in turn results in base->nohz_active to remain false.
    
    BZ: 1699438
    BZ: 1690543
    BZ: 1550584
    
    Acked-by: Daniel Bristot de Oliveira <bristot>
    Acked-by: Juri Lelli <juri.lelli>
    Acked-by: Luis Claudio R. Goncalves <lgoncalv>
    Signed-off-by: Marcelo Tosatti <mtosatti>

Comment 6 Beth Uptagrafft 2019-08-27 15:20:15 UTC
(In reply to Marcelo Tosatti from comment #4)
> (In reply to Juri Lelli from comment #1)
> > Hi Marcelo,
> > 
> > I couldn't find the patch you mention in the 8.x trees,
> > only 7.7 (I didn't check others 7.x) seems to have it.
> > 
> > 8.x however have the upstream corresponding patch
> > afc08b15cc2a ("tick: sched: Remove hrtimer_active() checks", 2015-04-14).
> > 
> > What am I missing?
> > 
> > Thanks!
> 
> This issue is a mismerge (only present in RHEL-RT 7.x tree).

This is a RHEL8 BZ, so if this issue only affects RHEL7, then I assume we can close this issue?

Comment 14 errata-xmlrpc 2020-03-31 19:48:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1070