Bug 1423432

Summary: Consider update migrate_disable/enable on kernel-rt
Product: Red Hat Enterprise Linux 7 Reporter: Daniel Bristot de Oliveira <daolivei>
Component: kernel-rtAssignee: Daniel Bristot de Oliveira <daolivei>
kernel-rt sub component: Memory Management QA Contact: Qiao Zhao <qzhao>
Status: CLOSED NOTABUG Docs Contact:
Severity: medium    
Priority: medium CC: bhu, lgoncalv, williams
Version: 7.4   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1429676 (view as bug list) Environment:
Last Closed: 2019-05-15 14:29:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1428028    
Bug Blocks: 1410158, 1429676    
Attachments:
Description Flags
removed code, do we need it?!
none
[PATCH 1/5] sched: Move p->nr_cpus_allowed check to select_task_rq()
none
[PATCH 2/5] sched: fixup migrate disable (all tasks were bound to CPU0)
none
[PATCH 3/5] sched: fix ->nr_cpus_allowed = 1 transcription error during migrate_disable() cleanup
none
[PATCH 4/5] kernel: softirq: unlock with irqs on
none
[PATCH 5/5] kernel: migrate_disable() do fastpath in atomic & irqs-off
none
Subject: [PATCH 1/3] Revert "sched: fixup migrate disable (all tasks were
none
[PATCH 2/3] Revert "kernel: softirq: unlock with irqs on"
none
[PATCH 3/3] Revert "kernel: migrate_disable() do fastpath in atomic & irqs-off" none

Description Daniel Bristot de Oliveira 2017-02-17 10:18:13 UTC
Created attachment 1251865 [details]
removed code, do we need it?!

Description of problem:
Since v4.4-x-rt stable kernel the migrate_disable/enable operations were modified.

The code now is simpler than it used to be, but we need to be sure that the simplification does not depend on another part of the kernel, mainly the control of the task's cpumask.

Version-Release number of selected component (if applicable):
RHEL-RT kernel based on RHEL 7.3 based

How reproducible:
No bug on the current code.

Actual results:
It works.

Expected results:
It should work with the newer code.

Additional info:
The main problem so far is that a considerable part of the code was removed. See attachment.

Comment 2 Daniel Bristot de Oliveira 2017-02-23 18:47:00 UTC
The major part of the changes come from this patch:

Backporting it and some more.

--------------- %< --------------
commit 55c041cf3750894df6f304ad0489c777e27402e7
Author: Sebastian Andrzej Siewior <bigeasy>
Date:   Thu Jan 21 15:58:56 2016 +0100

    sched: fixup migrate disable (all tasks were bound to CPU0)
    
    Includes:
    |From: Thomas Gleixner <tglx>
    |
    |Subject: [PATCH] sched: use tsk_cpus_allowed() instead of accessing ->cpus_allowed
    |
    |Use the future-safe accessor for struct task_struct's.
    |
    |Signed-off-by: Thomas Gleixner <tglx>
    |Signed-off-by: Sebastian Andrzej Siewior <bigeasy>
    
    and
    
    |From: Thomas Gleixner <tglx>
    |Subject: [PATCH] sched: provide a tsk_nr_cpus_allowed() helper
    |
    |tsk_nr_cpus_allowed() is an accessor for task->nr_cpus_allowed which allows
    |us to change the representation of ->nr_cpus_allowed if required.
    |
    |Signed-off-by: Thomas Gleixner <tglx>
    |Signed-off-by: Sebastian Andrzej Siewior <bigeasy>
    
    and the folded changes from introduce_migrate_disable_cpu_light.patch.
    
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy>
--------------- >% ---------------

Comment 3 Daniel Bristot de Oliveira 2017-02-27 22:11:34 UTC
Created attachment 1258194 [details]
[PATCH 1/5] sched: Move p->nr_cpus_allowed check to select_task_rq()

        commit 6c1d9410f007a26d13173cf17204cfd965f49b83
        From: Wanpeng Li <wanpeng.li.com>
        Date: Wed, 5 Nov 2014 09:14:37 +0800
        Subject: sched: Move p->nr_cpus_allowed check to select_task_rq()

        Move the p->nr_cpus_allowed check into kernel/sched/core.c: select_task_rq().
        This change will make fair.c, rt.c, and deadline.c all start with the
        same logic.

        Suggested-and-Acked-by: Steven Rostedt <rostedt>
        Signed-off-by: Wanpeng Li <wanpeng.li.com>
        Signed-off-by: Peter Zijlstra (Intel) <peterz>
        Cc: "pang.xunlei" <pang.xunlei>
        Cc: Linus Torvalds <torvalds>
        Link: http://lkml.kernel.org/r/1415150077-59053-1-git-send-email-wanpeng.li@linux.intel.com
        Signed-off-by: Ingo Molnar <mingo>

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 4 Daniel Bristot de Oliveira 2017-02-27 22:13:34 UTC
Created attachment 1258195 [details]
[PATCH 2/5] sched: fixup migrate disable (all tasks were bound to CPU0)

        commit 55c041cf3750894df6f304ad0489c777e27402e7
        Author: Sebastian Andrzej Siewior <bigeasy>
        Date:   Thu Jan 21 15:58:56 2016 +0100
        Subject: sched: fixup migrate disable (all tasks were bound to CPU0)

        Includes:
        |From: Thomas Gleixner <tglx>
        |
        |Subject: [PATCH] sched: use tsk_cpus_allowed() instead of accessing ->cpus_allowed
        |
        |Use the future-safe accessor for struct task_struct's.
        |
        |Signed-off-by: Thomas Gleixner <tglx>
        |Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

        and

        |From: Thomas Gleixner <tglx>
        |Subject: [PATCH] sched: provide a tsk_nr_cpus_allowed() helper
        |
        |tsk_nr_cpus_allowed() is an accessor for task->nr_cpus_allowed which allows
        |us to change the representation of ->nr_cpus_allowed if required.
        |
        |Signed-off-by: Thomas Gleixner <tglx>
        |Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

        and the folded changes from introduce_migrate_disable_cpu_light.patch.

        Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 5 Daniel Bristot de Oliveira 2017-02-27 22:15:34 UTC
Created attachment 1258196 [details]
[PATCH 3/5] sched: fix ->nr_cpus_allowed = 1 transcription error during migrate_disable() cleanup

        commit b3b36ee61c780407737a321007a24cfe9c3668a7
        Author: Mike Galbraith <umgwanakikbuti>
        Date:   Sat Jan 23 08:11:53 2016 +0100
        Subject: sched: fix ->nr_cpus_allowed = 1 transcription error during migrate_disable() cleanup

        Setting p->nr_cpus_allowed accidentally wandered into migrate_disable()
        during the cleanup - kill it.

        Signed-off-by: Mike Galbraith <umgwanakikbuti>
        Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 6 Daniel Bristot de Oliveira 2017-02-27 22:16:58 UTC
Created attachment 1258197 [details]
[PATCH 4/5] kernel: softirq: unlock with irqs on

        commit f848b439cd995a2016422fa0c08daff9e8bb45f7
        Author: Sebastian Andrzej Siewior <bigeasy>
        Date:   Tue Feb 9 18:17:18 2016 +0100
        Subject: kernel: softirq: unlock with irqs on

        kernel: softirq: unlock with irqs on

        We unlock the lock while the interrupts are off. This isn't a problem
        now but will get because the migrate_disable() + enable are not
        symmetrical in regard to the status of interrupts.

        Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 7 Daniel Bristot de Oliveira 2017-02-27 22:19:04 UTC
Created attachment 1258198 [details]
[PATCH 5/5] kernel: migrate_disable() do fastpath in atomic &  irqs-off

        commit d09adb58ff92f87f3d2823f23c269c9ce241dd69
        Author: Sebastian Andrzej Siewior <bigeasy>
        Date:   Tue Feb 9 18:18:01 2016 +0100
        Subject: kernel: migrate_disable() do fastpath in atomic & irqs-off

        With interrupts off it makes no sense to do the long path since we can't
        leave the CPU anyway. Also we might end up in a recursion with lockdep.

        Signed-off-by: Sebastian Andrzej Siewior <bigeasy>

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 12 Daniel Bristot de Oliveira 2017-06-12 17:52:34 UTC
There is a problem in the updated version of the
migrate_disable()/enable() implementation backported in this patch.

The problem is the following:

In the new behavior, when a task is attached to the rt runqueue, it is
checked if it either can run in more than one CPU, or if it is with
migration disable. If either check is true, the rt_rq->rt_nr_migratory
counter is not increased. The counter increases otherwise.

When the task is detached, the same check is done. If either check is
true, the rt_rq->rt_nr_migratory counter is not decreased. The counter
decreases otherwise.

One important thing is that, migrate disable/enable does not touch this
counter for tasks attached to the -rt rq. So suppose the following chain
of events.

Assumptions:
Task A is the only runnable task in A              Task B runs in the CPU B
rt_nr_migratory is 0, so.                          Task B has RT priority
Task A can run on all CPUS.                        B is running
A is running                                       
Task A runs on CFS (non-rt)

        CPU A/TASK A                            CPU B/TASK B
A takes the rt mutex X                                .
A disables migration                                  .
           .                             B rt task tries to take the rt mutex X
           .                             As it is locked by A:
           .                               A inherits the rt priority of B
           .                               A is dequeued from CFS RQ of CPU A
           .                               A is enqueued in the RT RQ of CPU A
           .                               As migration is disabled:
           .                               rt_nr_migratory in A is not increased
           .
A enables migration
A releases the rt mutex {
  A returns to its original priority
  A ask to be dequeued from RT RQ {
    As migration is now enabled and it can run on all CPUS {
       rt_nr_migratory should be decreased
       As it is 0, rt_nr_migratory under flows.
    }
}

The rt_nr_migratory value must be at least 0, and that is the cause
of the BUG().

This variable is important because it notifies if there are more than one
runnable & migratable task in the runqueue. If there are more than one
tasks, the rt_rq is set as overloaded, and this tries to migrate some
tasks. This rule is important to keep the scheduler working conserving,
that is in a system with M CPUs, the M highest priority tasks should be
running.

Comment 13 Daniel Bristot de Oliveira 2017-06-12 17:57:37 UTC
Created attachment 1287084 [details]
Subject: [PATCH 1/3] Revert "sched: fixup migrate disable (all tasks were

This reverts commit c3d80fe97a84a8d137b93e61b2ccbca974022d84.

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 14 Daniel Bristot de Oliveira 2017-06-12 17:58:31 UTC
Created attachment 1287085 [details]
[PATCH 2/3] Revert "kernel: softirq: unlock with irqs on"

This reverts commit 93bc0e814e212229047d2b5c9690493ef8d9da2e.

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 15 Daniel Bristot de Oliveira 2017-06-12 17:59:31 UTC
Created attachment 1287086 [details]
[PATCH 3/3] Revert "kernel: migrate_disable() do fastpath in atomic & irqs-off"

This reverts commit 334af746c8f6bebc99a9687b144e14c250f5d6ec.

Signed-off-by: Daniel Bristot de Oliveira <bristot>

Comment 16 Daniel Bristot de Oliveira 2017-06-14 14:09:37 UTC
Returning to assigned, we have the bug reported in the BZ1441552. I checked and this bug is also present upstream, and I am working on it.

-- Daniel

Comment 20 Daniel Bristot de Oliveira 2017-08-09 17:04:14 UTC
The maintainer replied saying he will apply the patch to the next -rt release.

-- Daniel

Comment 24 Daniel Bristot de Oliveira 2019-05-15 14:29:59 UTC
We thought on backporting this as a proactive action, but since we created, no real problem occurred to justify this BZ to become a priority. So, let's close this BZ... Feel free to open it if any problem justifies it.