Bug 1340922

Summary: backport of the latest "printk: Make rt aware" from PREEMPT-RT
Product: Red Hat Enterprise Linux 7
Reporter: Daniel Bristot de Oliveira <daolivei>
Component: kernel-rt
Assignee: Clark Williams <williams>
kernel-rt sub component: Other
QA Contact: Jiri Kastner <jkastner>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: medium
CC: bhu, jshortt
Version: 7.3
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-03 19:52:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1340919
Bug Blocks: 1274397
Attachments:
Description: Patch; Flags: none
Description: change preempt_{disable,enable} to migrate_{disable,enable} to prevent latency spike; Flags: none

Description Daniel Bristot de Oliveira 2016-05-30 17:32:27 UTC
Description of problem:

BZ1267425 reports a hotplug bug in kernel-rt. That BZ was closed because we applied a workaround that avoids the deadlock that breaks hotplug. We still need the real fix for this problem, though.

The real fix is already in place in the mainline -rt kernel, in the "printk: Make rt aware" patch. However, applying that -rt-specific patch to our kernel-rt depends on a non-rt-specific patch. The non-rt-specific patch we need is:

5874af2 printk: enable interrupts before calling console_trylock_for_printk()

The backport of this patch was requested in BZ1340919. Once it is done, we can backport a current version (e.g., from v4.4-rt) of the PREEMPT-RT patch:

"printk: Make rt aware"

Version-Release number of selected component (if applicable):
kernel-rt-3.10.0-327.18.2.rt56.223.el7_2 

How reproducible:
The workaround applied in BZ1267425 suppresses the BUG, so it is not reproducible in the current -rt kernel.

Comment 2 Daniel Bristot de Oliveira 2016-06-30 03:53:41 UTC
Clark,

When the patches from BZ1340919 land in the RHEL source, we will need to
revert the following patches on -rt (I think it is better to do it before
merging BZ1340919's patches):

The old printk-rt-aware.patch from -rt:
-------------------------%<------------------------------------
commit 733052ff18c6ce3573fd5da0643de1c768db94da
Author: Thomas Gleixner <tglx>
Date:   Wed Sep 19 14:50:37 2012 +0200

    printk-rt-aware.patch
    
    Signed-off-by: Thomas Gleixner <tglx>
------------------------->%------------------------------------

The workaround patch:
-------------------------%<------------------------------------
commit b9f02ada6804c62aff768eeb7453e95238f88f6d
Author: Daniel Bristot de Oliveira <bristot>
Date:   Wed Apr 6 12:41:03 2016 -0300

    printk: Prevent console freeze due to out-of-order deadlock [1269647]
    
    We have deadlock because the following two locks were taken
    in a different order:
    
              hotplug thread                         printk thread
    t1:   hotplug lock (acquired)
    t2:                                  console lock (acquired)
    t3:  console lock (contention)
    t4:                                  hotplug lock (contention)
                             DEADLOCK!
    
    So, to avoid the deadlock we have to grab the hotplug lock prior
    to the console lock. The migrate_disable() is taking the hot
    plug lock in both cases, so this patch just disable migration
    before grab the console lock.
    
    I test this patch in the past, but I did not have time to test
    it now, so please test it :-)
    
    Finally, it may be a good idea to disable/enable migration prior
    to all up and down of the console semaphore, but I did not test
    it too.
    
    Signed-off-by: Daniel Bristot de Oliveira <bristot>
    Signed-off-by: Clark Williams <williams>
------------------------->%------------------------------------

After the reverts and the merge, we need to add the 3.18-rt version of
the printk-rt-aware.patch.

This patch disables migration before taking (down) and releasing (up) the
console semaphore, fixing the problem mentioned in the workaround patch.
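
The lock-ordering argument above can be illustrated with a small userspace model. This is only a sketch, not kernel code: the lock names, the `struct task` type, and every function below are hypothetical stand-ins, and the model assumes (as the workaround commit message states) that migrate_disable() is what takes the hotplug lock. It records the order in which locks are acquired and flags any acquisition that inverts an order seen earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two locks named in the commit message. */
enum lock_id { HOTPLUG_LOCK, CONSOLE_LOCK, NUM_LOCKS };

/* order_seen[a][b] is true once some task acquired b while holding a. */
static bool order_seen[NUM_LOCKS][NUM_LOCKS];

struct task {
    bool held[NUM_LOCKS];
};

static void reset_order_history(void)
{
    memset(order_seen, 0, sizeof(order_seen));
}

/* Record the acquisition; return false if it inverts an earlier order. */
static bool acquire(struct task *t, enum lock_id l)
{
    bool consistent = true;

    for (int h = 0; h < NUM_LOCKS; h++) {
        if (!t->held[h])
            continue;
        if (order_seen[l][h])
            consistent = false; /* another path took h while holding l */
        order_seen[h][l] = true;
    }
    t->held[l] = true;
    return consistent;
}

static void release(struct task *t, enum lock_id l)
{
    t->held[l] = false;
}

/* Takes the two locks in the given order and reports consistency. */
static bool take_in_order(enum lock_id first, enum lock_id second)
{
    struct task t = {0};
    bool ok = acquire(&t, first);

    ok = acquire(&t, second) && ok;
    release(&t, second);
    release(&t, first);
    return ok;
}

/* Hotplug path: hotplug lock first, then console lock. */
static bool hotplug_path(void)
{
    return take_in_order(HOTPLUG_LOCK, CONSOLE_LOCK);
}

/* printk path without the fix: console lock first, hotplug lock second. */
static bool printk_path_old(void)
{
    return take_in_order(CONSOLE_LOCK, HOTPLUG_LOCK);
}

/* printk path with the fix: migration disabled before the console lock. */
static bool printk_path_new(void)
{
    return take_in_order(HOTPLUG_LOCK, CONSOLE_LOCK);
}

/* Returns true if running both paths leaves a consistent lock order. */
bool orders_are_consistent(bool use_fixed_printk)
{
    bool ok;

    reset_order_history();
    ok = hotplug_path();
    ok = (use_fixed_printk ? printk_path_new() : printk_path_old()) && ok;
    return ok;
}
```

Calling `orders_are_consistent(false)` models the original ordering and reports an inversion, while `orders_are_consistent(true)` models disabling migration first and reports none. The model checks ordering only; it does not reproduce the timing of the actual deadlock.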

Link to the patch on Steven's stable tree:
http://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/kernel/printk/printk.c?h=v3.18-rt-rebase&id=07aaac41e0ab6ea2e9e5007b9f1b5270f0b98254

You just need to remove the:
-------------------------%<------------------------------------
 		if (do_cond_resched)
 			cond_resched();
------------------------->%------------------------------------

at the very end of the patch to make it apply.

A build with these changes (RHEL's patches + reverts +
new printk-rt-aware):
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11289392

It booted fine.

-- Daniel

Comment 3 Daniel Bristot de Oliveira 2016-07-06 12:18:46 UTC
Created attachment 1176851 [details]
Patch

Patch backported. We need to wait for the patches from BZ1340919 to officially appear in RHEL's code base before applying it.

Comment 4 Clark Williams 2016-07-12 18:33:46 UTC
RHEL -495 build merged into RHEL-RT tree.

Comment 6 Clark Williams 2016-07-14 14:16:51 UTC
Created attachment 1179879 [details]
change preempt_{disable,enable} to migrate_{disable,enable} to prevent latency spike

Small refinement to the printk update: in vprintk_emit(), change the preempt_disable/preempt_enable pair to migrate_disable/migrate_enable to prevent latency spikes.

Comment 9 errata-xmlrpc 2016-11-03 19:52:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2584.html