Red Hat Bugzilla – Bug 1340922
backport of the latest "printk: Make rt aware" from PREEMPT-RT
Last modified: 2016-11-03 15:52:31 EDT
Description of problem:

BZ1267425 reports a hotplug bug in kernel-rt. That BZ was closed because we applied a workaround to avoid the deadlock that breaks hotplug. We still need the real fix for this problem, though.

The real fix is already in place in the mainline -rt kernel, in the "printk: Make rt aware" patch. However, applying that -rt-specific patch to our kernel-rt depends on a non-rt-specific patch:

5874af2 printk: enable interrupts before calling console_trylock_for_printk()

The backport of this patch was requested in BZ1340919. Once it is done, we can backport a current version (e.g., from v4.4-rt) of PREEMPT-RT's "printk: Make rt aware" patch.

Version-Release number of selected component (if applicable):
kernel-rt-3.10.0-327.18.2.rt56.223.el7_2

How reproducible:
The workaround applied in BZ1267425 inhibits the BUG, so it is not reproducible in the current -rt kernel.
Clark,

When the patches from BZ1340919 land in the RHEL source, we will need to revert the following patches on -rt (I think it is better to do it before the merge with BZ1340919's patches):

The old printk-rt-aware.patch from -rt.

-------------------------%<------------------------------------
commit 733052ff18c6ce3573fd5da0643de1c768db94da
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Sep 19 14:50:37 2012 +0200

    printk-rt-aware.patch

    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
------------------------->%------------------------------------

The workaround patch:

-------------------------%<------------------------------------
commit b9f02ada6804c62aff768eeb7453e95238f88f6d
Author: Daniel Bristot de Oliveira <bristot@redhat.com>
Date:   Wed Apr 6 12:41:03 2016 -0300

    printk: Prevent console freeze due to out-of-order deadlock [1269647]

    We have deadlock because the following two locks were taken in a
    different order:

          hotplug thread                  printk thread
    t1: hotplug lock (aquired)
    t2:                                   console lock (aquired)
    t3: console lock (contention)
    t4:                                   hotplug lock (contention)
                            DEADLOCK!

    So, to avoid the deadlock we have to grab the hotplug lock prior to the
    console lock. The migrate_disable() is taking the hot plug lock in both
    cases, so this patch just disable migration before grab the console lock.

    I test this patch in the past, but I did not have time to test it now, so
    please test it :-)

    Finally, it may be a good idea to disable/enable migration prior to all
    up and down of the console semaphore, but I did not test it too.

    Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
    Signed-off-by: Clark Williams <williams@redhat.com>
------------------------->%------------------------------------

After the reverts and the merge, we need to add the 3.18-rt version of the printk-rt-aware.patch. This patch disables migration before the up and down operations on the console semaphore, fixing the problem mentioned in the workaround patch.
Link to the patch on Steven's stable tree:
http://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/kernel/printk/printk.c?h=v3.18-rt-rebase&id=07aaac41e0ab6ea2e9e5007b9f1b5270f0b98254

You just need to remove the:

-------------------------%<------------------------------------
	if (do_cond_resched)
		cond_resched();
------------------------->%------------------------------------

at the very end of the patch to make it apply.

A build with these changes (RHEL's patches + reverts + new printk-rt-aware):
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11289392

It booted fine.

-- Daniel
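For readers who want to see the lock-ordering rule from the workaround commit in isolation, here is a minimal userspace sketch using pthreads. It is not kernel code and not taken from either patch: hotplug_lock, console_lock, the thread functions, and run_lock_order_demo() are all hypothetical stand-ins. It only assumes what the commit message states, namely that both paths must take the hotplug lock before the console lock.

```c
#include <pthread.h>

#define ITERATIONS 10000

/* Hypothetical userspace stand-ins for the two kernel locks. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;
static int console_writes;	/* protected by console_lock */

/* Models the hotplug path: hotplug lock first, then console lock. */
static void *hotplug_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&hotplug_lock);
		pthread_mutex_lock(&console_lock);
		console_writes++;
		pthread_mutex_unlock(&console_lock);
		pthread_mutex_unlock(&hotplug_lock);
	}
	return NULL;
}

/*
 * Models the fixed printk path. Taking hotplug_lock first plays the role
 * that migrate_disable() plays in the commit message (an assumption of
 * this sketch). Taking console_lock first here instead would recreate
 * the t1..t4 inversion and could deadlock.
 */
static void *printk_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&hotplug_lock);
		pthread_mutex_lock(&console_lock);
		console_writes++;
		pthread_mutex_unlock(&console_lock);
		pthread_mutex_unlock(&hotplug_lock);
	}
	return NULL;
}

/* Runs both threads to completion; returns the write count, or -1 on error. */
int run_lock_order_demo(void)
{
	pthread_t a, b;

	console_writes = 0;
	if (pthread_create(&a, NULL, hotplug_thread, NULL) != 0)
		return -1;
	if (pthread_create(&b, NULL, printk_thread, NULL) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return console_writes;
}
```

Built with -pthread and called from a main(), run_lock_order_demo() should always finish and return 2 * ITERATIONS, because both threads acquire the locks in the same order.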
Created attachment 1176851 [details]
Patch

Patch backported. We need to wait for the patches from BZ1340919 to officially appear in RHEL's code base before applying it.
RHEL -495 build merged into the RHEL-RT tree.
Created attachment 1179879 [details]
change preempt_{disable,enable} to migrate_{disable,enable} to prevent latency spike

Small refinement to the printk update: in vprintk_emit(), change the preempt_disable/preempt_enable pair to migrate_disable/migrate_enable to prevent latency spikes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2584.html