Code inspection shows that this problem can also occur in RHEL7-rt.
Created attachment 1527249
padata: Make padata_do_serial() use get_cpu_light()

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1664380
BrewBuild: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=20081122
Internal fix.

We hit the following BUG in RHEL8:

 BUG: scheduling while atomic: kworker/1:1/24117/0x00000002
 Preemption disabled at:
 [<ffffffffb61fd824>] padata_do_serial+0x24/0x110
 CPU: 1 PID: 24117 Comm: kworker/1:1 Not tainted 4.18.0-56.rt9.107.el8.x86_64 #1
 Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 11/14/2017
 Workqueue: pencrypt padata_parallel_worker
 Call Trace:
  dump_stack+0x5c/0x80
  ? padata_do_serial+0x24/0x110
  __schedule_bug.cold.83+0x8e/0x9b
  __schedule+0x5a0/0x680
  schedule+0x39/0xd0
  rt_spin_lock_slowlock_locked+0x10e/0x2b0
  rt_spin_lock_slowlock+0x50/0x80
  padata_do_serial+0x4d/0x110
  padata_parallel_worker+0xaf/0xe0
  process_one_work+0x183/0x3b0
  ? process_one_work+0x3b0/0x3b0
  worker_thread+0x30/0x3d0
  ? process_one_work+0x3b0/0x3b0
  kthread+0x112/0x130
  ? kthread_create_worker_on_cpu+0x70/0x70
  ret_from_fork+0x35/0x40

and the cause is a spin_lock() taken inside a get_cpu() section.

Convert the get/put_cpu to get/put_cpu_light to fix the BUG while reducing the preempt_disable section.

As we also have this code on RHEL7, we also need this patch. This patch differs from RHEL8 one because there is only one get_cpu() usage.

Signed-off-by: Daniel Bristot de Oliveira <bristot>
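For readers unfamiliar with the failure mode, here is a minimal userspace sketch of why the conversion helps. It is a toy model, not kernel code and not the attached patch: every model_* name and both counters are hypothetical. It assumes that on PREEMPT_RT get_cpu() disables preemption, get_cpu_light() only disables migration, and spin_lock() may sleep when contended (consistent with the rt_spin_lock_slowlock frames in the trace above).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for per-task kernel state (hypothetical, for illustration). */
static int preempt_count;          /* > 0 means preemption is disabled */
static int migrate_disable_count;  /* > 0 means the task is pinned to its CPU */

/* Models get_cpu()/put_cpu(): pins the task by disabling preemption. */
static void model_get_cpu(void) { preempt_count++; }
static void model_put_cpu(void) { assert(preempt_count > 0); preempt_count--; }

/* Models get_cpu_light()/put_cpu_light(): pins the task by disabling
 * migration only, leaving preemption enabled. */
static void model_get_cpu_light(void) { migrate_disable_count++; }
static void model_put_cpu_light(void)
{
    assert(migrate_disable_count > 0);
    migrate_disable_count--;
}

/* Models a contended spin_lock() on PREEMPT_RT, which has to schedule.
 * Returns false where the kernel would report "scheduling while atomic". */
static bool model_rt_spin_lock_may_sleep(void)
{
    return preempt_count == 0;
}

/* Shape of the code before the patch: lock taken inside get_cpu(). */
static bool model_serial_with_get_cpu(void)
{
    model_get_cpu();
    bool ok = model_rt_spin_lock_may_sleep();
    model_put_cpu();
    return ok;
}

/* Shape of the code after the patch: lock taken inside get_cpu_light(). */
static bool model_serial_with_get_cpu_light(void)
{
    model_get_cpu_light();
    bool ok = model_rt_spin_lock_may_sleep();
    model_put_cpu_light();
    return ok;
}
```

In this model, model_serial_with_get_cpu() returns false (the lock would sleep with preemption disabled) and model_serial_with_get_cpu_light() returns true, which mirrors the reasoning in the patch description.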
[Tiefu Li on 15 May 2019]
I verified the fix in two different ways.

First approach:
Step 1. cd /mnt/tests/kernel/distribution/ltp/generic/ltp-full-20190115/testcases/kernel/crypto
Step 2. Run the test: ./pcrypt_aead01
All tests passed.

Second approach:
Step 1. cd /mnt/tests/kernel/distribution/ltp/generic/
Step 2. FILTERTESTS="cve-2017-18075" make run

Both tests were conducted on 3.10.0-999.rt56.956.el7.x86_64 and kernel-rt-3.10.0-1010.rt56.968.el7.x86_64, respectively. I did not see any issues, so I am marking the bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:2043