Bug 1664257
| Summary: | BUG: scheduling while atomic: kworker/1:1/24117/0x00000002 | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Qiao Zhao <qzhao> | ||||||
| Component: | kernel-rt | Assignee: | Daniel Bristot de Oliveira <daolivei> | ||||||
| kernel-rt sub component: | Scheduler | QA Contact: | Qiao Zhao <qzhao> | ||||||
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |||||||
| Severity: | high | ||||||||
| Priority: | high | CC: | bhu, daolivei, jlelli, jshortt, pmatouse, williams | ||||||
| Version: | 8.0 | Flags: | rule-engine: mirror+ | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | 8.0 | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | kernel-rt-4.18.0-58.rt9.110.el8 | Doc Type: | No Doc Update | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 1664380 (view as bug list) | Environment: | |||||||
| Last Closed: | 2019-06-14 00:54:35 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1649545 | ||||||||
| Attachments: | |||||||||
Description
Qiao Zhao
2019-01-08 08:52:31 UTC
(In reply to Red Hat Bugzilla Rules Engine from comment #1)
> Blocker requests need a proper justification to be processed and approved.
>
> Please answer the following questions (after reviewing with SST leads from
> Dev/QE/PM) and provide whatever context might be helpful, including
> requested target milestone:
> * What is the risk with putting this into the release vs. doing it in the
> next release? (For example, is this a regression?)

This is a regression that *only* affects the RHEL8-RT product.

> * Is this related to an MVP or Layered Product request, if so what?

No, this is strictly a RHEL8-RT bug, and should be addressed in an RT build.

> * Do you have signoff from your Dev/QE/PM leads on the status of this as an
> exception for the team?

Yes, this bug is due to a RHEL change calling get_cpu/put_cpu causing the RT spinlock to be acquired in atomic context (which is a bad thing). Changing to get_cpu_light/put_cpu_light fixes this for RT and remains the same for RHEL.

Created attachment 1519296 [details]
[PATCH RT] padata: Make padata_do_serial() use get_cpu_light()
Patch to be submitted upstream after testing
[PATCH RT] padata: Make padata_do_serial() use get_cpu_light()
We hit the following BUG:
BUG: scheduling while atomic: kworker/1:1/24117/0x00000002
Preemption disabled at:
[<ffffffffb61fd824>] padata_do_serial+0x24/0x110
CPU: 1 PID: 24117 Comm: kworker/1:1 Not tainted 4.18.0-56.rt9.107.el8.x86_64 #1
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 11/14/2017
Workqueue: pencrypt padata_parallel_worker
Call Trace:
dump_stack+0x5c/0x80
? padata_do_serial+0x24/0x110
__schedule_bug.cold.83+0x8e/0x9b
__schedule+0x5a0/0x680
schedule+0x39/0xd0
rt_spin_lock_slowlock_locked+0x10e/0x2b0
rt_spin_lock_slowlock+0x50/0x80
padata_do_serial+0x4d/0x110
padata_parallel_worker+0xaf/0xe0
process_one_work+0x183/0x3b0
? process_one_work+0x3b0/0x3b0
worker_thread+0x30/0x3d0
? process_one_work+0x3b0/0x3b0
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
and the cause is a spin_lock() taken inside a get_cpu() section.
Convert the get/put_cpu to get/put_cpu_light to fix the BUG while reducing the
preempt_disable section.
Signed-off-by: Daniel Bristot de Oliveira <bristot>
Cc: Sebastian Andrzej Siewior <bigeasy>
Cc: Thomas Gleixner <tglx>
Cc: Clark Williams <williams>
Cc: linux-rt-users.org
Cc: linux-crypto.org
Cc: linux-kernel.org
Created attachment 1519527 [details]
[PATCH RT] padata: Make padata_do_serial() use get_cpu_light()
We hit the following BUG:
BUG: scheduling while atomic: kworker/1:1/24117/0x00000002
Preemption disabled at:
[<ffffffffb61fd824>] padata_do_serial+0x24/0x110
CPU: 1 PID: 24117 Comm: kworker/1:1 Not tainted 4.18.0-56.rt9.107.el8.x86_64 #1
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 11/14/2017
Workqueue: pencrypt padata_parallel_worker
Call Trace:
dump_stack+0x5c/0x80
? padata_do_serial+0x24/0x110
__schedule_bug.cold.83+0x8e/0x9b
__schedule+0x5a0/0x680
schedule+0x39/0xd0
rt_spin_lock_slowlock_locked+0x10e/0x2b0
rt_spin_lock_slowlock+0x50/0x80
padata_do_serial+0x4d/0x110
padata_parallel_worker+0xaf/0xe0
process_one_work+0x183/0x3b0
? process_one_work+0x3b0/0x3b0
worker_thread+0x30/0x3d0
? process_one_work+0x3b0/0x3b0
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
and the cause is a spin_lock() taken inside a get_cpu() section.
Convert the get/put_cpu to get/put_cpu_light to fix the BUG while reducing the
preempt_disable section.
Signed-off-by: Daniel Bristot de Oliveira <bristot>
Reviewed-by: Clark Williams <williams>
Cc: Sebastian Andrzej Siewior <bigeasy>
Cc: Thomas Gleixner <tglx>
Cc: Clark Williams <williams>
Cc: linux-rt-users.org
Cc: linux-crypto.org
Cc: linux-kernel.org
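
For readers unfamiliar with why the conversion helps, here is a minimal userspace sketch of the bookkeeping involved. It is not kernel code: every `model_` name is hypothetical and invented for illustration. It assumes that get_cpu() disables preemption, that get_cpu_light() on an RT kernel only disables migration, and that a contended spin_lock() on RT has to sleep, as the call trace above (rt_spin_lock_slowlock into schedule) suggests.

```c
#include <assert.h>

/*
 * Userspace model only. Every model_ name is hypothetical and merely
 * mimics the counters discussed in the patch description above.
 */
struct model_task {
	int preempt_count;    /* > 0: preemption disabled (atomic context) */
	int migrate_disabled; /* > 0: pinned to this CPU, still preemptible */
	int sched_bugs;       /* "scheduling while atomic" reports seen */
};

/* Models get_cpu()/put_cpu(): disable and re-enable preemption. */
static void model_get_cpu(struct model_task *t)
{
	t->preempt_count++;
}

static void model_put_cpu(struct model_task *t)
{
	assert(t->preempt_count > 0); /* calls must be balanced */
	t->preempt_count--;
}

/* Models get_cpu_light()/put_cpu_light() on RT: only forbid migration. */
static void model_get_cpu_light(struct model_task *t)
{
	t->migrate_disabled++;
}

static void model_put_cpu_light(struct model_task *t)
{
	assert(t->migrate_disabled > 0); /* calls must be balanced */
	t->migrate_disabled--;
}

/*
 * Models a contended spin_lock() on RT: the task must sleep, which
 * means calling the scheduler. Doing so with preemption disabled is
 * what the kernel reports as "scheduling while atomic".
 */
static void model_rt_spin_lock_contended(struct model_task *t)
{
	if (t->preempt_count > 0)
		t->sched_bugs++;
}

/* Before the patch: lock taken inside a get_cpu() section. */
int model_serial_with_get_cpu(void)
{
	struct model_task t = {0};

	model_get_cpu(&t);
	model_rt_spin_lock_contended(&t);
	model_put_cpu(&t);
	return t.sched_bugs;
}

/* After the patch: lock taken inside a get_cpu_light() section. */
int model_serial_with_get_cpu_light(void)
{
	struct model_task t = {0};

	model_get_cpu_light(&t);
	model_rt_spin_lock_contended(&t);
	model_put_cpu_light(&t);
	return t.sched_bugs;
}
```

In this model, model_serial_with_get_cpu() counts one report and model_serial_with_get_cpu_light() counts none, because the task stays on its CPU but remains preemptible when the lock has to sleep.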
I managed to build kernel-rt-4.18.0-58.rt9.109.el8 + Daniel's fix:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19731657

Qiao Zhao, could you please try to see if the problem is still reproducible with the kernel above?

(In reply to Juri Lelli from comment #8)
> I managed to build kernel-rt-4.18.0-58.rt9.109.el8 + Daniel's fix:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=19731657
>
> Qiao Zhao, could you please try to see if the problem is still reproducible
> with the kernel above?

Hi Juri,

Cool, the same test case passed with the fixed kernel, with no call trace or panic.

# uname -r
4.18.0-58.rt9.109.el8.bz1664257.x86_64
# ./pcrypt_aead01
tst_test.c:1085: INFO: Timeout per run is 0h 05m 00s
pcrypt_aead01.c:71: PASS: Nothing bad appears to have happened

Summary:
passed 1
failed 0
skipped 0
warnings 0

--
Thanks,
Qiao

Fixed in build kernel-rt-4.18.0-58.rt9.110.el8