Bug 2008541

Summary:	gfs2: schedule while atomic
Product:	Red Hat Enterprise Linux 9	Reporter:	Alexander Aring <aahringo>
Component:	kernel	Assignee:	Andreas Gruenbacher <agruenba>
kernel sub component:	GFS-GFS2	QA Contact:	cluster-qe <cluster-qe>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	unspecified
Priority:	unspecified	CC:	adas, agruenba, gfs2-maint
Version:	9.0	Keywords:	Triaged
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	kernel-5.14.0-59.el9	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-05-17 15:40:24 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Alexander Aring 2021-09-28 13:59:04 UTC

Description of problem:

In some cases a "scheduling while atomic" bug occurs when iterating over the glock hash table: the walk holds rcu_read_lock(), and a "dlm_unlock()" call is made from inside it.

The problem is that "thaw_glock()" ends up calling "sdp->sd_lockstruct.ls_ops->lm_put_lock(gl);", which in the dlm case lands in the callback gdlm_put_lock(), and this in some cases finally calls "dlm_unlock()". But "dlm_unlock()" can't be called from atomic context because it takes sleeping locks (semaphores, mutexes, etc.).
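To make the pattern above concrete, here is a minimal userspace sketch in C. It is not kernel code: every model_* name is hypothetical, the RCU read-side section is reduced to a counter, and the "concurrent" reference drop by another holder is simulated with a flag. It only shows why a final put inside the walk trips a might_sleep()-style check.

```c
/* Userspace model only: all model_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

static int rcu_depth;        /* models rcu_read_lock() nesting */
static int sleep_violations; /* counts "sleeping while atomic" events */

/* Models might_sleep(): record a violation inside an atomic section. */
static void model_might_sleep(void)
{
    if (rcu_depth > 0)
        sleep_violations++;
}

/* Models dlm_unlock(): it takes a sleeping lock, so it may sleep. */
static void model_dlm_unlock(void)
{
    model_might_sleep();
}

struct model_glock {
    int refcount;
    bool frozen;
};

/* Models dropping a reference; only the final put reaches the unlock. */
static void model_glock_put(struct model_glock *gl)
{
    if (--gl->refcount == 0)
        model_dlm_unlock();
}

/* Models the per-glock callback: nothing to thaw, so just drop the ref. */
static void model_thaw_glock(struct model_glock *gl)
{
    if (!gl->frozen) {
        model_glock_put(gl);
        return;
    }
    gl->frozen = false; /* the real code would hand the glock to a worker */
}

/*
 * Models one step of the hash walk, run entirely under the read lock.
 * 'raced' is an assumption of this sketch: it simulates another holder
 * dropping its reference while the walk holds its own.
 * Returns the number of violations this step produced.
 */
static int model_walk_one(struct model_glock *gl, bool raced)
{
    int before = sleep_violations;

    rcu_depth++;              /* rcu_read_lock() */
    gl->refcount++;           /* the walk takes its own reference */
    if (raced)
        gl->refcount--;       /* simulated concurrent drop elsewhere */
    model_thaw_glock(gl);     /* may now drop the final reference */
    rcu_depth--;              /* rcu_read_unlock() */

    return sleep_violations - before;
}
```

In this model the violation only appears when the walk's reference happens to be the last one, which matches the observation that the problem is hard to reproduce.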

How reproducible:

It's hard to reproduce, but it happens on unmount (because thaw_glock() is called then).

With the right kernel debug settings (for example CONFIG_PROVE_LOCKING and CONFIG_DEBUG_ATOMIC_SLEEP) you will get:


[  993.426039] =============================
[  993.426765] WARNING: suspicious RCU usage
[  993.427522] 5.14.0-rc2+ #265 Tainted: G        W
[  993.428492] -----------------------------
[  993.429237] include/linux/rcupdate.h:328 Illegal context switch in RCU read-side critical section!
[  993.430860]
               other info that might help us debug this:

[  993.432304]
               rcu_scheduler_active = 2, debug_locks = 1
[  993.433493] 3 locks held by kworker/u32:2/194:
[  993.434319]  #0: ffff888109c23148 ((wq_completion)gfs2_control){+.+.}-{0:0}, at: process_one_work+0x452/0xad0
[  993.436135]  #1: ffff888109507e10 ((work_completion)(&(&sdp->sd_control_work)->work)){+.+.}-{0:0}, at: process_one_work+0x452/0xad0
[  993.438081]  #2: ffffffff85ee05c0 (rcu_read_lock){....}-{1:2}, at: rhashtable_walk_start_check+0x0/0x520
[  993.439665]
               stack backtrace:
[  993.440402] CPU: 13 PID: 194 Comm: kworker/u32:2 Tainted: G        W         5.14.0-rc2+ #265
[  993.441786] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.14.0-1.module+el8.6.0+12648+6ede71a5 04/01/2014
[  993.443304] Workqueue: gfs2_control gfs2_control_func
[  993.444147] Call Trace:
[  993.444565]  dump_stack_lvl+0x56/0x6f
[  993.445186]  ___might_sleep+0x191/0x1e0
[  993.445838]  down_read+0x7b/0x460
[  993.446400]  ? down_write_killable+0x2b0/0x2b0
[  993.447141]  ? find_held_lock+0xb3/0xd0
[  993.447794]  ? do_raw_spin_unlock+0xa2/0x130
[  993.448521]  dlm_unlock+0x9e/0x1a0
[  993.449102]  ? dlm_lock+0x260/0x260
[  993.449695]  ? pvclock_clocksource_read+0xdc/0x180
[  993.450495]  ? kvm_clock_get_cycles+0x14/0x20
[  993.451210]  ? ktime_get_with_offset+0xc6/0x170
[  993.451971]  gdlm_put_lock+0x29e/0x2d0
[  993.452599]  ? gfs2_cancel_delete_work+0x40/0x40
[  993.453361]  glock_hash_walk+0x16c/0x180
[  993.454014]  ? gfs2_glock_seq_stop+0x30/0x30
[  993.454754]  process_one_work+0x55e/0xad0
[  993.455443]  ? pwq_dec_nr_in_flight+0x110/0x110
[  993.456219]  worker_thread+0x65/0x5e0
[  993.456839]  ? process_one_work+0xad0/0xad0
[  993.457524]  kthread+0x1ed/0x220
[  993.458067]  ? set_kthread_struct+0x80/0x80
[  993.458764]  ret_from_fork+0x22/0x30
[  993.459426] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1352
[  993.460816] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 194, name: kworker/u32:2
[  993.462172] 3 locks held by kworker/u32:2/194:
[  993.462916]  #0: ffff888109c23148 ((wq_completion)gfs2_control){+.+.}-{0:0}, at: process_one_work+0x452/0xad0
[  993.464542]  #1: ffff888109507e10 ((work_completion)(&(&sdp->sd_control_work)->work)){+.+.}-{0:0}, at: process_one_work+0x452/0xad0
[  993.466467]  #2: ffffffff85ee05c0 (rcu_read_lock){....}-{1:2}, at: rhashtable_walk_start_check+0x0/0x520
[  993.468016] CPU: 13 PID: 194 Comm: kworker/u32:2 Tainted: G        W         5.14.0-rc2+ #265
[  993.469378] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.14.0-1.module+el8.6.0+12648+6ede71a5 04/01/2014

Comment 1 Alexander Aring 2021-09-28 14:03:36 UTC

https://listman.redhat.com/archives/cluster-devel/2021-September/msg00082.html

Is this a possible solution for it?

Comment 2 Andreas Gruenbacher 2021-10-06 20:16:34 UTC

Alex, I'm having difficulties following your problem description and what the actual call stack is; I don't see thaw_glock there at all.  In general, the glock code is pretty careful not to drop the final reference to a glock while holding rcu_read_lock or a spin lock; instead, whenever it needs to drop a reference that might be the final one in such a context, it delegates that to glock_work_func (see glock_work_func, gfs2_glock_queue_put, __gfs2_glock_queue_work).  Do you have any more data?

Comment 3 Andreas Gruenbacher 2021-10-08 11:44:25 UTC

Hmm, I see now that thaw_glock calls gfs2_glock_put when it hits a glock that doesn't need thawing.  So when gfs2_control_func calls gfs2_glock_thaw, that gfs2_glock_put can indeed lead to the bug you describe; it's only the stack trace that's been confusing me.

A quick fix is to call gfs2_glock_queue_put instead of gfs2_glock_put in thaw_glock.  Taking glock references unnecessarily during glock_hash_walk has always been ugly though, so maybe we should get rid of that instead.
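The idea behind the quick fix (deferring the put instead of doing it inline) can be sketched in userspace C. This is only a model under stated assumptions: the model_* names are hypothetical, the deferred list is a fixed array standing in for a work queue, and the concurrent reference drop by another holder is simulated inline. It is not how gfs2_glock_queue_put is implemented.

```c
/* Userspace model only: all model_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_DEFERRED 16

static int rcu_depth;        /* models rcu_read_lock() nesting */
static int sleep_violations; /* counts "sleeping while atomic" events */
static int unlock_calls;     /* counts modelled dlm_unlock() calls */

struct model_glock {
    int refcount;
};

/* Stand-in for a work queue: glocks whose put was deferred. */
static struct model_glock *deferred[MODEL_MAX_DEFERRED];
static size_t n_deferred;

/* Models dlm_unlock(): may sleep, so it must not run in atomic context. */
static void model_dlm_unlock(void)
{
    if (rcu_depth > 0)
        sleep_violations++;
    unlock_calls++;
}

/* Immediate put: the final reference drop reaches the unlock. */
static void model_glock_put(struct model_glock *gl)
{
    if (--gl->refcount == 0)
        model_dlm_unlock();
}

/* Deferred put: only records the glock, so it never sleeps. */
static int model_glock_queue_put(struct model_glock *gl)
{
    if (n_deferred == MODEL_MAX_DEFERRED)
        return -1;
    deferred[n_deferred++] = gl;
    return 0;
}

/* Worker: runs outside the atomic section and performs the real puts. */
static void model_glock_work(void)
{
    assert(rcu_depth == 0);
    for (size_t i = 0; i < n_deferred; i++)
        model_glock_put(deferred[i]);
    n_deferred = 0;
}

/* Walk that defers every put until after the read-side section ends. */
static int model_walk_deferred(struct model_glock *gls, size_t n)
{
    if (n > MODEL_MAX_DEFERRED - n_deferred)
        return -1;                 /* model limitation: fixed-size queue */

    rcu_depth++;                   /* rcu_read_lock() */
    for (size_t i = 0; i < n; i++) {
        gls[i].refcount++;         /* the walk takes its own reference */
        gls[i].refcount--;         /* assumed concurrent drop elsewhere */
        (void)model_glock_queue_put(&gls[i]); /* cannot fail: room checked */
    }
    rcu_depth--;                   /* rcu_read_unlock() */

    model_glock_work();            /* final puts happen here, may sleep */
    return 0;
}
```

The trade-off hinted at in the comment still applies to the model: the walk takes and hands off a reference for every glock even when nothing needs to be done, which is the part described as ugly.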

Comment 4 Alexander Aring 2021-10-13 14:11:16 UTC

Clearing needinfo; I think it's no longer necessary.

Comment 11 Nate Straz 2022-02-07 22:50:12 UTC

Regression tests passed with kernel-5.14.0-54.mr242_220204_1900.el9.x86_64

Comment 15 Nate Straz 2022-04-06 18:53:54 UTC

Regression tests passed with kernel-5.14.0-70.2.1.el9_0.x86_64

Comment 17 errata-xmlrpc 2022-05-17 15:40:24 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: kernel), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:3907