Bug 253768
Summary: GFS2: deadlock on distributed mmap test case
Product: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: low
Priority: low
Reporter: Ben Marzinski <bmarzins>
Assignee: Steve Whitehouse <swhiteho>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Doc Type: Bug Fix
Last Closed: 2007-08-25 14:01:02 UTC
Description — Ben Marzinski — 2007-08-21 19:25:55 UTC
Created attachment 162049 [details]
Attempt to solve the bug
The stack trace paints what I think is a pretty clear picture of what's going on.
run_queue() has tried to demote the lock and push out the pages, but since it's a
writable mapping and a write has occurred, it has to write out the page, so it
tries to lock it; but since we are in a page fault, the page is already locked by
the higher layers.
My solution to this is to move the run_queue() call from gfs2_glock_dq() onto a
workqueue. In fact it's my eventual aim to move _all_ run_queue() calls to the
workqueue to avoid issues just like this. We have to be a bit careful with the
delay we choose so as not to upset the very careful balance we previously
established to fix the original bug, but again, I think this will work well in
this case.
If I'm right about the cause, then it's something that will affect RHEL 5.1 as
well, so I think we ought to try and get it fixed now.
Created attachment 164161 [details]
Revised patch that fixes some bugs in the previous version.
When the glock workqueue finishes its work on the glock, it drops the reference
count. However, gfs2_glock_dq() never grabbed a reference to the glock before it
scheduled the work. This was causing the glock's reference count to reach zero
while the glock was still in use, which caused panics on mount. This version of
the patch grabs a reference before it queues the work in gfs2_glock_dq().
The bug still exists with the patch. It looks like the same run_queue() issue,
but this one is in gfs2_glock_nq(). Here is the call trace of the process
holding the glock:

d_doio        D f7d52800  2076  2906  2903
 f52e7b14 00000082 00000000 f7d52800 00000000 f7d52800 f52e7000 ea7e195a
 0000003f f5c787c0 f5c7896c c2010080 00000000 f5c6d040 06000000 c04d5d03
 c23d406c c04d6b2c f52e7b48 0001ea25 00000000 c20fdc3c 0006101a c20fdc3c
Call Trace:
 [<c04d5d03>] __generic_unplug_device+0x1d/0x1f
 [<c04d6b2c>] generic_unplug_device+0x15/0x22
 [<c061ad46>] io_schedule+0x34/0x56
 [<c0452324>] sync_page+0x0/0x3b
 [<c045235c>] sync_page+0x38/0x3b
 [<c061ae52>] __wait_on_bit_lock+0x2a/0x52
 [<c0452316>] __lock_page+0x58/0x5e
 [<c0437aa6>] wake_bit_function+0x0/0x3c
 [<c0456ad4>] write_cache_pages+0x105/0x27b
 [<c0456778>] __writepage+0x0/0x21
 [<f8c8c923>] gfs2_writepages+0x0/0x38 [gfs2]
 [<c0456c69>] generic_writepages+0x1f/0x26
 [<c0456c90>] do_writepages+0x20/0x30
 [<c0452cb8>] __filemap_fdatawrite_range+0x65/0x70
 [<c0452ee6>] filemap_fdatawrite+0x23/0x27
 [<f8c85dfb>] inode_go_sync+0x44/0xbe [gfs2]
 [<f8c849ba>] gfs2_glock_drop_th+0x1c/0x111 [gfs2]
 [<f8c84f4a>] run_queue+0xbf/0x249 [gfs2]
 [<c0420d30>] __wake_up+0x32/0x43
 [<f8c8541f>] gfs2_glock_nq+0x154/0x19a [gfs2]
 [<c0434ce6>] insert_work+0x50/0x53
 [<f8c865b1>] gfs2_glock_nq_atime+0x106/0x2ec [gfs2]
 [<f8c8c9ab>] gfs2_prepare_write+0x50/0x23b [gfs2]
 [<c045239c>] find_lock_page+0x1a/0x7e
 [<c04533b6>] generic_file_buffered_write+0x256/0x5d5
 [<c0453bc6>] __generic_file_aio_write_nolock+0x491/0x4f0
 [<c05e342a>] tcp_recvmsg+0x8ed/0x9f9
 [<c061b11f>] __mutex_lock_slowpath+0x52/0x7a
 [<c0453c7a>] generic_file_aio_write+0x55/0xb3
 [<c046d756>] do_sync_readv_writev+0xc1/0xfe
 [<c045606e>] get_page_from_freelist+0x23c/0x2b0
 [<c0437a71>] autoremove_wake_function+0x0/0x35
 [<c04629b4>] anon_vma_prepare+0x11/0xa5
 [<c04e37e3>] copy_from_user+0x23/0x4f
 [<c046d611>] rw_copy_check_uvector+0x5c/0xb0
 [<c046de53>] do_readv_writev+0xbc/0x187
 [<c0453c25>] generic_file_aio_write+0x0/0xb3
 [<c061d2be>] do_page_fault+0x269/0x58b
 [<c044cc89>] audit_syscall_entry+0x10d/0x137
 [<c046df5b>] vfs_writev+0x3d/0x48
 [<c046e370>] sys_writev+0x41/0x67
 [<c0404e12>] syscall_call+0x7/0xb

This is actually a different bug, although it looks similar. It can only happen
in the upstream code, as it's the page lock/glock bug which we fixed ages ago in
RHEL, but for which the upstream fix is in Nick Piggin's patch set. That patch
set should have been pushed to Linus at his last merge window, but it's still
pending, since Nick decided not to push it due to there being lots of other VM
changes at the time. So I think we are probably safe to push the patch in its
current form upstream now as well as to RHEL. I guess we can close this, or
mark it as a dup of the other bz?