171488 – GFS: processes deadlock using Direct IO

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Cluster Suite
Classification:	Retired
Component:	gfs
Sub Component:
Version:	4
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Wendy Cheng
QA Contact:	GFS Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	164915

Reported:	2005-10-21 19:24 UTC by Wendy Cheng
Modified:	2010-01-12 03:08 UTC
CC List:	1 user
Fixed In Version:	RHBA-2006-0169
Clone Of:
Environment:
Last Closed:	2006-01-06 20:20:12 UTC
Embargoed:

Attachments
The patch to fix this issue - gfs_i_sem.patch. (888 bytes, patch) 2005-10-21 19:24 UTC, Wendy Cheng

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2006:0169	0	normal	SHIPPED_LIVE	GFS-kernel bug fix update	2006-01-06 05:00:00 UTC

Description Wendy Cheng 2005-10-21 19:24:31 UTC

Created attachment 120268
The patch to fix this issue - gfs_i_sem.patch.

Comment 1 Wendy Cheng 2005-10-21 19:24:31 UTC

Description of problem:
When multiple processes read and/or write *the same file* on *the same SMP
node* under direct IO, they can end up deadlocking each other, with the
following thread backtraces:

Writer:
 #0 [1002a1edb18] schedule at ffffffff8030332e
 #1 [1002a1edbf0] wait_for_completion at ffffffff803034ff
 #2 [1002a1edc70] glock_wait_internal at ffffffffa02e34a5
 #3 [1002a1edcb0] gfs_glock_nq at ffffffffa02e3cd2
 #4 [1002a1edcf0] do_write_direct at ffffffffa02f7310
 #5 [1002a1edd80] avc_has_perm at ffffffff801ce330
 #6 [1002a1eddb0] write_chan at ffffffff80228b97
 #7 [1002a1ede00] walk_vm at ffffffffa02f6cca
 #8 [1002a1eded0] gfs_write at ffffffffa02f7deb
 #9 [1002a1edf10] vfs_write at ffffffff80177098
#10 [1002a1edf40] sys_pwrite64 at ffffffff80177275
#11 [1002a1edf80] system_call at ffffffff80110052

Reader:
 #0 [1001d8bf988] schedule at ffffffff8030332e
 #1 [1001d8bfa60] __sched_text_start at ffffffff80302637
 #2 [1001d8bfac0] __down_failed at ffffffff80303c13
 #3 [1001d8bfb10] .text.lock.direct_io at ffffffff80198222
 #4 [1001d8bfbb0] gfs_direct_IO at ffffffffa02f6093
 #5 [1001d8bfc30] generic_file_direct_IO at ffffffff8015a12e
 #6 [1001d8bfc70] __generic_file_aio_read at ffffffff8015aa69
 #7 [1001d8bfce0] generic_file_read at ffffffff8015acbe
 #8 [1001d8bfd60] glock_wait_internal at ffffffffa02e360d
 #9 [1001d8bfda0] gfs_glock_nq at ffffffffa02e3cd2
#10 [1001d8bfde0] do_read_direct at ffffffffa02f7092
#11 [1001d8bfe40] walk_vm at ffffffffa02f6cca
#12 [1001d8bff10] vfs_read at ffffffff80176ebb
#13 [1001d8bff40] sys_pread64 at ffffffff801771ff
#14 [1001d8bff80] system_call at ffffffff80110052

Note that my test machine is x86_64, but this is a platform-independent problem.

Version-Release number of selected component (if applicable):
GFS-kernel-smp-2.6.9-42.1

How reproducible:
Will upload the test case (originally written by Stephen Tweedie -
sct) with trivial modifications to run on GFS. 

Steps to Reproduce:
1. Compile the test program (make)
2. On one GFS node (SMP), run 
   shell> ./verify-data -w file-name-on-gfs-file-system
3. On another GFS node (SMP), run
   shell> ./verify-data -r same-file-as-in-step-2.
4. When the output freezes, use the "crash" command to check what the threads
are doing.
5. From this point on, any access to the same file will hang forever, including
the above two processes. The processes are unkillable and the filesystem can't
be unmounted until reboot.

Additional info:
The deadlock is caused by:

1. The writer has obtained the VFS layer's i_sem(aphore), then tries to get the
exclusive gfs glock.
2. The reader has obtained the gfs shared glock and passes control to the
blockdev layer.
3. The block device layer prepares the direct IO in the reader's context, which
tries to obtain the i_sem(aphore).
4. Deadlock: the writer waits for the exclusive gfs glock while the reader
waits for i_sem.
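
The four steps above are a classic lock-order inversion. A minimal user-space
sketch in Python follows; it is only a model, not GFS or kernel code. The names
run_inverted_order, i_sem and glock are illustrative, and the shared/exclusive
glock modes are collapsed into one plain lock, which is an assumption of the
model. Timeouts are used so that the demonstration reports the stall instead of
hanging.

```python
import threading


def run_inverted_order(timeout=0.2):
    """Model the inversion: writer takes i_sem then glock, reader the reverse.

    Returns a dict saying whether each side obtained its second lock.
    """
    i_sem = threading.Lock()   # stands in for the VFS inode semaphore
    glock = threading.Lock()   # stands in for the GFS glock (modes collapsed)
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until the other side also holds its first lock.
            both_hold_first.wait()
            # A real kernel waiter would block forever here; the model
            # gives up after `timeout` seconds.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            # Keep holding the first lock until both attempts have finished.
            both_tried_second.wait()
            if acquired:
                second.release()

    writer = threading.Thread(target=worker, args=("writer", i_sem, glock))
    reader = threading.Thread(target=worker, args=("reader", glock, i_sem))
    writer.start()
    reader.start()
    writer.join()
    reader.join()
    return got_second


if __name__ == "__main__":
    print(run_inverted_order())
```

In this model neither side can ever obtain its second lock, because each one is
held by the other side for the whole duration of the attempt.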

Comment 2 Wendy Cheng 2005-10-21 19:44:36 UTC

I had been "fixing" this problem from the reader side by:

1. Asking GFS's do_read_direct() to grab/release i_sem before the glock
(GFS-kernel).
2. Telling __blockdev_direct_IO to bypass all the i_sem grabbing/releasing code
(2.6.9-22.ELsmp base kernel).

This brought in unnecessary complications (such as performance hits and/or
packaging issues, since we had to change the base kernel). It just occurred to
me today that this could be easily fixed in the GFS writer code, as in the
uploaded patch.

After testing, I'm pretty confident that this can be shipped together with
bz 169154 without any base kernel complication.
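
For illustration only, the reader-side approach described in this comment
amounts to making both sides take the locks in the same order. The Python
sketch below models that idea in user space. It is not the uploaded
gfs_i_sem.patch, which changes the writer code and whose contents are not
reproduced in this report. The names run_consistent_order, i_sem and glock are
hypothetical, and the glock is again modeled as a plain lock.

```python
import threading


def run_consistent_order(rounds=100):
    """Both sides take i_sem first, then glock, so no wait cycle can form.

    Returns (completed_counts, stuck), where stuck is True if any thread
    failed to finish within the join timeout.
    """
    i_sem = threading.Lock()   # stands in for the VFS inode semaphore
    glock = threading.Lock()   # stands in for the GFS glock (modes collapsed)
    completed = {"writer": 0, "reader": 0}

    def worker(name):
        for _ in range(rounds):
            # Same order on both sides: i_sem, then glock. The modeled IO
            # path does not take i_sem again while holding the glock.
            with i_sem:
                with glock:
                    completed[name] += 1

    threads = [threading.Thread(target=worker, args=(n,))
               for n in ("writer", "reader")]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    stuck = any(t.is_alive() for t in threads)
    return completed, stuck


if __name__ == "__main__":
    print(run_consistent_order())
```

The cost noted above shows up in the model too: the reader now serializes on
i_sem for every operation, which is one way a performance hit can arise.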

Comment 4 Wendy Cheng 2005-12-02 19:41:32 UTC

Code in CVS already.

Comment 5 Wendy Cheng 2005-12-02 19:42:34 UTC

Note that this is a read-write deadlock, which differs from the write-truncate
deadlock.

Comment 8 Red Hat Bugzilla 2006-01-06 20:20:13 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0169.html
