Bug 171488

Summary: GFS: processes deadlock using Direct IO
Product: [Retired] Red Hat Cluster Suite
Component: gfs
Version: 4
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Wendy Cheng <nobody+wcheng>
Assignee: Wendy Cheng <nobody+wcheng>
QA Contact: GFS Bugs <gfs-bugs>
CC: rkenna
Fixed In Version: RHBA-2006-0169
Doc Type: Bug Fix
Last Closed: 2006-01-06 20:20:12 UTC
Bug Blocks: 164915
Attachments:
The patch to fix this issue - gfs_i_sem.patch. (Flags: none)

Description Wendy Cheng 2005-10-21 19:24:31 UTC
Created attachment 120268
The patch to fix this issue - gfs_i_sem.patch.

Comment 1 Wendy Cheng 2005-10-21 19:24:31 UTC
Description of problem:
When multiple processes read from and/or write to *the same file* on *the
same SMP node* using direct IO, they can end up deadlocking each other, with
the following thread backtraces:

Writer:
 #0 [1002a1edb18] schedule at ffffffff8030332e
 #1 [1002a1edbf0] wait_for_completion at ffffffff803034ff
 #2 [1002a1edc70] glock_wait_internal at ffffffffa02e34a5
 #3 [1002a1edcb0] gfs_glock_nq at ffffffffa02e3cd2
 #4 [1002a1edcf0] do_write_direct at ffffffffa02f7310
 #5 [1002a1edd80] avc_has_perm at ffffffff801ce330
 #6 [1002a1eddb0] write_chan at ffffffff80228b97
 #7 [1002a1ede00] walk_vm at ffffffffa02f6cca
 #8 [1002a1eded0] gfs_write at ffffffffa02f7deb
 #9 [1002a1edf10] vfs_write at ffffffff80177098
#10 [1002a1edf40] sys_pwrite64 at ffffffff80177275
#11 [1002a1edf80] system_call at ffffffff80110052

Reader:
 #0 [1001d8bf988] schedule at ffffffff8030332e
 #1 [1001d8bfa60] __sched_text_start at ffffffff80302637
 #2 [1001d8bfac0] __down_failed at ffffffff80303c13
 #3 [1001d8bfb10] .text.lock.direct_io at ffffffff80198222
 #4 [1001d8bfbb0] gfs_direct_IO at ffffffffa02f6093
 #5 [1001d8bfc30] generic_file_direct_IO at ffffffff8015a12e
 #6 [1001d8bfc70] __generic_file_aio_read at ffffffff8015aa69
 #7 [1001d8bfce0] generic_file_read at ffffffff8015acbe
 #8 [1001d8bfd60] glock_wait_internal at ffffffffa02e360d
 #9 [1001d8bfda0] gfs_glock_nq at ffffffffa02e3cd2
#10 [1001d8bfde0] do_read_direct at ffffffffa02f7092
#11 [1001d8bfe40] walk_vm at ffffffffa02f6cca
#12 [1001d8bff10] vfs_read at ffffffff80176ebb
#13 [1001d8bff40] sys_pread64 at ffffffff801771ff
#14 [1001d8bff80] system_call at ffffffff80110052

Note that my test machine is x86_64, but this is a platform-independent problem.

Version-Release number of selected component (if applicable):
GFS-kernel-smp-2.6.9-42.1

How reproducible:
Will upload the test case (originally written by Stephen Tweedie -
sct) with trivial modifications to run on GFS. 

Steps to Reproduce:
1. Compile the test program (make)
2. On one GFS node (SMP), run 
   shell> ./verify-data -w file-name-on-gfs-file-system
3. On another GFS node (SMP), run
   shell> ./verify-data -r same-file-as-in-step-2
4. When the output freezes, use the "crash" command to check what the threads
are doing.
5. From this point on, any access to the same file will hang forever, including
the above two processes. The processes are unkillable and the filesystem can't
be unmounted until reboot.

Additional info:
The deadlock is caused by:

1. Writer has obtained VFS layer's i_sem(aphore), then tries to get the
exclusive gfs glock.
2. Reader has obtained the gfs shared glock and passes control to the blockdev
layer.
3. Block device layer prepares the direct IO under the reader's context, which
tries to obtain the i_sem(aphore).
4. Deadlock: the writer waits for the exclusive gfs glock while the reader
waits for i_sem.
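
The four steps above are a lock-order inversion between i_sem and the glock.
As a rough illustration only (this is a userspace model, not GFS or kernel
code), the sketch below records the order in which each path takes the two
locks and checks the resulting lock-order graph for a cycle. The helper names
lock_order_edges and has_cycle are made up for this example, and the glock is
treated as a single lock, ignoring its shared/exclusive modes.

```python
# Order in which each code path takes its locks, as described in this report.
ACQUISITION_ORDER = {
    "writer": ["i_sem", "glock"],  # VFS takes i_sem, then GFS wants the exclusive glock
    "reader": ["glock", "i_sem"],  # GFS takes the shared glock, then direct IO wants i_sem
}


def lock_order_edges(orders):
    """Return the set of (held, wanted) pairs implied by each path's order."""
    edges = set()
    for locks in orders.values():
        for i, held in enumerate(locks):
            for wanted in locks[i + 1:]:
                edges.add((held, wanted))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed lock-order graph."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, set()).add(wanted)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a lock that is already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


edges = lock_order_edges(ACQUISITION_ORDER)
print("lock-order edges:", sorted(edges))
print("deadlock possible:", has_cycle(edges))
```

A cycle in this graph means two paths can each hold the lock the other one
needs, which is the situation shown in the two backtraces.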

Comment 2 Wendy Cheng 2005-10-21 19:44:36 UTC
I had been "fixing" this problem from the reader side as follows:

1. Ask GFS's do_read_direct() to grab/release i_sem before glock (GFS-kernel).
2. Tell __blockdev_direct_IO to bypass all the i_sem grabbing/releasing code
(2.6.9-22.ELsmp base kernel).
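
The effect of these two reader-side changes is that both paths take i_sem
first and the glock second. The sketch below is a userspace model of that
ordering only, using Python threading locks as stand-ins; it is not the GFS
code and not the attached patch, and the function names writer and reader are
hypothetical. The glock's shared/exclusive modes are not modelled.

```python
import threading

i_sem = threading.Lock()  # stands in for the VFS inode semaphore
glock = threading.Lock()  # stands in for the GFS glock
completed = []


def writer():
    with i_sem:        # the VFS write path takes i_sem first
        with glock:    # then GFS takes the glock
            completed.append("writer")


def reader():
    with i_sem:        # change 1: take i_sem before the glock
        with glock:
            # change 2: the direct IO step runs here without taking i_sem again
            completed.append("reader")


# Run many readers and writers against the same pair of locks.
threads = [threading.Thread(target=fn) for fn in (writer, reader) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)

print("finished:", len(completed), "of", len(threads))
```

With a single acquisition order there is no cycle in the lock graph, so every
thread finishes. The cost noted in this comment still applies: the real change
needed a modified base kernel.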

This brought in unnecessary complications (such as performance hits and/or
packaging issues, since we had to change the base kernel). It just occurred to
me today that this could be easily fixed in the GFS writer code, as in the
uploaded patch.

After testing, I'm pretty confident that this can be shipped together with
bz 169154 without any base kernel complication.

Comment 4 Wendy Cheng 2005-12-02 19:41:32 UTC
Code in CVS already. 

Comment 5 Wendy Cheng 2005-12-02 19:42:34 UTC
Note that this is a read-write deadlock, which differs from the write-truncate
deadlock.

Comment 8 Red Hat Bugzilla 2006-01-06 20:20:13 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0169.html