Bug 173913 - GFS deadlock - gfs_write (do_write_direct) and gfs_setattr (do_truncate)
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: Wendy Cheng
QA Contact: GFS Bugs
Depends On:
Blocks: 164915
Reported: 2005-11-22 10:40 EST by Wendy Cheng
Modified: 2010-01-11 22:08 EST (History)

See Also:
Fixed In Version: RHBA-2006-0234
Doc Type: Bug Fix
Last Closed: 2006-03-09 14:46:25 EST


Attachments
gfs_kernel_i_alloc.patch (1.31 KB, patch), 2005-11-23 00:39 EST, Wendy Cheng
gfs_i_alloc.patch (1.78 KB, patch), 2005-11-23 00:40 EST, Wendy Cheng

Description Wendy Cheng 2005-11-22 10:40:27 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050427 Red Hat/1.7.7-1.1.3.4

Description of problem:
GFS deadlocks because the following code paths take the same locks in opposite
order: do_truncate acquires i_sem and i_alloc_sem before the glock, while
do_write_direct and gfs_read acquire the glock before i_sem.

(vfs) do_truncate {
        down(.... i_sem);
        down_write(.... i_alloc_sem);
        err = notify_change(dentry, &newattrs); -> calls gfs_setattr -> glock
        up_write(.... i_alloc_sem);
        up(.... i_sem);
}

(gfs) do_write_direct {
        glock
        down(.... i_sem)
        __blockdev_direct_IO -> down(.... i_alloc_sem) -> up(.... i_alloc_sem)
        up(.... i_sem)
}

(gfs) gfs_read {
        glock
        __blockdev_direct_IO -> down(.... i_sem) -> up(.... i_sem)
}
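
The lock sequences listed above can be checked for ordering conflicts mechanically. The following is a minimal user-space sketch, not kernel code and not part of either attached patch. The enum, the array names, and has_order_inversion() are all hypothetical names introduced for illustration. It assumes each lock in a path is still held when the next one is taken, which matches the nesting shown in the sequences.

```c
#include <assert.h>
#include <stddef.h>

/* The three locks named in the report; the enum itself is illustrative. */
enum lock_id { GLOCK, I_SEM, I_ALLOC_SEM, NUM_LOCKS };

/* Acquisition order of each path, transcribed from the description. */
const enum lock_id truncate_path[]     = { I_SEM, I_ALLOC_SEM, GLOCK };
const enum lock_id write_direct_path[] = { GLOCK, I_SEM, I_ALLOC_SEM };
const enum lock_id read_path[]         = { GLOCK, I_SEM };

/* Index of `lock` in `seq`, or -1 if the path never takes it. */
static int position(const enum lock_id *seq, size_t n, enum lock_id lock)
{
    for (size_t i = 0; i < n; i++)
        if (seq[i] == lock)
            return (int)i;
    return -1;
}

/*
 * Returns 1 if some pair of locks is taken in one order by path `a`
 * and in the opposite order by path `b` (a potential ABBA deadlock),
 * 0 otherwise.
 */
int has_order_inversion(const enum lock_id *a, size_t na,
                        const enum lock_id *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    for (int x = 0; x < NUM_LOCKS; x++) {
        for (int y = 0; y < NUM_LOCKS; y++) {
            if (x == y)
                continue;
            int ax = position(a, na, (enum lock_id)x);
            int ay = position(a, na, (enum lock_id)y);
            int bx = position(b, nb, (enum lock_id)x);
            int by = position(b, nb, (enum lock_id)y);
            if (ax < 0 || ay < 0 || bx < 0 || by < 0)
                continue;   /* one path never takes this pair */
            if (ax < ay && by < bx)
                return 1;   /* a: x then y, b: y then x */
        }
    }
    return 0;
}
```

Calling has_order_inversion(truncate_path, 3, write_direct_path, 3) reports a conflict, because the truncate path takes i_sem before the glock and the direct-write path takes them the other way round. The write and read paths do not conflict with each other under this model.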
         



Version-Release number of selected component (if applicable):
2.6.9-22.ELsmp

How reproducible:
Sometimes

Steps to Reproduce:
1. (Running Oracle performance stress test)
2. Haven't tried a simpler test case.
  

Actual Results:  Oracle processes hang.

Additional info:

oracle        D 0000010061fdfe38     0  4588      1          4619  4571 (NOTLB)
Call Trace:<ffffffff803034ff>{wait_for_completion+167} <ffffffff80132e8d>{default_wake_function+0}
           <ffffffffa019c83b>{:gfs:do_write_direct+1523} <ffffffff80132e8d>{default_wake_function+0}
           <ffffffffa01884a5>{:gfs:glock_wait_internal+350} <ffffffffa0188cd2>{:gfs:gfs_glock_nq+961}
           <ffffffffa0188efb>{:gfs:gfs_glock_nq_init+20} <ffffffffa01a09ae>{:gfs:gfs_setattr+75}
           <ffffffff80132ede>{__wake_up_common+67} <ffffffff80190126>{notify_change+340}
           <ffffffff801756ed>{do_truncate+135} <ffffffff801759be>{sys_ftruncate+248}
oracle        D 00000100e69f3270     0  4548      1          4550  4546 (NOTLB)
Call Trace:<ffffffff8030353f>{wait_for_completion+231} <ffffffff8030353f>{wait_for_completion+231}
           <ffffffff80132e8d>{default_wake_function+0} <ffffffff80302637>{__down+147}
           <ffffffff80132e8d>{default_wake_function+0} <ffffffffa01884a5>{:gfs:glock_wait_internal+350}
           <ffffffff80303c13>{__down_failed+53} <ffffffffa019daa7>{:gfs:.text.lock.ops_file+15}
           <ffffffff801313f5>{recalc_task_prio+337} <ffffffff80131483>{activate_task+124}
           <ffffffff80131931>{try_to_wake_up+734} <ffffffffa019bcca>{:gfs:walk_vm+265}
           <ffffffffa019c248>{:gfs:do_write_direct+0} <ffffffffa019ce66>{:gfs:gfs_write+194}
           <ffffffff80177098>{vfs_write+207} <ffffffff80177275>{sys_pwrite64+86}
Comment 1 Wendy Cheng 2005-11-23 00:36:03 EST
Patch is ready. Tested by:

1. Running sct's Verify-data (with a small tweak so it can run across multiple
nodes), writing to a GFS file in an endless loop.
2. Running a simple program that calls ftruncate() in an endless loop on the
same file that Verify-data is writing.

Without the patch the processes lock up easily. With the patch, the test appears
to run indefinitely.
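
The truncating half of that test could look roughly like the sketch below. This is an assumption about its shape, not the program actually used: truncate_loop(), its parameters, and the choice of alternating between two sizes are hypothetical. A negative iteration count loops until an error occurs, and a non-negative count bounds the loop so the helper can be exercised on an ordinary file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: repeatedly ftruncate() `path`, alternating
 * between the sizes `small` and `large`. A negative `iterations`
 * loops until a call fails. Returns 0 on success, -1 on failure.
 */
int truncate_loop(const char *path, off_t small, off_t large, long iterations)
{
    assert(path != NULL && small >= 0 && large >= 0);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    int use_small = 1;
    long done = 0;
    while (iterations < 0 || done < iterations) {
        /* Each call goes through the VFS truncate path for this file. */
        if (ftruncate(fd, use_small ? small : large) != 0) {
            close(fd);
            return -1;
        }
        use_small = !use_small;
        if (iterations >= 0)
            done++;
    }
    return close(fd);
}
```

To approximate the test described above, this would be run with a negative iteration count against a file on a GFS mount while a separate writer keeps writing to the same file. The writer is not sketched here.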

Will send the RPMs to Oracle test to further verify the patch.
Comment 2 Wendy Cheng 2005-11-23 00:39:55 EST
Created attachment 121385 [details]
gfs_kernel_i_alloc.patch

Base kernel patch.
Comment 3 Wendy Cheng 2005-11-23 00:40:37 EST
Created attachment 121386 [details]
gfs_i_alloc.patch

GFS patch.
Comment 4 Wendy Cheng 2005-12-14 15:43:45 EST
Code checked into CVS. Moving the bug to the MODIFIED state.
Comment 7 Red Hat Bugzilla 2006-03-09 14:46:25 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0234.html
