Bug 173203 - Fix i_sem(aphore) logic in error code path
Fix i_sem(aphore) logic in error code path
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs (Show other bugs)
4
All Linux
medium Severity medium
: ---
: ---
Assigned To: Wendy Cheng
GFS Bugs
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2005-11-14 17:19 EST by Wendy Cheng
Modified: 2010-01-11 22:08 EST (History)
0 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-10-16 23:42:15 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Wendy Cheng 2005-11-14 17:19:36 EST
Description of problem:
The gfs_write() obtains inode i_sem(aphore) before passing the logic into lower
level routine such as do_write_direct(). A patch was added via bugzilla 171488
to solve a deadlock issue that drops the i_sem before requesting an exclusive
glock within do_write_direct(). It re-locks the i_sem after the exclusive glock.
The "down(&inode->i_sem)" call should have placed right after the
gfs_glock_nq_m() call but it is currently added after the if (error) clause:

 restart:
        up(&inode->i_sem);
                                                                                
        gfs_holder_init(ip->i_gl, state, 0, &ghs[num_gh]);
                                                                                
        error = gfs_glock_nq_m(num_gh + 1, ghs);
        if (error)
                goto out;
                                                                                
        down(&inode->i_sem);

If gfs_glock_nq_m() returns error (it rarely happens though), the call will
return back to gfs_write() without i_sem locked. This semaphore count will not
be correct after that. We need to add a new patch to correc this issue as:

--- gfs.old/src/gfs/ops_file.c  2005-11-11 10:03:09.000000000 -0500
+++ gfs.new/src/gfs/ops_file.c  2005-11-11 10:04:24.000000000 -0500
@@ -603,11 +603,12 @@ do_write_direct(struct file *file, char
        gfs_holder_init(ip->i_gl, state, 0, &ghs[num_gh]);
                                                                                
        error = gfs_glock_nq_m(num_gh + 1, ghs);
-       if (error)
-               goto out;
                                                                                
        down(&inode->i_sem);
 
+       if (error)
+               goto out;
+
        error = -EINVAL;
        if (gfs_is_jdata(ip))
                goto out_gunlock;


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Since feist has included the patch in bugzilla 171488 into his new build, can't
re-use 171488. Open this new bugzilla to log this change.
Comment 1 Wendy Cheng 2005-11-14 17:22:05 EST
Found this issue while doing self code review.  
Comment 2 Wendy Cheng 2005-11-14 17:50:53 EST
Changes checked into CVS RHEL 4 branch. 

Note You need to log in before you can comment on or make changes to this bug.