Description of problem:

gfs_write() takes the inode semaphore (i_sem) before passing control to lower-level routines such as do_write_direct(). A patch added via bugzilla 171488 fixed a deadlock by dropping i_sem before requesting the exclusive glock within do_write_direct() and re-taking i_sem once the glock request returns. The "down(&inode->i_sem)" call should have been placed right after the gfs_glock_nq_m() call, but it currently sits after the if (error) clause:

restart:
	up(&inode->i_sem);
	gfs_holder_init(ip->i_gl, state, 0, &ghs[num_gh]);
	error = gfs_glock_nq_m(num_gh + 1, ghs);
	if (error)
		goto out;
	down(&inode->i_sem);

If gfs_glock_nq_m() returns an error (which rarely happens), control returns to gfs_write() without i_sem held. When gfs_write() later releases i_sem, the semaphore count ends up incorrect. We need a new patch to correct this:

--- gfs.old/src/gfs/ops_file.c 2005-11-11 10:03:09.000000000 -0500
+++ gfs.new/src/gfs/ops_file.c 2005-11-11 10:04:24.000000000 -0500
@@ -603,11 +603,12 @@ do_write_direct(struct file *file, char
 	gfs_holder_init(ip->i_gl, state, 0, &ghs[num_gh]);
 	error = gfs_glock_nq_m(num_gh + 1, ghs);
-	if (error)
-		goto out;
 	down(&inode->i_sem);
+	if (error)
+		goto out;
+
 	error = -EINVAL;
 	if (gfs_is_jdata(ip))
 		goto out_gunlock;

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

Since feist has already included the patch from bugzilla 171488 in his new build, 171488 can't be reused. Opening this new bugzilla to log the change.
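To make the count imbalance concrete, here is a minimal userspace sketch of the two orderings. It is not the GFS code: toy_sem, toy_down(), toy_up(), toy_glock_nq_m(), toy_gfs_write() and the two do_write_direct_* functions are hypothetical stand-ins, and the semaphore is modelled as a plain counter (1 = free, 0 = held). It also assumes gfs_write() releases i_sem unconditionally after the callee returns.

```c
#include <assert.h>

/* Toy stand-in for the kernel semaphore: count 1 = free, 0 = held. */
struct toy_sem { int count; };

static void toy_down(struct toy_sem *s)
{
	assert(s->count > 0);	/* a real down() would block here instead */
	s->count--;
}

static void toy_up(struct toy_sem *s)
{
	s->count++;
}

/* Hypothetical stand-in for gfs_glock_nq_m(): fails only when asked to. */
static int toy_glock_nq_m(int fail)
{
	return fail ? -5 : 0;
}

/* Ordering before the fix: the error check precedes the re-lock. */
static int do_write_direct_old(struct toy_sem *i_sem, int fail)
{
	int error;

	toy_up(i_sem);			/* drop i_sem before the glock request */
	error = toy_glock_nq_m(fail);
	if (error)
		goto out;		/* returns with i_sem NOT held */
	toy_down(i_sem);
	/* ... the direct write would happen here ... */
out:
	return error;
}

/* Ordering after the fix: i_sem is always re-taken before returning. */
static int do_write_direct_new(struct toy_sem *i_sem, int fail)
{
	int error;

	toy_up(i_sem);
	error = toy_glock_nq_m(fail);
	toy_down(i_sem);		/* re-lock first, then look at the error */
	if (error)
		goto out;
	/* ... the direct write would happen here ... */
out:
	return error;
}

/* Caller modelled on gfs_write(): lock, call down the stack, unlock.
 * Returns the final semaphore count; 1 means balanced. */
static int toy_gfs_write(int (*fn)(struct toy_sem *, int), int fail)
{
	struct toy_sem i_sem = { 1 };

	toy_down(&i_sem);
	fn(&i_sem, fail);
	toy_up(&i_sem);
	return i_sem.count;
}
```

In this model, toy_gfs_write(do_write_direct_old, 1) leaves the count at 2 (one release too many), while the new ordering leaves it at 1 whether or not the glock request fails.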
Found this issue during a self code review.
Changes checked into CVS RHEL 4 branch.