Bug 157710

Summary: rename(2) can deadlock on a distributed filesystem.
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Michael Gaughen <mgaughen>
Assignee: Alexander Viro <aviro>
QA Contact: Brian Brock <bbrock>
CC: davej, djn, elan, swhiteho
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2012-06-20 16:18:03 UTC
Attachments:
  Proposed patch to fix rename(2) deadlock. (flags: none)

Description Michael Gaughen 2005-05-13 22:14:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.2) Gecko/20040803

Description of problem:
On a distributed filesystem, there is no guarantee that path_lookup() will
return a valid dentry while another node is executing rename(2) on the same
path hierarchy.  lock_rename() performs this check:

  struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
  {
        ...                                       
        if (p1 == p2) {
                down(&p1->d_inode->i_sem);
                return NULL;
        }
        ...

On a distributed filesystem, the dentries (p1 and p2) can be different yet
refer to the same inode.  In that case the p1 == p2 test above is false, the
single-lock path is skipped, and a later attempt to do:

        ...
        down(&p2->d_inode->i_sem);
        down(&p1->d_inode->i_sem);
        ...

will result in an attempt to down() the *same* ->i_sem twice, resulting in a
deadlock.
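
The failure mode can be illustrated in userspace with a toy model.  This is
only a sketch, not kernel code: the toy_* types and functions are hypothetical
stand-ins, toy_down() reports a would-be hang instead of sleeping, and the
ancestor ordering that the real lock_rename() performs between the two
excerpts above is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct toy_sem    { bool held; };
struct toy_inode  { struct toy_sem i_sem; };
struct toy_dentry { struct toy_inode *d_inode; };

/* Non-blocking model of down(): returns false where the real down()
 * would sleep forever because the semaphore is already held. */
static bool toy_down(struct toy_sem *s)
{
        if (s->held)
                return false;
        s->held = true;
        return true;
}

static void toy_up(struct toy_sem *s)
{
        assert(s->held);
        s->held = false;
}

/* Mirrors the dentry pointer comparison quoted in the report.
 * Returns true if every lock was taken, false on a self-deadlock. */
static bool toy_lock_rename(struct toy_dentry *p1, struct toy_dentry *p2)
{
        if (p1 == p2)
                return toy_down(&p1->d_inode->i_sem);

        if (!toy_down(&p2->d_inode->i_sem))
                return false;
        if (!toy_down(&p1->d_inode->i_sem)) {
                /* Second down() on the same i_sem: the kernel would hang
                 * here.  The model releases the lock so it can be reused. */
                toy_up(&p2->d_inode->i_sem);
                return false;
        }
        return true;
}
```

With two distinct toy_dentry objects whose d_inode points at the same
toy_inode, toy_lock_rename() returns false; that is the case in which the
real code would block on the second down().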


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
I don't have a good test case to reproduce this.  It requires multiple nodes,
performing renames on the same path hierarchy, on a distributed filesystem.
  

Additional info:

Comment 1 Michael Gaughen 2005-05-13 22:16:39 UTC
Created attachment 114367 [details]
Proposed patch to fix rename(2) deadlock.

Instead of comparing the two dentry pointers for equality, the patch changes
lock_rename() and unlock_rename() to compare the ->i_sem of each dentry's inode.
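
A rough userspace sketch of that comparison follows.  It is not the attached
patch: the toy_* names and same_lock() are hypothetical, toy_down() returns
false where the real down() would block, and the sketch models only the
changed equality test, not the rest of the rename locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct toy_sem    { bool held; };
struct toy_inode  { struct toy_sem i_sem; };
struct toy_dentry { struct toy_inode *d_inode; };

/* Non-blocking model of down(): false means the real call would block. */
static bool toy_down(struct toy_sem *s)
{
        if (s->held)
                return false;
        s->held = true;
        return true;
}

static void toy_up(struct toy_sem *s)
{
        assert(s->held);
        s->held = false;
}

/* Compare the semaphores themselves rather than the dentry pointers. */
static bool same_lock(struct toy_dentry *p1, struct toy_dentry *p2)
{
        return &p1->d_inode->i_sem == &p2->d_inode->i_sem;
}

static bool toy_lock_rename(struct toy_dentry *p1, struct toy_dentry *p2)
{
        if (same_lock(p1, p2))          /* aliased dentries: lock once */
                return toy_down(&p1->d_inode->i_sem);
        return toy_down(&p2->d_inode->i_sem) &&
               toy_down(&p1->d_inode->i_sem);
}

static void toy_unlock_rename(struct toy_dentry *p1, struct toy_dentry *p2)
{
        toy_up(&p1->d_inode->i_sem);
        if (!same_lock(p1, p2))         /* release the second lock only if distinct */
                toy_up(&p2->d_inode->i_sem);
}
```

In this model, two different dentries that share an inode take and release
the semaphore exactly once, so the double down() no longer occurs.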

Comment 2 Alexander Viro 2005-06-14 09:08:47 UTC
Which distributed fs are we talking about and what other changes of
locking scheme does it make?  If we ever get multiple dentries for
a directory inode, we are in much more trouble than just lock_rename()
deadlock.

Comment 3 Michael Gaughen 2005-06-22 19:41:36 UTC
We are talking about PolyServe's PSFS filesystem.  I haven't tried to reproduce
this problem on other distributed filesystems (e.g. GFS), so I can't say for sure
whether they would encounter this deadlock, though it seems likely.  The problem
is that there is no guarantee that the path_lookup()s inside do_rename()
will return valid old/new dentry/inode pairs when multiple nodes are renaming
within the same path hierarchy.  And (at least for us) that is all right, as our
filesystem can deal with that.  However, lock_rename() deadlocks before we are
even called.  Of course this problem doesn't exist on a single node, and may or
may not exist on other distributed filesystems.

Comment 4 Jiri Pallich 2012-06-20 16:18:03 UTC
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.