Bug 848240

Summary: inodelk hang from marker_rename_release_newp_lock
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterfs
Version: 2.0
Status: CLOSED WONTFIX
Severity: high
Priority: medium
Reporter: Vidya Sakar <vinaraya>
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
QA Contact: storage-qa-internal <storage-qa-internal>
CC: aavati, gluster-bugs, jaw171, jdarcy, rwheeler, saujain, vagarwal, vbellur, vmallika
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Clone Of: 833586
Bug Depends On: 833586
Last Closed: 2015-01-13 11:39:21 UTC

Description Vidya Sakar 2012-08-15 01:28:03 UTC
+++ This bug was initially created as a clone of Bug #833586 +++

A rename has hung on the server side because of the lockup shown below, and the client witnesses a RENAME() RPC timeout followed by a frame bail-out.
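
The bail-out interval on the client side is governed by the frame timeout, i.e. the network.frame-timeout volume option, which defaults to 1800 seconds; the option name and default are assumptions from stock GlusterFS, not taken from this report. As a sketch of a quick sanity check, raising it should only delay the bail-out message here, since the server-side stack below never unwinds:

# gluster volume set vol_home network.frame-timeout 3600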

Excerpt from http://www.pitt.edu/~jaw171/brick-1.1429.dump

[global.callpool.stack.1.frame.1]
ref_count=1
translator=vol_home-server
complete=0

[global.callpool.stack.1.frame.2]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_inodelk_wrapper
wind_to=FIRST_CHILD (this)->fops->inodelk
unwind_from=pl_common_inodelk
unwind_to=iot_inodelk_cbk

[global.callpool.stack.1.frame.3]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=default_inodelk
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=iot_inodelk_cbk
unwind_to=default_inodelk_cbk

[global.callpool.stack.1.frame.4]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_rename_release_oldp_lock
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=default_inodelk_cbk
unwind_to=marker_rename_release_newp_lock

[global.callpool.stack.1.frame.5]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=posix_getxattr
unwind_to=posix_acl_getxattr_cbk

[global.callpool.stack.1.frame.6]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=posix_acl_getxattr_cbk
unwind_to=pl_getxattr_cbk

[global.callpool.stack.1.frame.7]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_getxattr_wrapper
wind_to=FIRST_CHILD (this)->fops->getxattr
unwind_from=pl_getxattr_cbk
unwind_to=iot_getxattr_cbk

[global.callpool.stack.1.frame.8]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=iot_getxattr_cbk
unwind_to=default_getxattr_cbk

[global.callpool.stack.1.frame.9]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_get_oldpath_contribution
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=default_getxattr_cbk
unwind_to=marker_get_newpath_contribution

[global.callpool.stack.1.frame.10]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_inodelk_wrapper
wind_to=FIRST_CHILD (this)->fops->inodelk
unwind_from=pl_common_inodelk
unwind_to=iot_inodelk_cbk

[global.callpool.stack.1.frame.11]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=default_inodelk
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=iot_inodelk_cbk
unwind_to=default_inodelk_cbk

[global.callpool.stack.1.frame.12]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_rename
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=default_inodelk_cbk
unwind_to=marker_rename_inodelk_cbk

[global.callpool.stack.1.frame.13]
ref_count=0
translator=vol_home-marker
complete=0
parent=/brick/1
wind_from=io_stats_rename
wind_to=FIRST_CHILD(this)->fops->rename
unwind_to=io_stats_rename_cbk

[global.callpool.stack.1.frame.14]
ref_count=1
translator=/brick/1
complete=0
parent=vol_home-server
wind_from=server_rename_resume
wind_to=bound_xl->fops->rename
unwind_to=server_rename_cbk

[global.callpool.stack.1.frame.15]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_lookup
wind_to=FIRST_CHILD (this)->fops->lookup
unwind_from=posix_lookup
unwind_to=posix_acl_lookup_cbk

[global.callpool.stack.1.frame.16]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_acl_lookup_cbk
unwind_to=pl_lookup_cbk

[global.callpool.stack.1.frame.17]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_lookup_wrapper
wind_to=FIRST_CHILD (this)->fops->lookup
unwind_from=pl_lookup_cbk
unwind_to=iot_lookup_cbk

[global.callpool.stack.1.frame.18]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=iot_lookup_cbk
unwind_to=default_lookup_cbk

[global.callpool.stack.1.frame.19]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=default_lookup_cbk
unwind_to=marker_lookup_cbk

[global.callpool.stack.1.frame.20]
ref_count=0
translator=vol_home-marker
complete=1
parent=/brick/1
wind_from=io_stats_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=marker_lookup_cbk
unwind_to=io_stats_lookup_cbk

[global.callpool.stack.1.frame.21]
ref_count=0
translator=/brick/1
complete=1
parent=vol_home-server
wind_from=resolve_gfid_cbk
wind_to=BOUND_XL (frame)->fops->lookup
unwind_from=io_stats_lookup_cbk
unwind_to=resolve_gfid_entry_cbk

[global.callpool.stack.1.frame.22]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_lookup
wind_to=FIRST_CHILD (this)->fops->lookup
unwind_from=posix_lookup
unwind_to=posix_acl_lookup_cbk

[global.callpool.stack.1.frame.23]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_acl_lookup_cbk
unwind_to=pl_lookup_cbk

[global.callpool.stack.1.frame.24]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_lookup_wrapper
wind_to=FIRST_CHILD (this)->fops->lookup
unwind_from=pl_lookup_cbk
unwind_to=iot_lookup_cbk

[global.callpool.stack.1.frame.25]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=iot_lookup_cbk
unwind_to=default_lookup_cbk

[global.callpool.stack.1.frame.26]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=default_lookup_cbk
unwind_to=marker_lookup_cbk

[global.callpool.stack.1.frame.27]
ref_count=0
translator=vol_home-marker
complete=1
parent=/brick/1
wind_from=io_stats_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=marker_lookup_cbk
unwind_to=io_stats_lookup_cbk

[global.callpool.stack.1.frame.28]
ref_count=0
translator=/brick/1
complete=1
parent=vol_home-server
wind_from=resolve_gfid
wind_to=BOUND_XL (frame)->fops->lookup
unwind_from=io_stats_lookup_cbk
unwind_to=resolve_gfid_cbk
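
Reading the frames through their wind_from/unwind_to fields, the rename progressed through marker_rename (the initial inodelk), the contribution getxattrs (marker_get_oldpath_contribution / marker_get_newpath_contribution), and the inodelk wound from marker_rename_release_oldp_lock, whose callback unwound into marker_rename_release_newp_lock (frame 4). The vol_home-server, vol_home-marker and /brick/1 frames (1, 13 and 14) are still complete=0, so the rename never unwound back out of the marker translator. A minimal way to pull such stuck frames out of a dump with this layout (a sketch, assuming complete= always sits three lines below the frame header, as it does above) is:

# grep -B 3 '^complete=0' brick-1.1429.dump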

--- Additional comment from saujain on 2012-06-21 07:29:45 EDT ---

I tried to reproduce this, and rename works fine for both small and large files.
glusterfs version: 3.3.0rc2

It would be helpful if we could get some more information, such as:
volume type,
glusterfs version,
or steps to reproduce this issue.

--- Additional comment from jaw171 on 2012-06-21 09:02:14 EDT ---

I had a distributed volume on 3.2.5.  I moved the data out of it, killed the volume and uninstalled GlusterFS (removed vol files, etc.).  I then installed glusterfs-3.3.0-1.el6.x86_64 from the packages on gluster.org and created a new distributed volume (with different bricks) and moved the data back into it.

Could the fact that the files were in another volume previously be causing the issue?  Is it safe to move data from one volume to another?

This issue still occurs on the current volume when one user tries to compile large programs with gcc.

# gluster volume info
Volume Name: vol_home
Type: Distribute
Volume ID: 07ec60be-ec0c-4579-a675-069bb34c12ab
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: storage0-dev.cssd.pitt.edu:/brick/0
Brick2: storage1-dev.cssd.pitt.edu:/brick/2
Brick3: storage0-dev.cssd.pitt.edu:/brick/1
Brick4: storage1-dev.cssd.pitt.edu:/brick/3
Options Reconfigured:
diagnostics.brick-log-level: INFO
diagnostics.client-log-level: INFO
features.limit-usage: /home/cssd/jaw171:50GB,/cssd:200GB,/cssd/jaw171:100GB
nfs.rpc-auth-allow: 10.54.50.*,127.*
auth.allow: 10.54.50.*,127.*
performance.io-cache: off
cluster.min-free-disk: 5
performance.cache-size: 128000000
features.quota: on
nfs.disable: on
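
As an aside, the features.limit-usage value above is a comma-separated list of path:size pairs. Limits of that form are normally maintained through the quota CLI rather than edited directly; a sketch, assuming the 3.3-era command syntax:

# gluster volume quota vol_home limit-usage /cssd/jaw171 100GB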

# uname -r
2.6.32-220.17.1.el6.x86_64

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.2 (Santiago)

# rpm -qa | grep gluster
glusterfs-fuse-3.3.0-1.el6.x86_64
glusterfs-server-3.3.0-1.el6.x86_64
glusterfs-3.3.0-1.el6.x86_64

# gluster --version
glusterfs 3.3.0 built on May 31 2012 11:16:29

Comment 3 Vijaikumar Mallikarjuna 2015-01-13 11:39:21 UTC
This issue is not seen in rhs-3.0.
It will not be fixed in 2.0, so closing as WONTFIX.