+++ This bug was initially created as a clone of Bug #833586 +++

A rename has hung on the server side because of the lockup below, and the client witnesses a RENAME() RPC timeout and bails out. Excerpt from http://www.pitt.edu/~jaw171/brick-1.1429.dump:

[global.callpool.stack.1.frame.1]
ref_count=1
translator=vol_home-server
complete=0

[global.callpool.stack.1.frame.2]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_inodelk_wrapper
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=pl_common_inodelk
unwind_to=iot_inodelk_cbk

[global.callpool.stack.1.frame.3]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=default_inodelk
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=iot_inodelk_cbk
unwind_to=default_inodelk_cbk

[global.callpool.stack.1.frame.4]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_rename_release_oldp_lock
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=default_inodelk_cbk
unwind_to=marker_rename_release_newp_lock

[global.callpool.stack.1.frame.5]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=posix_getxattr
unwind_to=posix_acl_getxattr_cbk

[global.callpool.stack.1.frame.6]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=posix_acl_getxattr_cbk
unwind_to=pl_getxattr_cbk

[global.callpool.stack.1.frame.7]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_getxattr_wrapper
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=pl_getxattr_cbk
unwind_to=iot_getxattr_cbk

[global.callpool.stack.1.frame.8]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_getxattr
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=iot_getxattr_cbk
unwind_to=default_getxattr_cbk

[global.callpool.stack.1.frame.9]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_get_oldpath_contribution
wind_to=FIRST_CHILD(this)->fops->getxattr
unwind_from=default_getxattr_cbk
unwind_to=marker_get_newpath_contribution

[global.callpool.stack.1.frame.10]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_inodelk_wrapper
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=pl_common_inodelk
unwind_to=iot_inodelk_cbk

[global.callpool.stack.1.frame.11]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=default_inodelk
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=iot_inodelk_cbk
unwind_to=default_inodelk_cbk

[global.callpool.stack.1.frame.12]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_rename
wind_to=FIRST_CHILD(this)->fops->inodelk
unwind_from=default_inodelk_cbk
unwind_to=marker_rename_inodelk_cbk

[global.callpool.stack.1.frame.13]
ref_count=0
translator=vol_home-marker
complete=0
parent=/brick/1
wind_from=io_stats_rename
wind_to=FIRST_CHILD(this)->fops->rename
unwind_to=io_stats_rename_cbk

[global.callpool.stack.1.frame.14]
ref_count=1
translator=/brick/1
complete=0
parent=vol_home-server
wind_from=server_rename_resume
wind_to=bound_xl->fops->rename
unwind_to=server_rename_cbk

[global.callpool.stack.1.frame.15]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_lookup
unwind_to=posix_acl_lookup_cbk

[global.callpool.stack.1.frame.16]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_acl_lookup_cbk
unwind_to=pl_lookup_cbk

[global.callpool.stack.1.frame.17]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_lookup_wrapper
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=pl_lookup_cbk
unwind_to=iot_lookup_cbk

[global.callpool.stack.1.frame.18]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=iot_lookup_cbk
unwind_to=default_lookup_cbk

[global.callpool.stack.1.frame.19]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=default_lookup_cbk
unwind_to=marker_lookup_cbk

[global.callpool.stack.1.frame.20]
ref_count=0
translator=vol_home-marker
complete=1
parent=/brick/1
wind_from=io_stats_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=marker_lookup_cbk
unwind_to=io_stats_lookup_cbk

[global.callpool.stack.1.frame.21]
ref_count=0
translator=/brick/1
complete=1
parent=vol_home-server
wind_from=resolve_gfid_cbk
wind_to=BOUND_XL(frame)->fops->lookup
unwind_from=io_stats_lookup_cbk
unwind_to=resolve_gfid_entry_cbk

[global.callpool.stack.1.frame.22]
ref_count=0
translator=vol_home-posix
complete=1
parent=vol_home-access-control
wind_from=posix_acl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_lookup
unwind_to=posix_acl_lookup_cbk

[global.callpool.stack.1.frame.23]
ref_count=0
translator=vol_home-access-control
complete=1
parent=vol_home-locks
wind_from=pl_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=posix_acl_lookup_cbk
unwind_to=pl_lookup_cbk

[global.callpool.stack.1.frame.24]
ref_count=0
translator=vol_home-locks
complete=1
parent=vol_home-io-threads
wind_from=iot_lookup_wrapper
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=pl_lookup_cbk
unwind_to=iot_lookup_cbk

[global.callpool.stack.1.frame.25]
ref_count=0
translator=vol_home-io-threads
complete=1
parent=vol_home-index
wind_from=index_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=iot_lookup_cbk
unwind_to=default_lookup_cbk

[global.callpool.stack.1.frame.26]
ref_count=0
translator=vol_home-index
complete=1
parent=vol_home-marker
wind_from=marker_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=default_lookup_cbk
unwind_to=marker_lookup_cbk

[global.callpool.stack.1.frame.27]
ref_count=0
translator=vol_home-marker
complete=1
parent=/brick/1
wind_from=io_stats_lookup
wind_to=FIRST_CHILD(this)->fops->lookup
unwind_from=marker_lookup_cbk
unwind_to=io_stats_lookup_cbk

[global.callpool.stack.1.frame.28]
ref_count=0
translator=/brick/1
complete=1
parent=vol_home-server
wind_from=resolve_gfid
wind_to=BOUND_XL(frame)->fops->lookup
unwind_from=io_stats_lookup_cbk
unwind_to=resolve_gfid_cbk
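Reading the dump: the only pending frames (complete=0) belong to the RENAME itself (frames 1, 13 and 14). Every inodelk and getxattr that the marker translator wound for its quota accounting (frames 2 through 12) has already unwound, the last completed step being the release of the old-parent lock (frame 4, unwinding into marker_rename_release_newp_lock); frames 15 through 28 are completed resolver LOOKUPs. The call therefore appears to be stuck inside vol_home-marker.

For anyone trying to reproduce this, a statedump like the one above can be captured roughly as follows. This is a sketch: the output directory varies with the version and the server.statedump-path setting (the /tmp filename below is guessed from the linked excerpt's name), and the grep offset assumes the field order shown above.

# gluster volume statedump vol_home

(on GlusterFS 3.3 and later; sending SIGUSR1 to an individual glusterfsd process has the same effect)

# grep -B3 'complete=0' /tmp/brick-1.*.dump

(-B3 pulls in the frame header, ref_count and translator lines that precede each complete= field, so this lists exactly the frames that have not yet unwound)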
--- Additional comment from saujain on 2012-06-21 07:29:45 EDT ---

I tried to reproduce this, and rename works fine for both small and large files (glusterfs version: 3.3.0rc2). It would be helpful to have more information, such as the volume type, the glusterfs version, and steps to reproduce the issue.

--- Additional comment from jaw171 on 2012-06-21 09:02:14 EDT ---

I had a distributed volume on 3.2.5. I moved the data out of it, killed the volume, and uninstalled GlusterFS (removed the vol files, etc.). I then installed glusterfs-3.3.0-1.el6.x86_64 from the packages on gluster.org, created a new distributed volume (with different bricks), and moved the data back into it. Could the fact that the files were previously in another volume be causing the issue? Is it safe to move data from one volume to another? The issue still occurs on the current volume when one user tries to compile large programs with gcc.

# gluster volume info

Volume Name: vol_home
Type: Distribute
Volume ID: 07ec60be-ec0c-4579-a675-069bb34c12ab
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: storage0-dev.cssd.pitt.edu:/brick/0
Brick2: storage1-dev.cssd.pitt.edu:/brick/2
Brick3: storage0-dev.cssd.pitt.edu:/brick/1
Brick4: storage1-dev.cssd.pitt.edu:/brick/3
Options Reconfigured:
diagnostics.brick-log-level: INFO
diagnostics.client-log-level: INFO
features.limit-usage: /home/cssd/jaw171:50GB,/cssd:200GB,/cssd/jaw171:100GB
nfs.rpc-auth-allow: 10.54.50.*,127.*
auth.allow: 10.54.50.*,127.*
performance.io-cache: off
cluster.min-free-disk: 5
performance.cache-size: 128000000
features.quota: on
nfs.disable: on

# uname -r
2.6.32-220.17.1.el6.x86_64

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)

# rpm -qa | grep gluster
glusterfs-fuse-3.3.0-1.el6.x86_64
glusterfs-server-3.3.0-1.el6.x86_64
glusterfs-3.3.0-1.el6.x86_64

# gluster --version
glusterfs 3.3.0 built on May 31 2012 11:16:29
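One detail worth noting from the volume info: features.quota is on, and quota is what loads the marker translator whose lock/getxattr sequence is pending in the dump above. As a diagnostic step (not a proposed fix), it may be worth checking whether the hang still reproduces with quota, and with it marker's rename bookkeeping, out of the path. A sketch, assuming the three configured limits are re-created afterwards, since disabling quota drops them:

# gluster volume quota vol_home disable

(retest the gcc compile workload that triggers the hang)

# gluster volume quota vol_home enable
# gluster volume quota vol_home limit-usage /home/cssd/jaw171 50GB
# gluster volume quota vol_home limit-usage /cssd 200GB
# gluster volume quota vol_home limit-usage /cssd/jaw171 100GB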
This issue is not seen in rhs-3.0. It will not be fixed in 2.0, so closing as WONTFIX.