+++ This bug was initially created as a clone of Bug #1178619 +++

Description of problem:
The rebalance process hangs in quota's statfs call and fails after a timeout.

###################################################################
1. Created a 6x2 dist-rep volume.
2. Ran the ACA script, which does deep directory creation and renames directories and files.
3. While the script was running, performed add-brick and started rebalance.

Result: Rebalance hangs for 1800 seconds (the call-bail timeout) and then runs to completion.

statedump:
--------------
[global.callpool.stack.1.frame.1]
ref_count=1
translator=test-server
complete=0

[global.callpool.stack.1.frame.2]
ref_count=0
translator=test-quota
complete=0
parent=/brick2/test7
wind_from=io_stats_statfs
wind_to=FIRST_CHILD(this)->fops->statfs
unwind_to=io_stats_statfs_cbk

[global.callpool.stack.1.frame.3]
ref_count=1
translator=/brick2/test7
complete=0
parent=test-server
wind_from=server_statfs_resume
wind_to=bound_xl->fops->statfs
unwind_to=server_statfs_cbk

From rebalance logs
===========
[2015-01-03 14:49:59.065353] E [rpc-clnt.c:201:call_bail] 0-test-client-1: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid = 0x794 sent = 2015-01-03 14:19:58.397959. timeout = 1800 for 10.70.44.70:49152

Version-Release number of selected component (if applicable):

How reproducible:
When ancestry building fails, the error is not handled properly and the frame is lost. We saw an error in the brick log saying that an open failed on the same gfid on which the statfs was issued; that open was most likely issued by the ancestry-building code in quota. (A sketch illustrating the lost-frame pattern follows the comment history below.)

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Anand Avati on 2015-01-05 01:52:17 EST ---

REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#4) for review on master by Raghavendra G (rgowdapp)

--- Additional comment from Anand Avati on 2015-04-30 03:42:31 EDT ---

REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#5) for review on master by Vijaikumar Mallikarjuna (vmallika)

--- Additional comment from Niels de Vos on 2015-05-22 06:21:36 EDT ---

I've dropped this bug from the glusterfs-3.7.1 tracker. Please clone this bug and have the clone depend on 1178619 (this bug) and block "glusterfs-3.7.1".

--- Additional comment from Anand Avati on 2015-05-28 00:23:31 EDT ---

REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#6) for review on master by Raghavendra G (rgowdapp)
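The statedump and the call_bail log above are two views of the same defect: a statfs frame that was wound but never unwound (complete=0), so the client RPC layer gives up only when network.frame-timeout (1800 seconds by default) expires. Below is a minimal, hypothetical C sketch of that pattern in a GlusterFS-style translator fop; example_statfs and example_build_ancestry are made-up names for illustration, not the actual quota code.

/* Hypothetical illustration only -- not the quota translator source.
 * An error path that returns without unwinding leaves the frame stuck
 * in the call pool (complete=0 in a statedump) until the client bails
 * out after network.frame-timeout. */
#include <errno.h>
#include "xlator.h"    /* GlusterFS translator API (assumed in-tree build) */
#include "defaults.h"  /* default_statfs_cbk */

/* made-up stand-in for the ancestry-building step; pretend it fails */
static int
example_build_ancestry (call_frame_t *frame, xlator_t *this, loc_t *loc)
{
        return -ENOENT;
}

int32_t
example_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{
        int ret = example_build_ancestry (frame, this, loc);

        if (ret < 0) {
                /* Buggy pattern: a bare "return 0;" here drops the frame.
                 * Nothing ever unwinds it, the parent io-stats/server
                 * frames keep waiting, and the client logs call_bail
                 * 1800 seconds later.
                 *
                 * Correct pattern: always terminate the fop, e.g. by
                 * unwinding with an error: */
                STACK_UNWIND_STRICT (statfs, frame, -1, -ret, NULL, NULL);
                return 0;
        }

        /* Normal path: wind the call down; the child's reply will
         * unwind this frame through default_statfs_cbk. */
        STACK_WIND (frame, default_statfs_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->statfs, loc, xdata);
        return 0;
}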
REVIEW: http://review.gluster.org/11790 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#1) for review on release-3.6 by Vijaikumar Mallikarjuna (vmallika)
COMMIT: http://review.gluster.org/11790 committed in release-3.6 by Raghavendra Bhat (raghavendra)
------
commit dfa2bfb289cc73ade0e441f2e2ee88d0d819d48d
Author: vmallika <vmallika>
Date:   Wed Jul 29 16:19:12 2015 +0530

    features/quota: prevent statfs frame-loss when an error happens during
    ancestry building.

    This is a backport of http://review.gluster.org/#/c/9380/

    We do quota_build_ancestry in the function 'quota_get_limit_dir'. If
    quota_build_ancestry fails, we don't have a frame saved to continue
    the statfs FOP and the client can hang.

    > Change-Id: I92e25c1510d09444b9d4810afdb6b2a69dcd92c0
    > BUG: 1178619
    > Signed-off-by: Raghavendra G <rgowdapp>
    > Signed-off-by: vmallika <vmallika>
    > Reviewed-on: http://review.gluster.org/9380
    > Tested-by: Gluster Build System <jenkins.com>

    Change-Id: Ia25cf738250fdc2c766f96c26e3c31093d534aba
    BUG: 1247959
    Signed-off-by: vmallika <vmallika>
    Reviewed-on: http://review.gluster.org/11790
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>
    Reviewed-by: Raghavendra G <rgowdapp>
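For context, the following is a simplified, hypothetical sketch of the error handling described by the commit message above; it is not the actual patch (see the review link for that). The sketch_* names are invented, the real ancestry building is asynchronous rather than the synchronous return code shown here, and the pass-through fallback on failure is an assumption about reasonable handling, not a claim about what the patch does.

/* Simplified, hypothetical sketch -- NOT the code from review 11790. */
#include <errno.h>
#include "xlator.h"    /* GlusterFS translator API (assumed in-tree build) */
#include "defaults.h"  /* default_statfs_cbk */

/* stand-in for quota_build_ancestry(); pretend it fails the way the
 * brick log suggested (an open on the gfid failed) */
static int
sketch_build_ancestry (call_frame_t *frame, loc_t *loc)
{
        return -ESTALE;
}

/* stand-in for quota_get_limit_dir(): report the failure to the caller
 * instead of swallowing it */
static int
sketch_get_limit_dir (call_frame_t *frame, loc_t *loc)
{
        int ret = sketch_build_ancestry (frame, loc);
        if (ret < 0)
                return ret;  /* before the fix, an error at this point left
                              * the statfs frame with no continuation */

        /* ... normal quota-aware handling elided ... */
        return 0;
}

int32_t
sketch_quota_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc,
                     dict_t *xdata)
{
        int ret = sketch_get_limit_dir (frame, loc);

        if (ret < 0)
                /* Assumed fallback: skip quota accounting and pass the
                 * statfs straight through, so the fop still completes and
                 * no frame is lost.  (The real success path uses quota's
                 * own callback to adjust the reported sizes.) */
                gf_log (this->name, GF_LOG_WARNING,
                        "ancestry building failed (errno %d), passing through",
                        -ret);

        STACK_WIND (frame, default_statfs_cbk, FIRST_CHILD (this),
                    FIRST_CHILD (this)->fops->statfs, loc, xdata);
        return 0;
}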
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.6.5, please open a new bug report.

glusterfs-3.6.5 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-devel/2015-August/046570.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user