Bug 1178619 - Statfs is hung because of frame loss in quota
Summary: Statfs is hung because of frame loss in quota
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: quota
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Raghavendra G
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: qe_tracker_everglades 1226162 1226792 1247959
 
Reported: 2015-01-05 06:50 UTC by Raghavendra G
Modified: 2019-12-31 07:17 UTC
CC List: 6 users

Fixed In Version: glusterfs-3.8rc2
Clone Of:
Clones: 1226162 1226792 1247959
Environment:
Last Closed: 2016-06-16 12:41:05 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:
ykaul: needinfo+



Description Raghavendra G 2015-01-05 06:50:32 UTC
Description of problem:
The rebalance process hangs in quota's statfs call and fails after a timeout.
###################################################################
1. Created a 6x2 distributed-replicate volume
2. Ran the ACA script, which does deep directory creation and renames directories and files
3. While the script was running, performed an add-brick and started rebalance

Result:
Rebalance hangs for 1800 seconds, which is the call-bail timeout, and then runs to completion.


statedump:
--------------
[global.callpool.stack.1.frame.1]
ref_count=1
translator=test-server
complete=0

[global.callpool.stack.1.frame.2]
ref_count=0
translator=test-quota
complete=0
parent=/brick2/test7
wind_from=io_stats_statfs
wind_to=FIRST_CHILD(this)->fops->statfs
unwind_to=io_stats_statfs_cbk

[global.callpool.stack.1.frame.3]
ref_count=1
translator=/brick2/test7
complete=0
parent=test-server
wind_from=server_statfs_resume
wind_to=bound_xl->fops->statfs
unwind_to=server_statfs_cbk
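
For context on reading the frames above: frame 2 is the frame that io-stats wound into quota (translator=test-quota); ref_count=0 and complete=0 mean quota never unwound the statfs back to io_stats_statfs_cbk, so the call is stuck inside the brick's quota translator. Below is a minimal sketch of the wind/unwind pattern those statedump fields describe, assuming standard GlusterFS xlator conventions; the names are illustrative, not the actual quota.c code:

/* Sketch of the wind/unwind pattern behind frames 2 and 3 above
 * (illustrative names, not the actual quota.c code); builds against the
 * GlusterFS xlator headers. */

#include <sys/statvfs.h>
#include "xlator.h"   /* glusterfs xlator framework (header path varies by release) */

static int32_t
example_statfs_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
                   int32_t op_ret, int32_t op_errno,
                   struct statvfs *buf, dict_t *xdata)
{
    /* Unwinding is what flips "complete" to 1 in the statedump and hands
     * the reply back to the parent translator (io_stats_statfs_cbk). */
    STACK_UNWIND_STRICT(statfs, frame, op_ret, op_errno, buf, xdata);
    return 0;
}

static int32_t
example_statfs(call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{
    /* STACK_WIND is what the statedump records as
     * wind_to=FIRST_CHILD(this)->fops->statfs. A frame that is wound but
     * never unwound stays at complete=0, exactly as frame 2 does. */
    STACK_WIND(frame, example_statfs_cbk, FIRST_CHILD(this),
               FIRST_CHILD(this)->fops->statfs, loc, xdata);
    return 0;
}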


From rebalance logs
===========
[2015-01-03 14:49:59.065353] E [rpc-clnt.c:201:call_bail]
0-test-client-1: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid =
0x794 sent = 2015-01-03 14:19:58.397959. timeout = 1800 for
10.70.44.70:49152

Version-Release number of selected component (if applicable):


How reproducible:
When ancestry building fails, the error is not handled properly and the frame is lost. We saw an error log in the brick process which said an open failed on the same gfid on which the statfs was issued; this open was most likely issued as part of the ancestry-building code in quota.
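
To make the failure mode concrete, here is an illustrative sketch (hypothetical callback name and simplified arguments, not the actual review.gluster.org/9380 patch) of how an unhandled error in an ancestry-building callback leaks the statfs frame, and the general shape of the fix, namely that every error path must still unwind:

/* Hypothetical ancestry-building callback; the real quota code carries
 * more arguments. Same GlusterFS xlator headers as the sketch above.
 * Shown only to illustrate "every path must unwind". */
static int32_t
example_ancestry_built_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
                           int32_t op_ret, int32_t op_errno, dict_t *xdata)
{
    if (op_ret < 0) {
        /* Buggy pattern: a bare "return 0;" here drops the statfs frame
         * on the floor; the client side waits until call_bail() fires
         * after network.frame-timeout (1800 seconds by default), which is
         * exactly the hang seen by rebalance. */

        /* Fixed pattern: always answer the caller. Failing the statfs
         * with the ancestry error is one way; the actual patch may
         * instead fall back to an unmodified statfs of the child. */
        STACK_UNWIND_STRICT(statfs, frame, -1, op_errno, NULL, NULL);
        return 0;
    }

    /* Success path: continue with the quota-adjusted statfs (elided). */
    return 0;
}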

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Anand Avati 2015-01-05 06:52:17 UTC
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#4) for review on master by Raghavendra G (rgowdapp)

Comment 2 Anand Avati 2015-04-30 07:42:31 UTC
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#5) for review on master by Vijaikumar Mallikarjuna (vmallika)

Comment 3 Niels de Vos 2015-05-22 10:21:36 UTC
I've dropped this bug from the glusterfs-3.7.1 tracker. Please clone this bug and have the clone depend on 1178619 (this bug) and block "glusterfs-3.7.1".

Comment 4 Anand Avati 2015-05-28 04:23:31 UTC
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs frame-loss when an error happens during ancestry building.) posted (#6) for review on master by Raghavendra G (rgowdapp)

Comment 5 Nagaprasad Sathyanarayana 2015-10-25 14:47:45 UTC
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ, fixed in a GlusterFS release, has been closed; hence this mainline BZ is being closed as well.

Comment 6 Niels de Vos 2016-06-16 12:41:05 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

