+++ This bug was initially created as a clone of Bug #1211220 +++
+++ This bug was initially created as a clone of Bug #1039674 +++

Description of problem:
When running quota automation I occasionally (1 in 10 runs?) see the following testcase fail:

1. Create a 6x2 volume and start it.
2. gluster volume quota <vol-name> enable
3. gluster volume quota <vol-name> limit-usage / 5GB
4. gluster volume quota <vol-name> list
5. mount -t nfs/glusterfs/(or mount using SMB) <server-ip>:<vol-name> <mount-point>
6. Start creating data inside the mount-point until the limit is reached, using files of size 2MB.
Meanwhile:
7. gluster volume quota <vol-name> soft-timeout 30s
8. gluster volume quota <vol-name> hard-timeout 60s
After data creation is completed:
10. gluster volume quota <vol-name> list

On the client side I see:
dd: opening `/quota-mount/tcms_285026/test.file': Transport endpoint is not connected

And in the brick logs I see:
/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06 17:59:02.743336] W [quota-enforcer-client.c:187:quota_enforcer_lookup_cbk] 0-quota-test-setup-quota: remote operation failed: Transport endpoint is not connected. Path: /tcms_285026 (d892ce24-7e59-4eeb-b86f-7c7d34c71317)
/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06 17:59:02.743377] I [server-rpc-fops.c:1618:server_create_cbk] 0-quota-test-setup-server: 26: CREATE /tcms_285026/test.file (d892ce24-7e59-4eeb-b86f-7c7d34c71317/test.file) ==> (Transport endpoint is not connected)

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.44.1u2rhs-1.el6rhs.x86_64

How reproducible:
So far this looks to be about 1 in 10 runs.

Steps to Reproduce:
Same as the numbered steps in the description above.

Actual results:
I/O errors are occasionally hit when the hard/soft timeout is modified with data in flight.

Expected results:
I/Os complete successfully when timeouts are modified.

Additional info:
I'll try to provide a more concrete reproducer.

--- Additional comment from Vijaikumar Mallikarjuna on 2015-03-03 03:59:22 EST ---
Hi Ben, I am not able to re-create this issue with the 3.6 release.

--- Additional comment from Vijaikumar Mallikarjuna on 2015-04-13 06:41:06 EDT ---
Whenever a new volume is created, quotad gets restarted. This can cause ENOTCONN in the IO path of other volumes.

--- Additional comment from Anand Avati on 2015-04-14 04:42:13 EDT ---
REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on ENOTCONN error) posted (#1) for review on master by Vijaikumar Mallikarjuna (vmallika)

--- Additional comment from Anand Avati on 2015-04-24 02:20:08 EDT ---
REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on ENOTCONN error) posted (#2) for review on master by Vijaikumar Mallikarjuna (vmallika)

--- Additional comment from Anand Avati on 2015-05-28 00:53:18 EDT ---
REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on ENOTCONN error) posted (#3) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2015-05-29 03:28:48 EDT ---
REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on ENOTCONN error) posted (#4) for review on master by Vijaikumar Mallikarjuna (vmallika)
REVIEW: http://review.gluster.org/11024 (quota: retry connecting to quotad on ENOTCONN error) posted (#1) for review on release-3.7 by Vijaikumar Mallikarjuna (vmallika)
REVIEW: http://review.gluster.org/11024 (quota: retry connecting to quotad on ENOTCONN error) posted (#2) for review on release-3.7 by Vijaikumar Mallikarjuna (vmallika)
REVIEW: http://review.gluster.org/11024 (quota: retry connecting to quotad on ENOTCONN error) posted (#3) for review on release-3.7 by Sachin Pandit (spandit)
REVIEW: http://review.gluster.org/11024 (quota: retry connecting to quotad on ENOTCONN error) posted (#4) for review on release-3.7 by Sachin Pandit (spandit)
COMMIT: http://review.gluster.org/11024 committed in release-3.7 by Raghavendra G (rgowdapp)
------
commit e29f29ac18dc20187934d3da75ea7b55c6dcfb37
Author: vmallika <vmallika>
Date: Tue Apr 14 10:44:13 2015 +0530

    quota: retry connecting to quotad on ENOTCONN error

    This is a backport of http://review.gluster.org/#/c/10230/

    > Suppose if there are two volumes vol1 and vol2,
    > and quota is enabled and limit is set on vol1.
    > Now if IO is happening on vol1 and quota is enabled/disabled
    > on vol2, quotad gets restarted and client will receive
    > ENOTCONN in the IO path of vol1.
    >
    > This patch will retry connecting to quotad upto 60sec
    > in a interval of 5sec (12 retries)
    > If not able to connect with 12 retries, then return ENOTCONN
    >
    > Change-Id: Ie7f5d108633ec68ba9cc3a6a61d79680485193e8
    > BUG: 1211220
    > Signed-off-by: vmallika <vmallika>
    > Reviewed-on: http://review.gluster.org/10230
    > Tested-by: Gluster Build System <jenkins.com>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Tested-by: Raghavendra G <rgowdapp>

    Change-Id: I94d8d4a814a73d69e934f3e77e989e5f3bf2e65a
    BUG: 1226789
    Signed-off-by: vmallika <vmallika>
    Reviewed-on: http://review.gluster.org/11024
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
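For readers who want to see the retry policy from the commit message in concrete form, here is a minimal Python sketch. It is not the GlusterFS patch itself (that is C code in the quota translator) and it models only what the commit message states: retry on ENOTCONN every 5 seconds, at most 12 times, then give up and return ENOTCONN. The names call_with_reconnect and make_flaky_lookup, and the convention that an operation returns 0 on success or an errno value on failure, are assumptions made for illustration.

```python
import errno
import time
from typing import Callable

RETRY_INTERVAL_SEC = 5  # interval stated in the commit message
MAX_RETRIES = 12        # 12 retries * 5 s = 60 s, as stated in the commit message


def call_with_reconnect(op: Callable[[], int],
                        max_retries: int = MAX_RETRIES,
                        interval: float = RETRY_INTERVAL_SEC,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Run op() and retry it while it reports ENOTCONN.

    op is a hypothetical stand-in for a lookup sent to quotad. It returns
    0 on success or an errno value on failure. Only ENOTCONN is retried;
    any other result is returned to the caller immediately.
    """
    result = op()
    retries = 0
    while result == errno.ENOTCONN and retries < max_retries:
        sleep(interval)  # wait before trying again
        retries += 1
        result = op()
    return result  # still ENOTCONN if every retry failed


def make_flaky_lookup(failures: int) -> Callable[[], int]:
    """Build a fake lookup that fails with ENOTCONN `failures` times, then succeeds."""
    state = {"left": failures}

    def lookup() -> int:
        if state["left"] > 0:
            state["left"] -= 1
            return errno.ENOTCONN
        return 0

    return lookup


if __name__ == "__main__":
    # Simulate quotad being unavailable for the first three lookups.
    # A no-op sleep is injected so the demo does not actually wait.
    rc = call_with_reconnect(make_flaky_lookup(3), sleep=lambda _s: None)
    print("result:", rc)
```

Passing the sleep function as a parameter is a choice made for this sketch so the policy can be exercised without waiting a full minute. It says nothing about how the actual patch schedules its retries.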
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.2, please reopen this bug report.

glusterfs-3.7.2 has been announced on the Gluster Packaging mailing list [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2015-June/000006.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user