Description of problem:
When running quota automation I occasionally (roughly 1 in 10 runs) see a test case fail; the steps are the same as under "Steps to Reproduce" below.

Client side I see:
dd: opening `/quota-mount/tcms_285026/test.file': Transport endpoint is not connected

And in the brick logs I see:
/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06 17:59:02.743336] W [quota-enforcer-client.c:187:quota_enforcer_lookup_cbk] 0-quota-test-setup-quota: remote operation failed: Transport endpoint is not connected. Path: /tcms_285026 (d892ce24-7e59-4eeb-b86f-7c7d34c71317)
/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06 17:59:02.743377] I [server-rpc-fops.c:1618:server_create_cbk] 0-quota-test-setup-server: 26: CREATE /tcms_285026/test.file (d892ce24-7e59-4eeb-b86f-7c7d34c71317/test.file) ==> (Transport endpoint is not connected)

Version-Release number of selected component (if applicable):
glusterfs-server-3.4.0.44.1u2rhs-1.el6rhs.x86_64

How reproducible:
So far, roughly 1 in 10 runs.

Steps to Reproduce:
1. Create a 6x2 volume and start it.
2. gluster volume quota <vol-name> enable
3. gluster volume quota <vol-name> limit-usage / 5GB
4. gluster volume quota <vol-name> list
5. Mount the volume over NFS, GlusterFS (FUSE), or SMB, e.g.: mount -t nfs <server-ip>:<vol-name> <mount-point>
6. Start creating data (files of size 2 MB) inside the mount point until the limit is reached.
7. Meanwhile, run: gluster volume quota <vol-name> soft-timeout 30s
8. And: gluster volume quota <vol-name> hard-timeout 60s
9. After data creation is completed: gluster volume quota <vol-name> list

Actual results:
I/O errors are occasionally hit when the hard/soft timeout is modified with data in flight.

Expected results:
I/Os complete successfully when the timeouts are modified.

Additional info:
I'll try to provide a more concrete reproducer.
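The steps above can be consolidated into a rough reproducer sketch. This is only an outline, not the automation actually used: VOL, MNT, and SERVER are placeholder names, and the gluster/mount commands are left as comments because they need a live 6x2 cluster. Only the data-creation loop from step 6 is active code.

```shell
#!/bin/sh
# Placeholder names (not from the original report's automation):
VOL=quota-test-setup
MNT=/quota-mount
SERVER=server-ip

# Steps 1-5 and 7-8 assume a running gluster cluster, so they are shown as comments:
#   gluster volume quota $VOL enable
#   gluster volume quota $VOL limit-usage / 5GB
#   gluster volume quota $VOL list
#   mount -t glusterfs $SERVER:$VOL $MNT
#   (meanwhile, from another shell:)
#   gluster volume quota $VOL soft-timeout 30s
#   gluster volume quota $VOL hard-timeout 60s

# Step 6: create 2 MB files until a write fails (e.g. the quota is hit)
# or an upper cap on the file count is reached.
make_files() {
    dir=$1; max=$2; i=0
    while [ "$i" -lt "$max" ]; do
        dd if=/dev/zero of="$dir/test.$i" bs=1M count=2 2>/dev/null || break
        i=$((i + 1))
    done
}
# Example: 2560 files * 2 MB = 5 GB, matching the limit-usage setting above.
#   make_files "$MNT/tcms_285026" 2560
```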
Hi Ben, I am not able to re-create this issue with the 3.6 release.
Whenever a new volume is created, quotad gets restarted. This can cause ENOTCONN in the I/O path of the other volumes.
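One rough way to observe this on a gluster node is to compare quotad's PID before and after creating a volume. This is a sketch under assumptions not stated in the report: it assumes quotad's command line contains the string "quotad" (so pgrep -f matches it), and "newvol" and the brick path are placeholders. The gluster commands are commented out because they need a live cluster; only the PID-comparison helper is active code.

```shell
#!/bin/sh
# pid_changed OLD NEW -> exit status 0 if both PIDs are non-empty and differ,
# i.e. the daemon was restarted between the two samples.
pid_changed() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# Usage on a gluster node (hypothetical volume/brick names):
#   before=$(pgrep -f quotad)
#   gluster volume create newvol <host>:/bricks/newvol_brick force
#   gluster volume start newvol
#   after=$(pgrep -f quotad)
#   pid_changed "$before" "$after" && echo "quotad was restarted"
```

Since quotad is a single per-node daemon serving every volume, a restart triggered by one volume's creation can briefly break quota enforcement lookups for the others, which is consistent with the ENOTCONN seen in the brick logs above.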
submitted upstream patch: http://review.gluster.org/10230
Verified on glusterfs-3.7.1-7.
Hi Vijai,

The doc text is updated. Please review it and share your technical review comments. If it looks okay, please sign off on it.

Regards,
Bhavana
Doc-text looks good to me.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html