Description of problem:
========================
As part of validating the fix for bug 1224180 - "Getting EIO instead of EDQUOT when limit exceeds in disperse volume", I tested the case from multiple clients and found that the issue still exists: I see an I/O error on the first few files once the quota is exceeded.

Raising a new bug, since bug 1224180 addressed the case where, on a disperse volume, IO failed with "Input/output error" instead of "Disk quota exceeded" even from a single client. Discussed with dev, who mentioned that 1224180 was a quota issue, while this one is more of an EC issue related to write sizes.

Version-Release number of selected component (if applicable):
==================================
glusterfs-cli-3.7.9-6.el7rhgs.x86_64
glusterfs-libs-3.7.9-6.el7rhgs.x86_64
glusterfs-fuse-3.7.9-6.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-6.el7rhgs.x86_64
glusterfs-server-3.7.9-6.el7rhgs.x86_64
python-gluster-3.7.9-5.el7rhgs.noarch
glusterfs-3.7.9-6.el7rhgs.x86_64
glusterfs-api-3.7.9-6.el7rhgs.x86_64

How reproducible:
==========
Always

Steps to Reproduce:
=======================
1. Create a 2x(4+2) disperse volume.
2. Mount the volume on two clients.
3. Create directories dir/dir1 and dir/dir2 (one for each client) and start creating files in a loop, say 1 GB each.
4. Enable quota. No errors or IO issues should be seen.
5. Set a quota limit of, say, 10 GB on dir. Once the quota limit is reached, the user must see "Disk quota exceeded" instead of the previous wrong error "Input/output error". ---> this step fails
6. Extend the quota limit to 100 GB for dir1. IO must continue, as that quota is not hit.
7. Reduce the quota back to, say, 15 GB. Once the quota limit is reached, the user must again see "Disk quota exceeded" instead of "Input/output error". ---> this step fails

(A CLI sketch of these steps follows the client output below.)

client1:
[root@dhcp35-103 103]# for i in {1..30};do dd if=/dev/urandom of=../126/f103.$i bs=1024 count=100000;done
100000+0 records in
100000+0 records out
102400000 bytes (102 MB) copied, 15.6082 s, 6.6 MB/s
dd: error writing ‘../126/f103.2’: Input/output error
dd: closing output file ‘../126/f103.2’: Input/output error
dd: failed to open ‘../126/f103.3’: Disk quota exceeded
dd: failed to open ‘../126/f103.4’: Disk quota exceeded
dd: failed to open ‘../126/f103.5’: Disk quota exceeded
dd: failed to open ‘../126/f103.6’: Disk quota exceeded
dd: failed to open ‘../126/f103.7’: Disk quota exceeded
dd: failed to open ‘../126/f103.8’: Disk quota exceeded
dd: failed to open ‘../126/f103.9’: Disk quota exceeded
dd: failed to open ‘../126/f103.10’: Disk quota exceeded
dd: failed to open ‘../126/f103.11’: Disk quota exceeded
dd: failed to open ‘../126/f103.12’: Disk quota exceeded

client2:
[root@dhcp35-126 126]# for i in {1..30};do dd if=/dev/urandom of=nff.$i bs=1024 count=1000000;done
dd: error writing ‘nff.1’: Input/output error
dd: closing output file ‘nff.1’: Input/output error
dd: failed to open ‘nff.2’: Disk quota exceeded
dd: failed to open ‘nff.3’: Disk quota exceeded
dd: error writing ‘nff.4’: Input/output error
dd: closing output file ‘nff.4’: Input/output error
dd: failed to open ‘nff.5’: Disk quota exceeded
dd: failed to open ‘nff.6’: Disk quota exceeded
dd: failed to open ‘nff.7’: Disk quota exceeded
dd: failed to open ‘nff.8’: Disk quota exceeded
dd: failed to open ‘nff.9’: Disk quota exceeded
dd: failed to open ‘nff.10’: Disk quota exceeded
dd: failed to open ‘nff.11’: Disk quota exceeded
dd: failed to open ‘nff.12’: Disk quota exceeded
dd: failed to open ‘nff.13’: Disk quota exceeded
dd: failed to open ‘nff.14’: Disk quota exceeded
dd: failed to open ‘nff.15’: Disk quota exceeded
dd: failed to open ‘nff.16’: Disk quota exceeded
dd: failed to open ‘nff.17’: Disk quota exceeded
dd: failed to open ‘nff.18’: Disk quota exceeded
dd: failed to open ‘nff.19’: Disk quota exceeded
dd: failed to open ‘nff.20’: Disk quota exceeded
dd: failed to open ‘nff.21’: Disk quota exceeded
dd: failed to open ‘nff.22’: Disk quota exceeded
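For reference, a minimal CLI sketch of the steps above. The volume name "ecvol", the server/brick names, and the mount point are placeholders; the actual hosts and brick layout of this setup are not recorded in the report.

# 1. create a 2x(4+2) distributed-disperse volume from 12 bricks and start it
gluster volume create ecvol disperse 6 redundancy 2 \
    server{1..6}:/bricks/b1 server{1..6}:/bricks/b2
gluster volume start ecvol

# 2. mount the volume (run on each of the two clients)
mount -t glusterfs server1:/ecvol /mnt/ecvol

# 3. create the directories and write ~1 GB files in a loop
#    (dir1 from client1, dir2 from client2)
mkdir -p /mnt/ecvol/dir/dir1 /mnt/ecvol/dir/dir2
for i in {1..30}; do dd if=/dev/urandom of=/mnt/ecvol/dir/dir1/f.$i bs=1M count=1024; done

# 4-5. enable quota, then set a 10 GB limit on /dir
gluster volume quota ecvol enable
gluster volume quota ecvol limit-usage /dir 10GB

# 6-7. extend the limit for dir1 to 100 GB, then reduce it to 15 GB
#      (the report does not name the path for step 7; dir1 assumed)
gluster volume quota ecvol limit-usage /dir/dir1 100GB
gluster volume quota ecvol limit-usage /dir/dir1 15GB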
Related bug: 1339144 - Getting EIO error when limit exceeds in disperse volume when bricks are down
The quota tracking mechanism at the bricks may show slight differences in accounting, so individual bricks can detect EDQUOT at slightly different points. If, in a 4+2 EC configuration, a wound write gets back 3 success and 3 EDQUOT answers, neither answer reaches the 4-brick agreement needed to form a consistent result, so the resultant state leads to an EIO and cannot be resolved with the current infrastructure. Bringing a brick down, as in bug 1339144, makes the scenario easily reproducible. More details can be found at:
https://bugzilla.redhat.com/show_bug.cgi?id=1224180#c18
https://bugzilla.redhat.com/show_bug.cgi?id=1224180#c19
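To make the unresolvable state concrete, here is a minimal shell sketch of the answer-counting described above. This is an illustration only, not the EC translator's actual code (which implements this matching in C): in a 4+2 configuration at least 4 bricks must return the same answer, and a 3/3 split satisfies neither side.

# illustration: 6 brick answers for one wound write, 3 OK and 3 EDQUOT
answers="OK OK OK EDQUOT EDQUOT EDQUOT"
ok=0; edquot=0
for a in $answers; do
  case $a in
    OK)     ok=$((ok+1)) ;;
    EDQUOT) edquot=$((edquot+1)) ;;
  esac
done
# EC needs >= 4 matching answers (the data-fragment count) to pick a result
if   [ "$ok" -ge 4 ];     then echo "write succeeds"
elif [ "$edquot" -ge 4 ]; then echo "client sees: Disk quota exceeded (EDQUOT)"
else                           echo "no 4-brick agreement -> client sees: Input/output error (EIO)"
fi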
*** This bug has been marked as a duplicate of bug 1339144 ***