+++ This bug was initially created as a clone of Bug #1138737 +++

Description of problem:
I tried to rename a 5GB file and it failed. The failure is seen when the available space reported by quota is less than double the file size, so in this case less than 10GB available as per quota. This is only an observation; there may be other reasons why the rename is failing. The rename failure is the first problem. The second is that subsequent attempts to rename the same file give different error messages.

How reproducible:
always

Steps to Reproduce:
1. create a volume of 6x2 type, start it
2. enable quota on the volume
3. set a quota limit on "/", say 20GB
4. mount the volume over nfs
5. create a file of 5GB
6. create a directory and create data in this dir to fill up the volume, let's say up to 18GB (approx)
7. try to rename the 5GB file --- fails

Actual results:
First problem: step 7 fails.
Second problem: subsequent attempts to rename the same file gave different error messages, as can be seen from these logs,

    [root@rhsauto002 dir3]# mv 5GBfile-rename 5GBfile
    [root@rhsauto002 dir3]# mv 5GBfile 5GBfile-rename
    mv: cannot stat ‘5GBfile’: No such file or directory
    [root@rhsauto002 dir3]# ls
    5GBfile-rename
    [root@rhsauto002 dir3]# mv 5GBfile-rename 5GBfile
    [root@rhsauto002 dir3]# ls
    5GBfile-rename
    [root@rhsauto002 dir3]#
    [root@rhsauto002 dir3]# mv 5GBfile-rename 5GBfile
    mv: ‘5GBfile-rename’ and ‘5GBfile’ are the same file

Log from one of the bricks around this time,

    [2014-09-05 02:43:27.157125] A [quota.c:4200:quota_log_usage] 0-dist-rep-quota: Usage crossed soft limit: 19.6GB used by /
    [2014-09-05 02:53:22.111639] I [server-rpc-fops.c:999:server_rename_cbk] 0-dist-rep-server: 749084: RENAME /dir3/5GBfile-rename (00000000-0000-0000-0000-000000000000/5GBfile-rename) -> /dir3/5GBfile (00000000-0000-0000-0000-000000000000/5GBfile) ==> (Disk quota exceeded)
    [2014-09-05 02:53:49.873920] I [server-rpc-fops.c:999:server_rename_cbk] 0-dist-rep-server: 749162: RENAME /dir3/5GBfile-rename (00000000-0000-0000-0000-000000000000/5GBfile-rename) -> /dir3/5GBfile (00000000-0000-0000-0000-000000000000/5GBfile) ==> (Disk quota exceeded)

Please note that the brick log says the soft limit was crossed at 19.6GB, whereas per the quota list command usage is 18.7GB:

    [root@nfs1 ~]# gluster volume quota dist-rep list
    Path     Hard-limit  Soft-limit  Used    Available  Soft-limit exceeded?  Hard-limit exceeded?
    ---------------------------------------------------------------------------------------------------------------------------
    /        20.0GB      80%         18.7GB  1.3GB      Yes                   No
    /dir1    5.0GB       80%         5.0GB   0Bytes     Yes                   Yes

Expected results:
First problem: the rename should pass.
Second problem: the error messages should remain the same.

Additional info:

(In reply to Saurabh from comment #0)
> one of the bricks around this time,
> [2014-09-05 02:43:27.157125] A [quota.c:4200:quota_log_usage]
> 0-dist-rep-quota: Usage crossed soft limit: 19.6GB used by /
> [2014-09-05 02:53:22.111639] I [server-rpc-fops.c:999:server_rename_cbk]
> 0-dist-rep-server: 749084: RENAME /dir3/5GBfile-rename
> (00000000-0000-0000-0000-000000000000/5GBfile-rename) -> /dir3/5GBfile
> (00000000-0000-0000-0000-000000000000/5GBfile) ==> (Disk quota exceeded)
> [2014-09-05 02:53:49.873920] I [server-rpc-fops.c:999:server_rename_cbk]
> 0-dist-rep-server: 749162: RENAME /dir3/5GBfile-rename
> (00000000-0000-0000-0000-000000000000/5GBfile-rename) -> /dir3/5GBfile
> (00000000-0000-0000-0000-000000000000/5GBfile) ==> (Disk quota exceeded)
>
> Please note that the brick log says soft limit crossed to 19.6GB, whereas
> per quota list command it is 18.7 GB

The usage displayed in the logs, 19.6GB, is the current disk usage of "/". The message indicates that the usage of "/" has crossed its soft limit, viz. 18.67GB. Hope that clarifies.

> [root@nfs1 ~]# gluster volume quota dist-rep list
> Path     Hard-limit  Soft-limit  Used    Available  Soft-limit exceeded?  Hard-limit exceeded?
> ---------------------------------------------------------------------------------------------------------------------------
> /        20.0GB      80%         18.7GB  1.3GB      Yes                   No
> /dir1    5.0GB       80%         5.0GB   0Bytes     Yes                   Yes

Could you attach the sosreport to the BZ?

--- Additional comment from Raghavendra G on 2014-09-22 05:42:10 EDT ---

The actual bug here is mv not reporting any error even though it failed (the source is still present after mv). Below is the RCA for it:

From strace mv 5 2,

    lstat("2", 0x7fffd2382400) = -1 ENOENT (No such file or directory)
    rename("5", "2") = 0

I am seeing the brick fail the rename with EDQUOT, but the rename call itself "seems" to succeed. The rename did in fact fail, since ls after mv shows file 5 still existing and file 2 not existing.

    [root@unused gfs]# ls
    3 5 dir newdir sibling
    [root@unused gfs]# ls
    3 5 dir newdir sibling
    [root@unused gfs]# mv 5 2
    [root@unused gfs]# ls
    3 5 dir newdir sibling

The culprit is dht, which is not propagating the error back to the application. In dht_rename_cbk, we have,

    if (op_ret == -1) {
            /* Critical failure: unable to rename the cached file */
            if (src_cached == dst_cached) {
                    gf_msg (this->name, GF_LOG_WARNING, op_errno,
                            DHT_MSG_RENAME_FAILED,
                            "%s: Rename on %s failed, (gfid = %s) ",
                            local->loc.path, prev->this->name,
                            local->loc.inode ?
                            uuid_utoa(local->loc.inode->gfid):"");
                    local->op_ret = op_ret;
                    local->op_errno = op_errno;
                    goto cleanup;
            }

The if (src_cached == dst_cached) check above keeps dht from storing the failure of any rename where the destination doesn't exist, because dst_cached will be NULL in that case.
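
To make the RCA concrete, here is a minimal standalone model of that check. It is only a sketch: struct subvol and recorded_errno() are hypothetical names made up for illustration, not glusterfs types or functions, and the model assumes the failure is dropped whenever the comparison is false.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dht subvolume; not the real glusterfs type. */
struct subvol {
    const char *name;
};

/*
 * Model of the check quoted above: the brick's errno is recorded only
 * when source and destination are cached on the same subvolume.
 * Returns the errno that would be stored, or 0 when the failure is
 * not recorded.
 */
static int recorded_errno(const struct subvol *src_cached,
                          const struct subvol *dst_cached,
                          int op_errno)
{
    assert(src_cached != NULL); /* the source file must exist somewhere */

    if (src_cached == dst_cached)
        return op_errno;

    /*
     * A NULL dst_cached (destination absent) can never equal
     * src_cached, so that case always ends up here.
     */
    return 0;
}
```

With a non-existent destination, recorded_errno(&src, NULL, EDQUOT) yields 0, which matches the rename("5", "2") = 0 seen in the strace output even though the brick returned EDQUOT.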
REVIEW: http://review.gluster.org/9063 (cluster/dht: Fix subvol check, to correctly determine cached file rename) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/9063 (cluster/dht: Fix subvol check, to correctly determine cached file rename) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)
REVIEW: http://review.gluster.org/9063 (cluster/dht: Fix subvol check, to correctly determine cached file rename) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)
COMMIT: http://review.gluster.org/9063 committed in master by Vijay Bellur (vbellur)
------
commit dfc49143841fe84f846346a30dadce797940eebc
Author: Shyam <srangana>
Date: Thu Nov 6 10:43:37 2014 -0500

    cluster/dht: Fix subvol check, to correctly determine cached file rename

    The check to treat rename as a critical failure ignored when the cached
    file is being renamed to new name, as the new name falls on the same
    subvol as the cached file. This is in addition to when the target of the
    rename does not exist.

    The current change is simpler, as the rename logic, renames the cached
    file in case the target exists and falls on the same subvol as source
    name, OR the target does not exist and the hash of target falls on the
    same subvol as source cached. These conditions mean we are renaming the
    source, other conditions mean we are renaming the source linkto file
    which we do not want to treat as a critical failure (and we also
    instruct marker that it is an internal FOP and to not account for the
    same).

    Change-Id: I4414e61a0d2b28a429fa747e545ef953e48cfb5b
    BUG: 1161156
    Signed-off-by: Shyam <srangana>
    Reviewed-on: http://review.gluster.org/9063
    Reviewed-by: N Balachandran <nbalacha>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: susant palai <spalai>
    Reviewed-by: venkatesh somyajulu <vsomyaju>
    Reviewed-by: Vijay Bellur <vbellur>
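
The two conditions in the commit message can be read as a small predicate. The sketch below is one interpretation of the prose only, not the code from review 9063: struct subvol, renames_cached_file() and the dst_hashed parameter are hypothetical names, and "same subvol as source name" is assumed to mean the subvolume where the source file is cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dht subvolume; not the real glusterfs type. */
struct subvol {
    const char *name;
};

/*
 * Returns true when the rename acts on the cached (data) file, so a
 * failure should be treated as critical and reported to the caller.
 * Returns false when only the source linkto file would be renamed.
 *
 *   dst_cached: subvolume holding the target, or NULL if it is absent
 *   dst_hashed: subvolume the target name hashes to
 */
static bool renames_cached_file(const struct subvol *src_cached,
                                const struct subvol *dst_cached,
                                const struct subvol *dst_hashed)
{
    assert(src_cached != NULL && dst_hashed != NULL);

    if (dst_cached != NULL)
        /* Target exists: compare its location with the source's. */
        return dst_cached == src_cached;

    /* Target absent: compare where its name hashes with the source's. */
    return dst_hashed == src_cached;
}
```

Under this reading, the case from this bug (target absent, target name hashing to the source's subvolume) returns true, so the EDQUOT from the brick would be propagated instead of dropped.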
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user