+++ This bug was initially created as a clone of Bug #1541916 +++
+++ This bug was initially created as a clone of Bug #1540961 +++
+++ This bug was initially created as a clone of Bug #1535332 +++

Description of problem:
A volume of size 40GB was created and a file of 39GB was created on it. The volume was expanded by 30GB and a rebalance was performed. The total size of the volume became 70GB, but the used space on the volume increased from 39GB to 61.5GB.

Version-Release number of selected component (if applicable):
cns-deploy-5.0.0-57.el7rhgs.x86_64
heketi-client-5.0.0-19.el7rhgs.x86_64

How reproducible:

Expected results:
The used space should stay the same, i.e. 39GB.

Additional info:

** gluster volume info after expansion:

sh-4.2# gluster vol info vol_14a05e3c1f728afdebc484c15f7c6cad

Volume Name: vol_14a05e3c1f728afdebc484c15f7c6cad
Type: Distributed-Replicate
Volume ID: 15550f1f-743b-494d-b761-81d03c8bb24f
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: server1:/var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick
Brick2: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick
Brick3: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick
Brick4: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick
Brick5: server1:/var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick
Brick6: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
cluster.brick-multiplex: on

** ls on the client after expansion:

# ls -lash /usr/share/busybox/
total 40894472
       4 drwxrwsr-x 4 root 2000 4.0K Jan 16 14:25 .
       0 drwxr-xr-x 3 root root   21 Jan 18 07:12 ..
       4 drwxr-sr-x 3 root 2000 4.0K Jan 16 14:25 .trashcan
40894464 -rw-r--r-- 1 root 2000 39.0G Jan 16 14:23 39gfile

** ls on the bricks after expansion

* node 1

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick/
total 23G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick/
total 40G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile

* node 2

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick/
total 40G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick/
total 23G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile

* node 3

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick/
total 40G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile

sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick/
total 23G
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile

--- Additional comment from Raghavendra Talur on 2018-01-18 13:21:19 IST ---

The df output seems to show the sum of the space occupied by the real file and the linkto file:

10.70.46.153:vol_14a05e3c1f728afdebc484c15f7c6cad  70.0G  61.5G  8.4G  88%  /usr/share/busybox

--- Additional comment from Nithya Balachandran on 2018-01-22 10:56:42 IST ---

[root@dhcp46-2 ~]# oc rsh glusterfs-j7wkd
sh-4.2# grep 39gfile *rebalance.log
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.154717] I [dht-rebalance.c:1570:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: /39gfile: attempting to move from vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-0 to vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221248] E [MSGID: 109023] [dht-rebalance.c:847:__dht_rebalance_create_dst_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: fallocate failed for /39gfile on vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 (No space left on device)
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221456] E [MSGID: 0] [dht-rebalance.c:1688:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: Create dst failed on - vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 for file - /39gfile
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.230524] E [MSGID: 109023] [dht-rebalance.c:2757:gf_defrag_migrate_single_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: migrate-data failed for /39gfile

Need to check whether the fallocate call that creates the dst file cleans up the allocated space on failure.

--- Additional comment from Nithya Balachandran on 2018-02-01 14:53:03 IST ---

This is probably the root cause (found by vbellur):
https://stackoverflow.com/questions/32799252/why-fallocate-on-linux-creates-a-non-empty-file-when-it-has-not-enough-space

--- Additional comment from Worker Ant on 2018-02-05 03:13:50 EST ---

REVIEW: https://review.gluster.org/19495 (cluster/dht: Cleanup on fallocate failure) posted (#1) for review on master by N Balachandran
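For reference, the fallocate(2) behaviour described in the linked StackOverflow thread can be demonstrated with a small standalone program. The following is a minimal illustration only, not gluster code; the path and requested size are arbitrary, and the outcome is filesystem-dependent. On the affected filesystems, a failing fallocate() can leave already-allocated blocks behind, which show up as used space even though the call returned ENOSPC.

    /* fallocate_demo.c - minimal illustration, not gluster code.
     * Shows that fallocate() failing with ENOSPC can still leave
     * allocated blocks behind on some Linux filesystems. */
    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/fallocate_demo", O_CREAT | O_RDWR | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Request far more space than the filesystem can provide. */
        if (fallocate(fd, 0, 0, (off_t)1 << 60) != 0)
            fprintf(stderr, "fallocate: %s\n", strerror(errno));

        struct stat st;
        if (fstat(fd, &st) == 0)
            printf("st_size=%lld, allocated=%lld bytes\n",
                   (long long)st.st_size, (long long)st.st_blocks * 512);
        /* Depending on the filesystem, 'allocated' may be non-zero here
         * even though fallocate() just failed with ENOSPC. */

        close(fd);
        return 0;
    }

This matches the brick listings above: on the newly added replica set, 39gfile has a size of 0 bytes yet 23G of allocated blocks, and nothing releases those blocks when the migration aborts.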
REVIEW: https://review.gluster.org/19514 (cluster/dht: Cleanup on fallocate failure) posted (#1) for review on release-3.12 by N Balachandran
COMMIT: https://review.gluster.org/19514 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- cluster/dht: Cleanup on fallocate failure

It looks like fallocate leaves a non-empty file behind in case of some failures. We now truncate the file to 0 bytes on failure in __dht_rebalance_create_dst_file.

> Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
> BUG: 1541916
> Signed-off-by: N Balachandran <nbalacha>

Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
BUG: 1542601
Signed-off-by: N Balachandran <nbalacha>
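The fix described in the commit message is small: if fallocate on the destination file fails, truncate the file back to 0 bytes so that any blocks it already allocated are released. Below is a minimal sketch of that pattern under stated assumptions: the function name is hypothetical and plain POSIX calls are used for illustration, whereas the actual change lives in __dht_rebalance_create_dst_file and goes through gluster's internal fd interfaces rather than raw file descriptors.

    /* Hypothetical sketch of the truncate-on-failure cleanup described
     * in the commit message; not the actual gluster source. */
    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Pre-allocate 'len' bytes for a migration destination file.
     * Returns 0 on success, -1 on failure with errno preserved. */
    static int preallocate_dst(int dst_fd, off_t len)
    {
        if (fallocate(dst_fd, 0, 0, len) != 0) {
            int saved_errno = errno;

            /* fallocate may have allocated part of the space before
             * failing; truncate to 0 bytes so those blocks are freed
             * instead of silently consuming the brick. */
            if (ftruncate(dst_fd, 0) != 0)
                fprintf(stderr, "cleanup ftruncate failed: %s\n",
                        strerror(errno));

            errno = saved_errno;
            return -1;
        }
        return 0;
    }

With this cleanup in place, a failed migration leaves the destination file at 0 bytes with no allocated blocks, so the used space reported by df stays at the size of the real file.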
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.6, please open a new bug report.

glusterfs-3.12.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2018-February/033552.html
[2] https://www.gluster.org/pipermail/gluster-users/