--- Description from Nithya Balachandran on 2018-02-05 08:08:03 UTC ---
+++ This bug was initially created as a clone of Bug #1540961 +++
+++ This bug was initially created as a clone of Bug #1535332 +++
Description of problem:
A volume of size 40GB was created and a 39GB file was written to it. The volume was then expanded by 30GB and a rebalance was performed. The total size of the volume became 70GB, but the used space increased from 39GB to 61.5GB.
Version-Release number of selected component (if applicable):
cns-deploy-5.0.0-57.el7rhgs.x86_64
heketi-client-5.0.0-19.el7rhgs.x86_64
How reproducible:
Expected results:
The used space should stay the same, i.e., 39GB.
Additional info:
** gluster volume info after expansion:
sh-4.2# gluster vol info vol_14a05e3c1f728afdebc484c15f7c6cad
Volume Name: vol_14a05e3c1f728afdebc484c15f7c6cad
Type: Distributed-Replicate
Volume ID: 15550f1f-743b-494d-b761-81d03c8bb24f
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: server1:/var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick
Brick2: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick
Brick3: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick
Brick4: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick
Brick5: server1:/var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick
Brick6: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
cluster.brick-multiplex: on
** ls on the client after expansion:
# ls -lash /usr/share/busybox/
total 40894472
4 drwxrwsr-x 4 root 2000 4.0K Jan 16 14:25 .
0 drwxr-xr-x 3 root root 21 Jan 18 07:12 ..
4 drwxr-sr-x 3 root 2000 4.0K Jan 16 14:25 .trashcan
40894464 -rw-r--r-- 1 root 2000 39.0G Jan 16 14:23 39gfile
** ls on the bricks after expansion
* node 1
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick/
total 23G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:25 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:25 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000 0 Jan 16 14:25 39gfile
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick/
total 40G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:23 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:14 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile
* node 2
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick/
total 40G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:23 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:14 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick/
total 23G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:25 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:25 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000 0 Jan 16 14:25 39gfile
* node 3
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick/
total 40G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:23 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:14 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick/
total 23G
0 drwxrwsr-x. 4 root 2000 56 Jan 16 14:25 .
0 drwxr-xr-x. 3 root root 19 Jan 16 14:25 ..
0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs
0 drwxr-sr-x. 3 root 2000 25 Jan 16 14:25 .trashcan
23G ---------T. 2 root 2000 0 Jan 16 14:25 39gfile
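The zero-length ---------T entries above are DHT linkto files whose blocks were preallocated by the failed migration. A sketch (the BRICK path is a placeholder, not taken from this report) of how such files could be located on a brick:

```shell
# DHT linkto files show up as regular files with only the sticky bit set
# (mode 1000) and zero length. BRICK is a placeholder for a real brick mount.
BRICK=${BRICK:-/var/lib/heketi/mounts}
find "$BRICK" -type f -perm 1000 -size 0 2>/dev/null
```

On real bricks the authoritative marker is the trusted.glusterfs.dht.linkto xattr, which can be checked with getfattr -n trusted.glusterfs.dht.linkto; the mode/size test above is only a quick heuristic.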
--- Additional comment from Raghavendra Talur on 2018-01-18 13:21:19 IST ---
The df output seems to be showing the sum of the space occupied by the real file and the linkto file:
10.70.46.153:vol_14a05e3c1f728afdebc484c15f7c6cad
70.0G 61.5G 8.4G 88% /usr/share/busybox
--- Additional comment from Nithya Balachandran on 2018-01-22 10:56:42 IST ---
[root@dhcp46-2 ~]# oc rsh glusterfs-j7wkd
sh-4.2# grep 39gfile *rebalance.log
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.154717] I [dht-rebalance.c:1570:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: /39gfile: attempting to move from vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-0 to vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221248] E [MSGID: 109023] [dht-rebalance.c:847:__dht_rebalance_create_dst_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: fallocate failed for /39gfile on vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 (No space left on device)
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221456] E [MSGID: 0] [dht-rebalance.c:1688:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: Create dst failed on - vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 for file - /39gfile
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.230524] E [MSGID: 109023] [dht-rebalance.c:2757:gf_defrag_migrate_single_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: migrate-data failed for /39gfile
Need to check whether the fallocate call that creates the dst file fails to clean up the allocated space on failure.
--- Additional comment from Nithya Balachandran on 2018-02-01 14:53:03 IST ---
This is probably the root cause (found by vbellur):
https://stackoverflow.com/questions/32799252/why-fallocate-on-linux-creates-a-non-empty-file-when-it-has-not-enough-space
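The linked answer comes down to fallocate being able to allocate blocks without updating the file size, and to leave partially allocated blocks behind when it fails midway. The same shape can be reproduced with the util-linux fallocate tool; this sketch assumes a filesystem that supports FALLOC_FL_KEEP_SIZE (the -n flag):

```shell
# fallocate -n (FALLOC_FL_KEEP_SIZE) allocates blocks without changing the
# file size -- the same shape as the 0-byte linkto files that hold 23G above.
f=$(mktemp)
fallocate -n -l 1M "$f"
stat -c 'size=%s blocks=%b' "$f"   # size stays 0 while blocks are allocated
rm -f "$f"
```

If the dst-file creation during rebalance fails after such an allocation, the blocks stay accounted to the brick until the file is truncated or unlinked, which would match the inflated used space reported by df.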
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.
glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.
[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/