Bug 1542601 - The used space in the volume increases when the volume is expanded
Summary: The used space in the volume increases when the volume is expanded
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1540961 1541916
Blocks:
 
Reported: 2018-02-06 16:05 UTC by Nithya Balachandran
Modified: 2018-03-05 07:14 UTC (History)
CC List: 11 users

Fixed In Version: glusterfs-3.12.6
Clone Of: 1541916
Environment:
Last Closed: 2018-03-05 07:14:45 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-02-06 16:05:38 UTC
+++ This bug was initially created as a clone of Bug #1541916 +++

+++ This bug was initially created as a clone of Bug #1540961 +++

+++ This bug was initially created as a clone of Bug #1535332 +++

Description of problem:

A 40GB volume was created and a 39GB file was written to it. The volume was then expanded by 30GB and a rebalance was performed. The total size of the volume became 70GB, but the used space on the volume increased from 39GB to 61.5GB.

Version-Release number of selected component (if applicable):

cns-deploy-5.0.0-57.el7rhgs.x86_64
heketi-client-5.0.0-19.el7rhgs.x86_64

How reproducible:


Expected results:
The used space should stay the same, i.e., 39GB.

Additional info:



** gluster volume info after expansion:
sh-4.2# gluster vol info vol_14a05e3c1f728afdebc484c15f7c6cad                  
                                       
Volume Name: vol_14a05e3c1f728afdebc484c15f7c6cad                              
Type: Distributed-Replicate            
Volume ID: 15550f1f-743b-494d-b761-81d03c8bb24f                                
Status: Started                        
Snapshot Count: 0                      
Number of Bricks: 2 x 3 = 6            
Transport-type: tcp                    
Bricks:                                
Brick1: server1:/var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick                                   
Brick2: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick                                   
Brick3: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick                                  
Brick4: server3:/var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick                                  
Brick5: server1:/var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick                                   
Brick6: server2:/var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick                                   
Options Reconfigured:                  
nfs.disable: on                        
transport.address-family: inet         
cluster.brick-multiplex: on      



** ls on the client after expansion:
# ls -lash /usr/share/busybox/
total 40894472                          
     4 drwxrwsr-x    4 root     2000        4.0K Jan 16 14:25 .                 
     0 drwxr-xr-x    3 root     root          21 Jan 18 07:12 ..                
     4 drwxr-sr-x    3 root     2000        4.0K Jan 16 14:25 .trashcan         
40894464 -rw-r--r--    1 root     2000       39.0G Jan 16 14:23 39gfile         

** ls on the bricks after expansion
* node 1
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_7c97d7dc6c2f0a3d77bffd3e2c95bd7e/brick/                                     
total 23G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile                           
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_cec08acd1c8b4e8aa15c61a97d89181c/brick_1b0082ad01b71b87e16686c5667cab6e/brick/                                     
total 40G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile    

* node 2
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_02b16cdadce2aaf1356c37599dadf3f4/brick_a99f4919f14c0ab20b667a0269adeff5/brick/                                     
total 40G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile                           
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_bf6b608031932669bb05cc8430948619/brick_490a6de2e0e5b67e7c9495b3d57bb9e0/brick/                                     
total 23G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile         

* node 3
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_35a93294d10a783faf85aca0d8ced048/brick/                                     
total 40G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:23 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:14 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:23 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
40G -rw-r--r--. 2 root 2000 39G Jan 16 14:23 39gfile                           
sh-4.2# ls -lash /var/lib/heketi/mounts/vg_159d40a7085c9372b8c33572d14eba7b/brick_7542d4af880f8f2b501180e0a2e1f117/brick/                                     
total 23G                              
  0 drwxrwsr-x. 4 root 2000  56 Jan 16 14:25 .                                 
  0 drwxr-xr-x. 3 root root  19 Jan 16 14:25 ..                                
  0 drw---S---. 9 root 2000 184 Jan 16 14:26 .glusterfs                        
  0 drwxr-sr-x. 3 root 2000  25 Jan 16 14:25 .trashcan                         
23G ---------T. 2 root 2000   0 Jan 16 14:25 39gfile

--- Additional comment from Raghavendra Talur on 2018-01-18 13:21:19 IST ---

df output seems to be showing the sum of the space occupied by the real file and the linkto file

10.70.46.153:vol_14a05e3c1f728afdebc484c15f7c6cad
                         70.0G     61.5G      8.4G  88% /usr/share/busybox

--- Additional comment from Nithya Balachandran on 2018-01-22 10:56:42 IST ---

[root@dhcp46-2 ~]# oc rsh glusterfs-j7wkd

sh-4.2# grep 39gfile *rebalance.log
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.154717] I [dht-rebalance.c:1570:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: /39gfile: attempting to move from vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-0 to vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221248] E [MSGID: 109023] [dht-rebalance.c:847:__dht_rebalance_create_dst_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: fallocate failed for /39gfile on vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 (No space left on device)
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.221456] E [MSGID: 0] [dht-rebalance.c:1688:dht_migrate_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: Create dst failed on - vol_14a05e3c1f728afdebc484c15f7c6cad-replicate-1 for file - /39gfile
vol_14a05e3c1f728afdebc484c15f7c6cad-rebalance.log:[2018-01-16 14:25:57.230524] E [MSGID: 109023] [dht-rebalance.c:2757:gf_defrag_migrate_single_file] 0-vol_14a05e3c1f728afdebc484c15f7c6cad-dht: migrate-data failed for /39gfile


Need to check whether the fallocate call that creates the dst file fails to clean up the allocated space when it returns an error.
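
For illustration, a minimal sketch of the suspected sequence, using hypothetical names (this is not the actual __dht_rebalance_create_dst_file code): the destination file is created on the target brick and fallocate() is asked to reserve the full file size; if fallocate() fails part-way, the error path returns without releasing whatever blocks were already allocated.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical simplification of the dst-file creation step during
 * rebalance, only to show where the leak could occur. */
static int
create_dst_file_suspected(const char *path, off_t size)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
        return -errno;

    if (fallocate(fd, 0, 0, size) != 0) {
        int err = -errno;
        close(fd);  /* no ftruncate(fd, 0) here: partially allocated
                       blocks would stay on the brick */
        return err;
    }

    close(fd);
    return 0;
}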


--- Additional comment from Nithya Balachandran on 2018-02-01 14:53:03 IST ---

This is probably the root cause (found by vbellur): 

https://stackoverflow.com/questions/32799252/why-fallocate-on-linux-creates-a-non-empty-file-when-it-has-not-enough-space
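
To observe this outside Gluster, here is a small standalone program (not part of GlusterFS; the file name and usage are made up) that requests more space than the filesystem has free and then reports how many blocks the file occupies after fallocate() fails. Depending on the filesystem, the partially allocated blocks may not be rolled back, which would match the df numbers above.

/* fallocate_enospc_demo.c */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <path> <bytes-to-request>\n", argv[0]);
        return 1;
    }

    off_t len = (off_t)strtoll(argv[2], NULL, 10);
    int fd = open(argv[1], O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* mode 0: plain size reservation, the same kind of request a rebalance
     * would make for the destination file */
    if (fallocate(fd, 0, 0, len) != 0)
        fprintf(stderr, "fallocate failed: %s\n", strerror(errno));

    /* report how much space the file still holds after the failure */
    struct stat st;
    if (fstat(fd, &st) == 0)
        printf("st_size=%lld st_blocks=%lld (512-byte blocks)\n",
               (long long)st.st_size, (long long)st.st_blocks);

    close(fd);
    return 0;
}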

--- Additional comment from Worker Ant on 2018-02-05 03:13:50 EST ---

REVIEW: https://review.gluster.org/19495 (cluster/dht: Cleanup on fallocate failure) posted (#1) for review on master by N Balachandran

Comment 1 Worker Ant 2018-02-06 16:17:53 UTC
REVIEW: https://review.gluster.org/19514 (cluster/dht: Cleanup on fallocate failure) posted (#1) for review on release-3.12 by N Balachandran

Comment 2 Worker Ant 2018-02-09 08:52:25 UTC
COMMIT: https://review.gluster.org/19514 committed in release-3.12 by "jiffin tony Thottan" <jthottan> with a commit message- cluster/dht: Cleanup on fallocate failure

It looks like fallocate leaves a non-empty
file behind in case of some failures. We now
truncate the file to 0 bytes on failure in
__dht_rebalance_create_dst_file.

> Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
> BUG: 1541916
> Signed-off-by: N Balachandran <nbalacha>

Change-Id: Ia4ad7b94bb3624a301fcc87d9e36c4dc751edb59
BUG: 1542601
Signed-off-by: N Balachandran <nbalacha>
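
A minimal sketch of the cleanup pattern the commit message above describes, again with hypothetical names rather than the actual GlusterFS code: if fallocate() fails, the destination is truncated back to 0 bytes so any partially allocated space is returned to the brick before the error is propagated.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

static int
reserve_dst_size(int dst_fd, off_t size)
{
    if (fallocate(dst_fd, 0, 0, size) != 0) {
        int err = -errno;
        /* Release whatever was allocated before the failure; ignore the
         * ftruncate result and report the original fallocate error. */
        (void)ftruncate(dst_fd, 0);
        return err;
    }
    return 0;
}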

Comment 3 Jiffin 2018-03-05 07:14:45 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.12.6, please open a new bug report.

glusterfs-3.12.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2018-February/033552.html
[2] https://www.gluster.org/pipermail/gluster-users/

