Bug 1569306

Summary: Standalone heketi unable to rebalance automatically and use the expanded volume size and new bricks
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Neha Berry <nberry>
Component: heketi Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA QA Contact: Neha Berry <nberry>
Severity: medium Docs Contact:
Priority: unspecified    
Version: rhgs-3.3 CC: akrishna, hchiramm, jmulligan, nberry, pierre-yves.goubet, pprakash, rhs-bugs, rtalur, sankarshan, storage-qa-internal
Target Milestone: ---   
Target Release: CNS 3.10   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: heketi-6.0.0-13.el7rhg Doc Type: Bug Fix
Doc Text:
Previously, a rebalance operation was not run after expanding a volume in heketi, meaning only new files could be written to newly added space. With this fix, heketi performs a rebalance operation on the volume after expansion.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-12 09:22:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1568861    
Attachments:
Description Flags
heketi-logs none

Description Neha Berry 2018-04-19 02:44:55 UTC
Description of problem:
###########################

Even on a standalone heketi setup (heketi installed manually with "yum install heketi", with no OCP+CNS), when a volume is expanded via heketi using the volume expand option, the additional capacity is not always usable although the expanded size is reported, i.e., a rebalance doesn't seem to be happening.

Writes through the client keep using the original bricks and start failing once the capacity of those bricks is exhausted, even though the volume still reports ample free space.

Version-Release number of selected component (if applicable):
#############################################################
[root@dhcp46-175 heketi]# rpm -qa| grep heketi
heketi-client-6.0.0-7.1.el7rhgs.x86_64
heketi-6.0.0-7.1.el7rhgs.x86_64
[root@dhcp46-175 heketi]# 

[root@dhcp46-175 heketi]# cat /etc/redhat-storage-release 
Red Hat Gluster Storage Server 3.3.1



How reproducible:
######################


Steps to Reproduce:
########################

1. Created a volume of size 5 GB using heketi-cli command. Volume name = vol_b429e8ea1198100932cb4d41f99a9c2d

-----------------------------------------------------------------
[root@dhcp46-175 heketi]# heketi-cli topology info

Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9

    File:  true
    Block: true
...
...
    Volumes:

	Name: vol_b429e8ea1198100932cb4d41f99a9c2d
	Size: 5
	Id: b429e8ea1198100932cb4d41f99a9c2d
	Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
	Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
	Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
	Durability Type: replicate
	Replica: 3
	Snapshot: Disabled

		Bricks:
			Id: 2e82eeea855d9ee0807f639feb94c285
			Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
			Size (GiB): 5
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: c65f1dfb4ca24e116ac88f12d643aa3e

			Id: 9cd086d33d31c849b2f01c7352fca63a
			Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
			Size (GiB): 5
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 0c99ee1161cef01b3ea64f8faaac4b49

			Id: c73d1610f5a501c85fc4a59e3374ff3a
			Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
			Size (GiB): 5
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: 3501c6b59089d297785e73ff7029deb9

---------------------------------------------------------------------

2. Mounted the volume on a client (dhcp46-119.lab.eng.blr.redhat.com) and wrote some files.

3. Expanded the volume with --expand-size=90, bringing the total size to 95 GB.

---------------------------------------------------------------------
[root@dhcp46-175 heketi]# heketi-cli volume expand --volume=b429e8ea1198100932cb4d41f99a9c2d --expand-size=90
Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Size: 95
Volume Id: b429e8ea1198100932cb4d41f99a9c2d
Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
Block: false
Free Size: 0
Block Volumes: []
Durability Type: replicate
Distributed+Replica: 3

---------------------------------------------------------------------

4. The mount point on the client now reflects 95 GB of usable space.

5. Created multiple files with data and then observed that we are still unable to write beyond 5 GB, even though the volume size is now 95 GB.
--------------------------------------------------------------
[root@dhcp46-119 heketi-cl1]# fallocate -l 10G abc.img
fallocate: abc.img: fallocate failed: No space left on device
[root@dhcp46-119 heketi-cl1]# df -kh

Filesystem                                         Size  Used Avail Use% Mounted on

10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d    95G  2.5G   93G   3% /mnt/heketi-cl1
--------------------------------------------------------------
Maximum space that could be used in the volume
##################################################

--------------------------------------------------------------
[root@dhcp46-119 heketi-cl1]#  dd if=/dev/zero of=fileabc2 count=1024000 bs=10240 
dd: error writing ‘fileabc2’: No space left on device
dd: closing output file ‘fileabc2’: No space left on device

Filesystem                                         Size  Used Avail Use% Mounted on

10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d    95G  5.2G   90G   6% /mnt/heketi-cl1

--------------------------------------------------------------
6. Observed that no new files are being created on the added bricks; they are all present on the old bricks only.

[root@dhcp46-119 heketi-cl1]# ls -l | wc -l
469
[root@dhcp46-119 heketi-cl1]# 


On each node, only the original bricks hold these 469 entries (as counted by "ls -l | wc -l"). Thus every new file that we created after expansion landed on the same old replica set (1 x 3). The new bricks are all empty.

E.g. from 3 nodes:
++++++++++++++++++++
[root@dhcp46-183 ~]# ls -l /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick| wc -l
469
[root@dhcp46-183 ~]# 

[root@dhcp46-10 ~]# ls -l /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick | wc -l
469
[root@dhcp46-10 ~]# 

[root@dhcp46-10 ~]# ls -l /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick | wc -l
469
[root@dhcp46-10 ~]# 
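
The per-brick counts above were collected by hand. A minimal sketch of the same check as a reusable shell function follows. count_brick_entries is a hypothetical helper name, not part of heketi or gluster. Note that "ls -l | wc -l" also counts the leading "total" line that ls prints, so the sketch counts directory entries directly and skips Gluster's internal .glusterfs directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of heketi or gluster): count the top-level
# entries in a brick directory, ignoring Gluster's internal .glusterfs tree.
# Assumes file names do not contain newlines.
count_brick_entries() {
    local brick_dir="$1"
    find "$brick_dir" -mindepth 1 -maxdepth 1 ! -name '.glusterfs' -print \
        | wc -l | tr -d ' '
}
```

Run on a storage node against one of the added bricks, for example: count_brick_entries /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_923ccaadd9a8ce3860c3fd48c8bb9157/brick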


Actual results:
#####################

[root@dhcp46-119 heketi-cl1]# df -kh /mnt/heketi-cl1
Filesystem                                        Size  Used Avail Use% Mounted on
10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d   95G  5.2G   90G   6% /mnt/heketi-cl1
[root@dhcp46-119 heketi-cl1]# pwd
/mnt/heketi-cl1
[root@dhcp46-119 heketi-cl1]#  dd if=/dev/zero of=limit count=102400 bs=1024  
dd: failed to open ‘limit’: No space left on device
[root@dhcp46-119 heketi-cl1]# ls -l limit
ls: cannot access limit: No such file or directory
[root@dhcp46-119 heketi-cl1]# 
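
The mismatch shown above (df reports free space, yet file creation fails) can be checked in one step. The following is only a sketch under assumptions: probe_write and the probe file name are hypothetical, and the function uses nothing beyond standard df, awk, and dd.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the free space that df reports for a directory,
# then try to create a 1 KiB file there. Returns 0 if the write succeeds
# and 1 if it fails.
probe_write() {
    local mount_dir="$1"
    local probe="$mount_dir/.enospc-probe.$$"
    local avail_kb
    # -P forces the POSIX one-line-per-filesystem format; column 4 is
    # the available space in KiB.
    avail_kb=$(df -Pk "$mount_dir" | awk 'NR==2 {print $4}')
    echo "df reports ${avail_kb} KiB available on ${mount_dir}"
    if dd if=/dev/zero of="$probe" bs=1024 count=1 2>/dev/null; then
        rm -f "$probe"
        echo "write OK"
        return 0
    fi
    rm -f "$probe"
    echo "write FAILED"
    return 1
}
```

Usage on the client would be: probe_write /mnt/heketi-cl1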



Expected results:
#######################
After volume expansion, heketi should automatically rebalance the volume so that existing files are redistributed and new files are created on the newly added bricks as well.
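
Until a fixed heketi build is in place, the rebalance can be triggered by hand from any storage node. The following is a sketch, not heketi's implementation: the wrapper names run and rebalance_volume and the DRY_RUN switch are hypothetical; the gluster subcommands themselves are the standard "gluster volume rebalance <VOLNAME> start" and "status".

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the gluster CLI. Set DRY_RUN=1 to print the
# commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebalance_volume() {
    local vol="$1"
    # Start a full rebalance: fixes the directory layout and migrates data.
    run gluster volume rebalance "$vol" start
    # Show per-node progress; repeat until the status reads "completed".
    run gluster volume rebalance "$vol" status
}
```

For this report the call would be: rebalance_volume vol_b429e8ea1198100932cb4d41f99a9c2d. If only new files need to reach the added bricks and existing data should stay in place, "gluster volume rebalance <VOLNAME> fix-layout start" is the lighter alternative.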



Additional info:
#####################


There is already a similar bug for a heketi setup that is part of OCP:
Bug 1566590 - expanded capacity not usable after expanding a volume using heketi 


Current bricks in volume
############################

Name: vol_b429e8ea1198100932cb4d41f99a9c2d
	Size: 95
	Id: b429e8ea1198100932cb4d41f99a9c2d
	Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
	Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
	Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
	Durability Type: replicate
	Replica: 3
	Snapshot: Disabled

		Bricks:
			Id: 165d55f4ce1ade235449ef6c3638c918
			Path: /var/lib/heketi/mounts/vg_c521e784cf1307d589480949d36e6795/brick_165d55f4ce1ade235449ef6c3638c918/brick
			Size (GiB): 22
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: c521e784cf1307d589480949d36e6795

			Id: 175370043b37a36d0fd60ae17b4625ad
			Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_175370043b37a36d0fd60ae17b4625ad/brick
			Size (GiB): 22
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 0c99ee1161cef01b3ea64f8faaac4b49

			Id: 2ace2ccc5aa724197fabab8661620285
			Path: /var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_2ace2ccc5aa724197fabab8661620285/brick
			Size (GiB): 22
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: f317850238ea325aa4753323968c33ff

			Id: 2e82eeea855d9ee0807f639feb94c285
			Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
			Size (GiB): 5
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: c65f1dfb4ca24e116ac88f12d643aa3e

			Id: 3f8e60eb4ebd2ad0173d3096c30cc9bf
			Path: /var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_3f8e60eb4ebd2ad0173d3096c30cc9bf/brick
			Size (GiB): 22
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 80848549bed2416ac2d1d63df2b2ee4e

			Id: 7ad232ddcb3921abcc84668ea61050b7
			Path: /var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_7ad232ddcb3921abcc84668ea61050b7/brick
			Size (GiB): 22
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 80848549bed2416ac2d1d63df2b2ee4e

			Id: 923ccaadd9a8ce3860c3fd48c8bb9157
			Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_923ccaadd9a8ce3860c3fd48c8bb9157/brick
			Size (GiB): 22
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: 3501c6b59089d297785e73ff7029deb9

			Id: 9b8670e234d611fdc9f3182207bdb05a
			Path: /var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_9b8670e234d611fdc9f3182207bdb05a/brick
			Size (GiB): 22
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: 98785d4009586b7355abb23b26844b5f

			Id: 9cd086d33d31c849b2f01c7352fca63a
			Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
			Size (GiB): 5
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 0c99ee1161cef01b3ea64f8faaac4b49

			Id: c73d1610f5a501c85fc4a59e3374ff3a
			Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
			Size (GiB): 5
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: 3501c6b59089d297785e73ff7029deb9

			Id: d11ce8ab5ee0c1e3dd035d4647f7f18b
			Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_d11ce8ab5ee0c1e3dd035d4647f7f18b/brick
			Size (GiB): 22
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: c65f1dfb4ca24e116ac88f12d643aa3e

			Id: e6ce620b00756da38ac83efb5d3fc220
			Path: /var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_e6ce620b00756da38ac83efb5d3fc220/brick
			Size (GiB): 22
			Node: a4476d78aa9f6d018243679ca7339eb5
			Device: f317850238ea325aa4753323968c33ff

			Id: e927000cf2fac9f7698a9492066d5305
			Path: /var/lib/heketi/mounts/vg_c849d3e3875d3ee9ff70837d2303b49b/brick_e927000cf2fac9f7698a9492066d5305/brick
			Size (GiB): 22
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: c849d3e3875d3ee9ff70837d2303b49b

			Id: f44780bb53e0e54129038ad579ec4749
			Path: /var/lib/heketi/mounts/vg_4fc7078ed83754c2c1371d262d30ade9/brick_f44780bb53e0e54129038ad579ec4749/brick
			Size (GiB): 22
			Node: 3abd7e3ec79863e187da6bcdc3a36399
			Device: 4fc7078ed83754c2c1371d262d30ade9

			Id: fdd6eaed62648e71e7101f5434080752
			Path: /var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_fdd6eaed62648e71e7101f5434080752/brick
			Size (GiB): 22
			Node: 260e92fc5c44835e8286707ef966c7c8
			Device: 98785d4009586b7355abb23b26844b5f




[root@dhcp47-56 ~]# gluster v info vol_b429e8ea1198100932cb4d41f99a9c2d
 
Volume Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Type: Distributed-Replicate
Volume ID: 254ba50e-98c8-4c85-aade-dda3cf97f32b
Status: Started
Snapshot Count: 0
Number of Bricks: 5 x 3 = 15
Transport-type: tcp
Bricks:
Brick1: 10.70.47.56:/var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
Brick2: 10.70.46.10:/var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
Brick3: 10.70.46.183:/var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
Brick4: 10.70.46.183:/var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_923ccaadd9a8ce3860c3fd48c8bb9157/brick
Brick5: 10.70.46.10:/var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_175370043b37a36d0fd60ae17b4625ad/brick
Brick6: 10.70.47.56:/var/lib/heketi/mounts/vg_c849d3e3875d3ee9ff70837d2303b49b/brick_e927000cf2fac9f7698a9492066d5305/brick
Brick7: 10.70.46.10:/var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_7ad232ddcb3921abcc84668ea61050b7/brick
Brick8: 10.70.46.183:/var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_e6ce620b00756da38ac83efb5d3fc220/brick
Brick9: 10.70.47.56:/var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_d11ce8ab5ee0c1e3dd035d4647f7f18b/brick
Brick10: 10.70.47.56:/var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_fdd6eaed62648e71e7101f5434080752/brick
Brick11: 10.70.46.10:/var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_3f8e60eb4ebd2ad0173d3096c30cc9bf/brick
Brick12: 10.70.46.183:/var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_2ace2ccc5aa724197fabab8661620285/brick
Brick13: 10.70.46.10:/var/lib/heketi/mounts/vg_4fc7078ed83754c2c1371d262d30ade9/brick_f44780bb53e0e54129038ad579ec4749/brick
Brick14: 10.70.46.183:/var/lib/heketi/mounts/vg_c521e784cf1307d589480949d36e6795/brick_165d55f4ce1ade235449ef6c3638c918/brick
Brick15: 10.70.47.56:/var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_9b8670e234d611fdc9f3182207bdb05a/brick
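
To see which bricks belong together, the brick list above can be grouped into replica sets. This is a minimal sketch, assuming the usual Gluster convention that consecutive bricks in the list form one replica set; group_bricks is a hypothetical helper name. Under that assumption, Brick1 to Brick3 (the three 5 GiB bricks) are the original set and the remaining four sets were added by the expansion.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "gluster volume info" output on stdin and label
# each brick with its replica set number. Assumes consecutive bricks form
# one replica set; the replica count defaults to 3.
group_bricks() {
    local replica="${1:-3}"
    awk -v r="$replica" '
        # Match "Brick1: host:/path" lines, but not the "Bricks:" header.
        /^Brick[0-9]+:/ {
            n++
            s = int((n - 1) / r) + 1
            print "set" s, $2
        }'
}
```

Usage: gluster v info vol_b429e8ea1198100932cb4d41f99a9c2d | group_bricks 3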

Comment 2 Neha Berry 2018-04-19 02:52:54 UTC
Created attachment 1423862 [details]
heketi-logs

journalctl logs from heketi are attached here for reference.

Comment 3 John Mulligan 2018-04-25 15:28:23 UTC
AFAIK this is resolved in master: 
https://github.com/heketi/heketi/commit/d54452104da3fe243279316977135e635550cbff

Comment 4 Michael Adam 2018-05-08 07:19:01 UTC
(In reply to John Mulligan from comment #3)
> AFAIK this is resolved in master: 
> https://github.com/heketi/heketi/commit/d54452104da3fe243279316977135e635550cbff

This is correct. The tested build in the description was the CNS 3.9.0 build.
It should be fixed with the builds including the arbiter feature.

Comment 11 Anjana KD 2018-08-30 23:17:35 UTC
Updated doc text in the Doc Text field. Please review for technical accuracy.

Comment 12 John Mulligan 2018-09-07 17:22:21 UTC
Doc Text looks OK.

Comment 14 errata-xmlrpc 2018-09-12 09:22:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2686