Previously, a rebalance operation was not run after expanding a volume in heketi, meaning only new files could be written to newly added space. With this fix, heketi performs a rebalance operation on the volume after expansion.
Description of problem:
###########################
Even in a standalone setup [heketi installed manually (yum install heketi), with no OCP+CNS], when a volume is expanded via heketi using the volume expand option, the additional capacity is not always usable although the size is reported as available, i.e., a rebalance does not seem to happen.
Writes through the client keep using the original bricks and start failing once the capacity of those bricks is exhausted, even though the volume still reports ample free space.
Version-Release number of selected component (if applicable):
#############################################################
[root@dhcp46-175 heketi]# rpm -qa| grep heketi
heketi-client-6.0.0-7.1.el7rhgs.x86_64
heketi-6.0.0-7.1.el7rhgs.x86_64
[root@dhcp46-175 heketi]#
[root@dhcp46-175 heketi]# cat /etc/redhat-storage-release
Red Hat Gluster Storage Server 3.3.1
How reproducible:
######################
Steps to Reproduce:
########################
1. Created a volume of size 5 GB using heketi-cli command. Volume name = vol_b429e8ea1198100932cb4d41f99a9c2d
-----------------------------------------------------------------
[root@dhcp46-175 heketi]# heketi-cli topology info
Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
File: true
Block: true
...
...
Volumes:
Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Size: 5
Id: b429e8ea1198100932cb4d41f99a9c2d
Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Bricks:
Id: 2e82eeea855d9ee0807f639feb94c285
Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
Size (GiB): 5
Node: 260e92fc5c44835e8286707ef966c7c8
Device: c65f1dfb4ca24e116ac88f12d643aa3e
Id: 9cd086d33d31c849b2f01c7352fca63a
Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
Size (GiB): 5
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 0c99ee1161cef01b3ea64f8faaac4b49
Id: c73d1610f5a501c85fc4a59e3374ff3a
Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
Size (GiB): 5
Node: a4476d78aa9f6d018243679ca7339eb5
Device: 3501c6b59089d297785e73ff7029deb9
---------------------------------------------------------------------
2. Mounted the volume on a client (dhcp46-119.lab.eng.blr.redhat.com) and wrote some files.
3. Expanded the volume with --expand-size=90, so the total size is now 95 GB.
---------------------------------------------------------------------
[root@dhcp46-175 heketi]# heketi-cli volume expand --volume=b429e8ea1198100932cb4d41f99a9c2d --expand-size=90
Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Size: 95
Volume Id: b429e8ea1198100932cb4d41f99a9c2d
Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
Block: false
Free Size: 0
Block Volumes: []
Durability Type: replicate
Distributed+Replica: 3
---------------------------------------------------------------------
4. The mount point on the client now reflects 95 GB of usable space.
5. Created multiple files with data and observed that we are still unable to write beyond 5 GB, even though the volume size is now 95 GB.
--------------------------------------------------------------
[root@dhcp46-119 heketi-cl1]# fallocate -l 10G abc.img
fallocate: abc.img: fallocate failed: No space left on device
[root@dhcp46-119 heketi-cl1]# df -kh
Filesystem Size Used Avail Use% Mounted on
10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d 95G 2.5G 93G 3% /mnt/heketi-cl1
--------------------------------------------------------------
Maximum space that could be used in the volume
##################################################
--------------------------------------------------------------
[root@dhcp46-119 heketi-cl1]# dd if=/dev/zero of=fileabc2 count=1024000 bs=10240
dd: error writing ‘fileabc2’: No space left on device
dd: closing output file ‘fileabc2’: No space left on device
Filesystem Size Used Avail Use% Mounted on
10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d 95G 5.2G 90G 6% /mnt/heketi-cl1
--------------------------------------------------------------
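For reference, the amount of data the dd command above tries to write can be worked out from its count= and bs= values. The helper below is only an illustrative sketch (the function name dd_total_mib is made up here and is not part of any tool); it assumes bs is given in plain bytes, as it is above.

```shell
# Total size of a dd write in MiB, given dd's count= and bs= (bs in bytes).
# Hypothetical helper for illustration only.
dd_total_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Values taken from the dd command above: count=1024000 bs=10240
dd_total_mib 1024000 10240   # prints 10000
```

So the attempted file is 10000 MiB (about 9.8 GiB), which is larger than one original 5 GiB brick but well under the 90G that df reports as available.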
6. Observed that no new files are created on the added bricks; they are all present on the old bricks.
[root@dhcp46-119 heketi-cl1]# ls -l | wc -l
469
[root@dhcp46-119 heketi-cl1]#
On each node, only the original bricks hold these 469 files. Every new file created after expansion therefore landed on the same original replica set (1 x 3); the new bricks are all empty.
E.g., from 3 nodes:
++++++++++++++++++++
[root@dhcp46-183 ~]# ls -l /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick| wc -l
469
[root@dhcp46-183 ~]#
[root@dhcp46-10 ~]# ls -l /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick | wc -l
469
[root@dhcp46-10 ~]#
[root@dhcp46-10 ~]# ls -l /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick | wc -l
469
[root@dhcp46-10 ~]#
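A note on the counts above: `ls -l | wc -l` also counts the leading "total" line that ls prints, and it does not show hidden entries such as the .glusterfs directory at the brick root. A possible way to compare old and new bricks without those effects is sketched below. The function name count_brick_entries is hypothetical, and the sketch assumes a find that supports -mindepth/-maxdepth (GNU and BSD find do).

```shell
# Count the visible top-level entries in a brick directory.
# Unlike `ls -l | wc -l`, this does not count the "total" header line,
# and it skips hidden entries such as the brick's .glusterfs directory.
# Hypothetical helper for illustration only.
count_brick_entries() {
    find "$1" -mindepth 1 -maxdepth 1 ! -name '.*' | wc -l | tr -d ' '
}

# Usage (run on a storage node, once per brick path from the volume info):
#   count_brick_entries <brick path>
```

Running it against each original and each newly added brick path should make it easy to see whether anything has landed on the new bricks.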
Actual results:
#####################
[root@dhcp46-119 heketi-cl1]# df -kh /mnt/heketi-cl1
Filesystem Size Used Avail Use% Mounted on
10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d 95G 5.2G 90G 6% /mnt/heketi-cl1
[root@dhcp46-119 heketi-cl1]# pwd
/mnt/heketi-cl1
[root@dhcp46-119 heketi-cl1]# dd if=/dev/zero of=limit count=102400 bs=1024
dd: failed to open ‘limit’: No space left on device
[root@dhcp46-119 heketi-cl1]# ls -l limit
ls: cannot access limit: No such file or directory
[root@dhcp46-119 heketi-cl1]#
Expected results:
#######################
After volume expansion, heketi should automatically rebalance the volume so that existing files are redistributed and new files are also created on the newly added bricks.
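On a build without the fix, the rebalance can presumably be started by hand from one of the gluster nodes. The sketch below wraps the standard `gluster volume rebalance <VOLNAME> start` and `status` commands. The wrapper name rebalance_volume and the DRY_RUN variable are assumptions made for this example, and whether a manual rebalance is appropriate for a heketi-managed volume in a given deployment has not been verified here.

```shell
# Print (DRY_RUN=1, the default) or run the gluster commands that start a
# rebalance on a volume and then report its status.
# rebalance_volume and DRY_RUN are hypothetical names for this sketch.
rebalance_volume() {
    vol=$1
    for action in start status; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gluster volume rebalance $vol $action"
        else
            gluster volume rebalance "$vol" "$action" || return 1
        fi
    done
}

# Dry run with the volume name from this report; only prints the commands.
rebalance_volume vol_b429e8ea1198100932cb4d41f99a9c2d
```

Setting DRY_RUN=0 would execute the commands instead of printing them; the status command may need to be repeated until the rebalance is reported as completed.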
Additional info:
#####################
There is already a similar bug for a heketi setup that is part of OCP:
Bug 1566590 - expanded capacity not usable after expanding a volume using heketi
Current bricks in volume
############################
Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Size: 95
Id: b429e8ea1198100932cb4d41f99a9c2d
Cluster Id: 0e5c10a90f4906e65beaf52b12a458d9
Mount: 10.70.47.56:vol_b429e8ea1198100932cb4d41f99a9c2d
Mount Options: backup-volfile-servers=10.70.46.10,10.70.46.183
Durability Type: replicate
Replica: 3
Snapshot: Disabled
Bricks:
Id: 165d55f4ce1ade235449ef6c3638c918
Path: /var/lib/heketi/mounts/vg_c521e784cf1307d589480949d36e6795/brick_165d55f4ce1ade235449ef6c3638c918/brick
Size (GiB): 22
Node: a4476d78aa9f6d018243679ca7339eb5
Device: c521e784cf1307d589480949d36e6795
Id: 175370043b37a36d0fd60ae17b4625ad
Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_175370043b37a36d0fd60ae17b4625ad/brick
Size (GiB): 22
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 0c99ee1161cef01b3ea64f8faaac4b49
Id: 2ace2ccc5aa724197fabab8661620285
Path: /var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_2ace2ccc5aa724197fabab8661620285/brick
Size (GiB): 22
Node: a4476d78aa9f6d018243679ca7339eb5
Device: f317850238ea325aa4753323968c33ff
Id: 2e82eeea855d9ee0807f639feb94c285
Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
Size (GiB): 5
Node: 260e92fc5c44835e8286707ef966c7c8
Device: c65f1dfb4ca24e116ac88f12d643aa3e
Id: 3f8e60eb4ebd2ad0173d3096c30cc9bf
Path: /var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_3f8e60eb4ebd2ad0173d3096c30cc9bf/brick
Size (GiB): 22
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 80848549bed2416ac2d1d63df2b2ee4e
Id: 7ad232ddcb3921abcc84668ea61050b7
Path: /var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_7ad232ddcb3921abcc84668ea61050b7/brick
Size (GiB): 22
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 80848549bed2416ac2d1d63df2b2ee4e
Id: 923ccaadd9a8ce3860c3fd48c8bb9157
Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_923ccaadd9a8ce3860c3fd48c8bb9157/brick
Size (GiB): 22
Node: a4476d78aa9f6d018243679ca7339eb5
Device: 3501c6b59089d297785e73ff7029deb9
Id: 9b8670e234d611fdc9f3182207bdb05a
Path: /var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_9b8670e234d611fdc9f3182207bdb05a/brick
Size (GiB): 22
Node: 260e92fc5c44835e8286707ef966c7c8
Device: 98785d4009586b7355abb23b26844b5f
Id: 9cd086d33d31c849b2f01c7352fca63a
Path: /var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
Size (GiB): 5
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 0c99ee1161cef01b3ea64f8faaac4b49
Id: c73d1610f5a501c85fc4a59e3374ff3a
Path: /var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
Size (GiB): 5
Node: a4476d78aa9f6d018243679ca7339eb5
Device: 3501c6b59089d297785e73ff7029deb9
Id: d11ce8ab5ee0c1e3dd035d4647f7f18b
Path: /var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_d11ce8ab5ee0c1e3dd035d4647f7f18b/brick
Size (GiB): 22
Node: 260e92fc5c44835e8286707ef966c7c8
Device: c65f1dfb4ca24e116ac88f12d643aa3e
Id: e6ce620b00756da38ac83efb5d3fc220
Path: /var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_e6ce620b00756da38ac83efb5d3fc220/brick
Size (GiB): 22
Node: a4476d78aa9f6d018243679ca7339eb5
Device: f317850238ea325aa4753323968c33ff
Id: e927000cf2fac9f7698a9492066d5305
Path: /var/lib/heketi/mounts/vg_c849d3e3875d3ee9ff70837d2303b49b/brick_e927000cf2fac9f7698a9492066d5305/brick
Size (GiB): 22
Node: 260e92fc5c44835e8286707ef966c7c8
Device: c849d3e3875d3ee9ff70837d2303b49b
Id: f44780bb53e0e54129038ad579ec4749
Path: /var/lib/heketi/mounts/vg_4fc7078ed83754c2c1371d262d30ade9/brick_f44780bb53e0e54129038ad579ec4749/brick
Size (GiB): 22
Node: 3abd7e3ec79863e187da6bcdc3a36399
Device: 4fc7078ed83754c2c1371d262d30ade9
Id: fdd6eaed62648e71e7101f5434080752
Path: /var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_fdd6eaed62648e71e7101f5434080752/brick
Size (GiB): 22
Node: 260e92fc5c44835e8286707ef966c7c8
Device: 98785d4009586b7355abb23b26844b5f
[root@dhcp47-56 ~]# gluster v info vol_b429e8ea1198100932cb4d41f99a9c2d
Volume Name: vol_b429e8ea1198100932cb4d41f99a9c2d
Type: Distributed-Replicate
Volume ID: 254ba50e-98c8-4c85-aade-dda3cf97f32b
Status: Started
Snapshot Count: 0
Number of Bricks: 5 x 3 = 15
Transport-type: tcp
Bricks:
Brick1: 10.70.47.56:/var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_2e82eeea855d9ee0807f639feb94c285/brick
Brick2: 10.70.46.10:/var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_9cd086d33d31c849b2f01c7352fca63a/brick
Brick3: 10.70.46.183:/var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_c73d1610f5a501c85fc4a59e3374ff3a/brick
Brick4: 10.70.46.183:/var/lib/heketi/mounts/vg_3501c6b59089d297785e73ff7029deb9/brick_923ccaadd9a8ce3860c3fd48c8bb9157/brick
Brick5: 10.70.46.10:/var/lib/heketi/mounts/vg_0c99ee1161cef01b3ea64f8faaac4b49/brick_175370043b37a36d0fd60ae17b4625ad/brick
Brick6: 10.70.47.56:/var/lib/heketi/mounts/vg_c849d3e3875d3ee9ff70837d2303b49b/brick_e927000cf2fac9f7698a9492066d5305/brick
Brick7: 10.70.46.10:/var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_7ad232ddcb3921abcc84668ea61050b7/brick
Brick8: 10.70.46.183:/var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_e6ce620b00756da38ac83efb5d3fc220/brick
Brick9: 10.70.47.56:/var/lib/heketi/mounts/vg_c65f1dfb4ca24e116ac88f12d643aa3e/brick_d11ce8ab5ee0c1e3dd035d4647f7f18b/brick
Brick10: 10.70.47.56:/var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_fdd6eaed62648e71e7101f5434080752/brick
Brick11: 10.70.46.10:/var/lib/heketi/mounts/vg_80848549bed2416ac2d1d63df2b2ee4e/brick_3f8e60eb4ebd2ad0173d3096c30cc9bf/brick
Brick12: 10.70.46.183:/var/lib/heketi/mounts/vg_f317850238ea325aa4753323968c33ff/brick_2ace2ccc5aa724197fabab8661620285/brick
Brick13: 10.70.46.10:/var/lib/heketi/mounts/vg_4fc7078ed83754c2c1371d262d30ade9/brick_f44780bb53e0e54129038ad579ec4749/brick
Brick14: 10.70.46.183:/var/lib/heketi/mounts/vg_c521e784cf1307d589480949d36e6795/brick_165d55f4ce1ade235449ef6c3638c918/brick
Brick15: 10.70.47.56:/var/lib/heketi/mounts/vg_98785d4009586b7355abb23b26844b5f/brick_9b8670e234d611fdc9f3182207bdb05a/brick
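The "Number of Bricks: 5 x 3 = 15" line above reads as five replica sets of three bricks each: the original 1 x 3 set (the three 5 GiB bricks) plus four sets added by the expansion (the twelve 22 GiB bricks). A small sketch for pulling those numbers out of the `gluster v info` output when checking a volume after expansion follows. The function name parse_brick_layout is hypothetical, and the pattern only handles the plain "D x R = T" form shown here, not other layouts such as arbiter volumes.

```shell
# Parse a "Number of Bricks: D x R = T" line from `gluster volume info`
# and print "D R T"; fail if the line does not match or D * R != T.
# Hypothetical helper for illustration only.
parse_brick_layout() {
    set -- $(echo "$1" | sed -n 's/^Number of Bricks: \([0-9]*\) x \([0-9]*\) = \([0-9]*\)$/\1 \2 \3/p')
    [ $# -eq 3 ] && [ $(( $1 * $2 )) -eq "$3" ] || return 1
    echo "$1 $2 $3"
}

# Line taken from the output above.
parse_brick_layout "Number of Bricks: 5 x 3 = 15"   # prints "5 3 15"
```

The first number is the count of distribute subvolumes, so a value greater than 1 after expansion is the case where a rebalance is needed for the new bricks to receive data.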
(In reply to John Mulligan from comment #3)
> AFAIK this is resolved in master:
> https://github.com/heketi/heketi/commit/d54452104da3fe243279316977135e635550cbff
This is correct. The tested build in the description was the CNS 3.9.0 build.
It should be fixed in the builds that include the arbiter feature.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:2686