Bug 1699384 - [GSS] Large BHV with multiple distribute sets can hit storage.reserve percent in some cases
Summary: [GSS] Large BHV with multiple distribute sets can hit storage.reserve percent...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhgs-server-container
Version: ocs-3.11
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 3.11.z Batch Update 6
Assignee: Raghavendra Talur
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On: 1787331
Blocks:
 
Reported: 2019-04-12 14:48 UTC by Anton Mark
Modified: 2020-12-24 03:38 UTC
CC: 17 users

Fixed In Version: rhgs-server-container-3.11.6-7
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1787331
Environment:
Last Closed: 2020-12-17 04:29:04 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:5601 0 None None None 2020-12-17 04:29:11 UTC

Description Anton Mark 2019-04-12 14:48:07 UTC
Description of problem:
A large BHV with multiple distribute sets can hit the storage.reserve percentage in some cases. This causes block volumes to become inaccessible to the host when XFS is unable to complete writes. It appears to occur when there is enough of a utilization imbalance between subvolumes that one set of bricks is pushed over the default 1% storage.reserve percentage.
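The failure mode can be sketched numerically (a minimal model using made-up usage figures, not gluster code): DHT places files on subvolumes by hash, so usage can skew, and one subvolume can fall under its reserve while the BHV as a whole still reports free space.

```python
# Minimal model of the reported failure (not gluster code): two DHT
# subvolumes backing one 1.5Ti BHV, 750Gi each, with the default
# storage.reserve of 1%.
SUBVOL_SIZE_GIB = 750
RESERVE_PCT = 1  # gluster default storage.reserve

def writes_blocked(used_gib):
    """A subvolume rejects writes once free space falls below its reserve."""
    free = SUBVOL_SIZE_GIB - used_gib
    return free < SUBVOL_SIZE_GIB * RESERVE_PCT / 100

# Hypothetical skewed distribution: aggregate usage is 1480/1500 Gi,
# but subvol A holds 745 Gi while subvol B holds 735 Gi.
usage = {"subvol_a": 745, "subvol_b": 735}
blocked = {s: writes_blocked(u) for s, u in usage.items()}
print(blocked)  # subvol_a trips the 7.5 Gi reserve; subvol_b does not
```

With only a 10 Gi skew, subvol A has 5 Gi free against a 7.5 Gi reserve and starts rejecting writes, even though the BHV still has 20 Gi free overall.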


Version-Release number of selected component (if applicable):
OCP 3.11.43
OCS 3.11 
gluster-block-0.2.1-28.el7rhgs.x86_64
glusterfs-3.12.2-18.2.el7rhgs.x86_64
rhgs-server-rhel7 3.11.0-5
rhgs-volmanager-rhel7 3.11.0-5


How reproducible:
I've seen this occur once in the wild. I think it would be challenging to reproduce because it relies on the random chance of enough files being distributed to one subvolume that it crosses the 1% reserve. It might just be a case of iterating enough times until the right conditions occur.


Steps to Reproduce:
1. Set a BHV size of 1.5Ti with the largest storage device being 1Ti. This will result in 2 distribute sets of 750Gi.
2. Provision block volumes until BHV is very close to full.
3. At some point one of the subvolumes will cross the default 1% reserve limit and tcmu-runner will no longer be able to service write requests.
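The sizing in step 1 can be sketched as follows (a back-of-the-envelope calculation using the round figures from this report, not Heketi's actual allocator):

```python
import math

# Rough sketch of why a 1.5Ti BHV on 1Ti devices ends up as two
# distribute legs (illustrative math only, not Heketi code): the
# volume cannot fit on one replica set, so it is split.
def distribute_sets(bhv_gib, max_device_gib):
    legs = math.ceil(bhv_gib / max_device_gib)
    return legs, bhv_gib / legs  # (number of legs, size of each leg)

legs, leg_gib = distribute_sets(1500, 1000)  # round figures from the report
reserve_gib = leg_gib * 0.01                 # default 1% storage.reserve
print(legs, leg_gib, reserve_gib)  # 2 legs of 750 Gi, 7.5 Gi reserve each
```

So the headroom per leg is only about 7.5 Gi: once the BHV is nearly full, a hash-distribution skew smaller than 1% of the BHV is enough to trip the reserve on one leg.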


Actual results:
Block volumes hosted from the BHV fail to write to the XFS filesystem.


Expected results:
An almost-full BHV should not cause block volume write failures.

Additional info:
Not sure whether this should be handled by Heketi or by gluster-block engineering. Starting with gluster-block since that seems to make the most sense to me.

Comment 10 Mohit Agrawal 2019-10-20 16:52:09 UTC
Upstream patch to allow overwriting data even when the disk-space check fails:
https://review.gluster.org/#/c/glusterfs/+/23572/
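The idea behind the change, roughly, is to make the reserve check distinguish allocating writes from in-place overwrites (a simplified sketch of that decision, not the actual glusterfs posix-translator code):

```python
# Simplified sketch of the fix's decision logic (not glusterfs code):
# when free space is under the reserve, reject writes that would
# allocate new space, but let overwrites of already-allocated blocks
# through, so backing files such as gluster-block stores keep working.
def allow_write(free_pct, reserve_pct, is_overwrite):
    if free_pct >= reserve_pct:
        return True       # above the reserve: all writes allowed
    return is_overwrite   # under the reserve: only in-place overwrites

print(allow_write(0.5, 1.0, True))   # overwrite under reserve: allowed
print(allow_write(0.5, 1.0, False))  # new allocation under reserve: rejected
```

This matters for gluster-block because the block volumes are preallocated files on the BHV; writes into them do not grow the file, so refusing them because of the reserve only breaks the hosted filesystem without protecting any space.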

Comment 27 Manisha Saini 2020-11-20 11:12:25 UTC
Verified this BZ with

openshift_storage_glusterfs_image='registry-proxy.engineering.redhat.com/rh-osbs/rhgs3-rhgs-server-rhel7:3.11.6-8'

openshift_storage_glusterfs_heketi_image='registry-proxy.engineering.redhat.com/rh-osbs/rhgs3-rhgs-volmanager-rhel7:3.11.6-4'

openshift_storage_glusterfs_block_image='registry-proxy.engineering.redhat.com/rh-osbs/rhgs3-rhgs-gluster-block-prov-rhel7:3.11.6-3'


Started creating block volumes along with busybox app pods on a block hosting device of 1.5T until the block hosting volume was almost full.
Did not observe any abnormal behavior or failures in services while writing I/O to the app pods as the block hosting volume approached full. Moving this BZ to verified state.

Comment 30 errata-xmlrpc 2020-12-17 04:29:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Storage 3.11.z Container Images Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5601

