If the TSP where the heketidbstorage volume is created is down, the heketi service will not work. In this scenario we will not be able to perform any operations on the other TSPs either. With the current implementation, the heketi service will fail even if only two of the nodes in the TSP are down (BZ 1355689). Also, heketi currently has no capability for node/device replacement.
This issue can only be fixed if GlusterFS supports cross-TSP volumes, which it currently does not. We need to educate the administrator that the database is in a single TSP; therefore, if that TSP is down, Heketi and ALL volumes on that TSP will also be down.
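The dependency described above can be illustrated with a small Python sketch. This is only a toy model of the behavior reported in this bug, not Heketi code: the function `manageable_pools` and the pool names are hypothetical. It also assumes that a pool which is up stays manageable as long as Heketi itself is running.

```python
def manageable_pools(all_pools, down_pools, db_pool):
    """Return the pools that can still be managed through Heketi.

    Hypothetical model: if the pool holding the Heketi database volume
    is down, Heketi itself is down, so no pool can be managed at all.
    """
    down = set(down_pools)
    if db_pool in down:
        # The Heketi database is unreachable, so Heketi serves no requests.
        return []
    # Heketi is up; only the pools that are themselves down are unavailable.
    return [pool for pool in all_pools if pool not in down]


pools = ["tsp-a", "tsp-b", "tsp-c"]  # hypothetical pool names
print(manageable_pools(pools, ["tsp-b"], db_pool="tsp-a"))  # ['tsp-a', 'tsp-c']
print(manageable_pools(pools, ["tsp-a"], db_pool="tsp-a"))  # []
```

The second call shows the case this bug is about: losing the single pool that hosts the database makes every other pool unmanageable as well.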
Rejy, this is only a doc issue, not a code issue. How to treat the bz?
(In reply to Michael Adam from comment #8)
> Rejy, this is only a doc issue, not a code issue. How to treat the bz?

Moving this to the relevant documentation component so that an appropriate 'critical' note can be added to the guide. Please provide the content to be documented.
To my understanding the content is already in the doc-text field.
Hi Michael,

Based on my understanding, I think this note fits well under section 4.4, right after the example of the topology file:
https://access.redhat.com/documentation/en/red-hat-gluster-storage/3.1/paged/container-native-storage-for-openshift-container-platform/44-setting-up-the-heketi-server

Before I go ahead and add the details, I want to know if you have any other suggestions.

Thanks.
Bhavana, yes, that seems like the correct place! Thanks.
An "Important" block is added in the "Setting up Heketi Server" section right after the description of the topology file: https://access.qa.redhat.com/documentation/en/red-hat-gluster-storage/3.1/single/container-native-storage-for-openshift-container-platform/#idm139855790944944
Can you confirm if this is the change that is made?

"Heketi stores its database on a Red Hat Gluster Storage volume. In cases where the volume is down, the Heketi service does not respond due to the unavailability of the volume served by a disabled trusted storage pool. To resolve this issue, restart the trusted storage pool which contains the Heketi volume"

I feel that the message is a little hazy. Can this be rephrased in a better way?

Suggestion:
"Heketi stores its database on a Red Hat Gluster Storage volume. Heketi service does not respond if the volume is down. To resolve this issue, restart the gluster pods hosting which contains the Heketi volume"
The changes are made. With regard to "To resolve this issue, restart the gluster pods hosting which contains the Heketi volume", I am guessing you meant "To resolve this issue, restart the gluster pods hosting the Heketi volume".

Following is the updated link:
http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#idm139829282703536
"Heketi stores its database on a Red Hat Gluster Storage volume. In cases where the volume is down, the Heketi service does not respond due to the unavailability of the volume served by a disabled trusted storage pool. To resolve this issue, restart the trusted storage pool which contains the Heketi volume."

This is what I see in the link provided in comment 17. Can you confirm if the link provided is the right one?
Created attachment 1237943
Updated important note

Hi krk,

I am guessing it must be a caching issue. Please try reloading the page, because I can see the updated changes in the same link. I have also attached a screenshot for reference.
I am moving it back to ON_QA. You can ping me in case you have any questions.
The changes are incorporated. Moving the bug to VERIFIED.