1356991 – Provide 'Critical Note' on availability of heketi database being dependent on availability of relevant RHGS Trusted Storage Pool

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	doc-Container_Native_Storage_with_OpenShift
Sub Component:
Version:	rhgs-3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	CNS 3.4
Assignee:	Bhavana
QA Contact:	krishnaram Karthick
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1385254

Reported:	2016-07-15 12:51 UTC by Neha
Modified:	2017-01-23 07:22 UTC
CC List:	13 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: Like any application which uses a volume from GlusterFS, if the cluster serving the volume is down, the volume will not be available. Heketi stores its database on a volume on GlusterFS, therefore, if the volume is down, Heketi will not be able to start. Consequence: Heketi service is no longer responding due to the unavailability of the volume served by a disabled trusted storage pool. Fix: Restart the GlusterFS cluster which contains the Heketi volume. Result: Heketi and all applications depending on volumes from the specified GlusterFS cluster will continue to operate as normal.
Clone Of:
Clones:	1358189
Environment:
Last Closed:	2017-01-23 07:22:02 UTC
Embargoed:
Dependent Products:

Attachments
Updated important note (83.92 KB, image/png) 2017-01-06 11:33 UTC, Bhavana

Description Neha 2016-07-15 12:51:41 UTC

If the TSP (trusted storage pool) where the heketidbstorage volume is created is down, the heketi service will not work.

So in this scenario we will not be able to perform any operations on the other TSPs either.

With the current implementation, the heketi service will fail even if only two of the nodes in the TSP are down (BZ 1355689).

Also, heketi currently has no capability for node/device replacement.

Comment 2 Luis Pabón 2016-07-18 14:15:01 UTC

This issue can only be fixed if GlusterFS supports cross-TSP volumes, which it currently does not.

We need to educate the administrator that the database is in a single TSP, therefore, if the TSP is down, then Heketi and ALL volumes on that TSP will also be down.
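
The dependency described above can be illustrated with a small sketch. This is not Heketi or GlusterFS code: the class, the function names, and the pool names are hypothetical, and the rule that a pool keeps serving volumes only while a strict majority of its nodes is up is an assumption chosen to match the "two nodes down" case from the description.

```python
from dataclasses import dataclass


@dataclass
class TrustedStoragePool:
    """Hypothetical model of a TSP; not a Heketi or GlusterFS type."""
    name: str
    nodes_total: int
    nodes_up: int
    hosts_heketi_db: bool = False

    def is_serving(self) -> bool:
        # Assumption: volumes stay available only while a strict
        # majority of the pool's nodes is up.
        return self.nodes_up * 2 > self.nodes_total


def heketi_available(pools):
    """Heketi can start only if the single pool holding its database serves volumes."""
    db_pools = [p for p in pools if p.hosts_heketi_db]
    return len(db_pools) == 1 and db_pools[0].is_serving()


def manageable_pools(pools):
    """Names of pools that can be operated on through Heketi.

    If Heketi itself is down, nothing can be managed through it,
    including pools that are otherwise healthy.
    """
    if not heketi_available(pools):
        return []
    # Assumption: a pool that is not serving cannot be operated on either.
    return [p.name for p in pools if p.is_serving()]


if __name__ == "__main__":
    pools = [
        TrustedStoragePool("tsp-a", nodes_total=3, nodes_up=1, hosts_heketi_db=True),
        TrustedStoragePool("tsp-b", nodes_total=3, nodes_up=3),
    ]
    print(heketi_available(pools), manageable_pools(pools))
```

In this model, losing two of the three nodes in the pool that hosts the database makes every pool unmanageable through Heketi, even though the second pool is fully up.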

Comment 8 Michael Adam 2016-10-27 12:54:36 UTC

Rejy, this is only a doc issue, not a code issue. How to treat the bz?

Comment 9 Rejy M Cyriac 2016-10-27 18:24:38 UTC

(In reply to Michael Adam from comment #8)
> Rejy, this is only a doc issue, not a code issue. How to treat the bz?

Moving this to the relevant documentation component so that an appropriate 'critical' note can be added to the guide.

Please provide the content to be documented.

Comment 10 Michael Adam 2016-10-27 21:26:17 UTC

To my understanding the content is already in the doc-text field.

Comment 11 Bhavana 2016-11-07 10:24:13 UTC

Hi Michael,

Based on my understanding, I think this note fits well under section 4.4 right after the example of the topology file:

https://access.redhat.com/documentation/en/red-hat-gluster-storage/3.1/paged/container-native-storage-for-openshift-container-platform/44-setting-up-the-heketi-server

Before I go ahead and add the details, I want to know if you have any other suggestions.

Thanks.

Comment 12 Michael Adam 2016-11-07 14:06:22 UTC

Bhavana, 

yes, that seems like the correct place!

Thanks.

Comment 13 Bhavana 2016-11-10 07:24:00 UTC

An "Important" block has been added in the "Setting up Heketi Server" section, right after the description of the topology file:

https://access.qa.redhat.com/documentation/en/red-hat-gluster-storage/3.1/single/container-native-storage-for-openshift-container-platform/#idm139855790944944

Comment 16 krishnaram Karthick 2017-01-06 06:49:00 UTC

Can you confirm if this is the change that is made?

 Heketi stores its database on a Red Hat Gluster Storage volume. In cases where the volume is down, the Heketi service does not respond due to the unavailability of the volume served by a disabled trusted storage pool.
To resolve this issue, restart the trusted storage pool which contains the Heketi volume

I feel that the message is a little hazy. Can this be rephrased in a better way?

suggestion:
Heketi stores its database on a Red Hat Gluster Storage volume. Heketi service does not respond if the volume is down.

To resolve this issue, restart the gluster pods hosting which contains the Heketi volume

Comment 17 Bhavana 2017-01-06 07:28:24 UTC

The changes are made:

Regarding "To resolve this issue, restart the gluster pods hosting which contains the Heketi volume", I am guessing you meant:

"To resolve this issue, restart the gluster pods hosting the Heketi volume"

Following is the updated link:

http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#idm139829282703536

Comment 18 krishnaram Karthick 2017-01-06 11:05:54 UTC

 Heketi stores its database on a Red Hat Gluster Storage volume. In cases where the volume is down, the Heketi service does not respond due to the unavailability of the volume served by a disabled trusted storage pool. To resolve this issue, restart the trusted storage pool which contains the Heketi volume. 

This is what I see in the link provided in comment 17. Can you confirm if the link provided is the right one?

Comment 19 Bhavana 2017-01-06 11:33:19 UTC

Created attachment 1237943 [details]
Updated important note

Hi krk,

I am guessing it must be a caching issue. Please try reloading the page, because I can see the updated changes at the same link. I have also attached a screenshot for reference.

Comment 20 Bhavana 2017-01-06 11:34:52 UTC

I am moving it back to ON_QA. You can ping me in case you have any questions.

Comment 21 krishnaram Karthick 2017-01-06 11:54:53 UTC

The changes are incorporated. Moving the bug to VERIFIED.
