Bug 1516211 - describe in detail what to do when ImportCluster task fails
Summary: describe in detail what to do when ImportCluster task fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: doc-RHGS_Web_Administration
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.3.1
Assignee: Rakesh
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Duplicates: 1513003
Depends On:
Blocks:
Reported: 2017-11-22 09:49 UTC by Martin Bukatovic
Modified: 2018-05-30 17:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-30 17:59:43 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | Tendrl documentation issues 94 | 0 | None | None | None | 2017-11-24 09:57:43 UTC
Github | Tendrl node-agent issues 662 | 0 | None | None | None | 2017-11-24 09:58:25 UTC
Red Hat Bugzilla | 1514178 | 0 | unspecified | CLOSED | document that task details page (on other details) are no longer available after few days | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1516135 | 0 | unspecified | CLOSED | When import fails, the import button should be accessible only after unmanage | 2021-12-10 15:25:43 UTC
Red Hat Bugzilla | 1517065 | 0 | unspecified | CLOSED | [Web-Admin] Add a new Chapter describing "Cluster Expansion" | 2021-02-22 00:41:40 UTC

Internal Links: 1514178 1516135 1517065

Description Martin Bukatovic 2017-11-22 09:49:55 UTC
Document URL
============

Related to documentation for RHGS WA.

Describe the issue
==================

Since Tendrl can't recover from an import cluster failure[1], we need to
document what to do when one is stuck with an unimportable cluster.

There are multiple different aspects of this problem we need to tackle:

* ImportCluster failed for the 1st cluster I'm trying to import
* ImportCluster failed, but I already have another cluster imported

* ImportCluster failed, and I see the reason why in the events log (on the
  Task Details page), in some of the error messages there
* ImportCluster failed, but I don't see the reason in the events log
  on the Task Details page

[1] as described in BZ 1516135

Suggestions for improvement
===========================

Given the current limitation, describe what to do in all use cases listed
above, with an additional note for some peculiar combinations if needed.
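
The two groups of aspects listed above are independent of each other, so they
combine into four scenarios to document and retest. A minimal sketch that
enumerates them follows; the labels and the function name are hypothetical and
are not part of Tendrl or RHGS WA:

```python
from itertools import product

# Hypothetical labels for the two independent aspects named in this report.
CLUSTER_STATE = ("1st cluster being imported", "another cluster already imported")
REASON_STATE = ("reason visible in events log", "reason not visible in events log")


def import_failure_scenarios():
    """Return every combination of the two aspects as (cluster, reason) pairs."""
    return list(product(CLUSTER_STATE, REASON_STATE))


# Print a numbered checklist of the combinations.
for number, (cluster, reason) in enumerate(import_failure_scenarios(), start=1):
    print(f"{number}. ImportCluster failed: {cluster}; {reason}")
```

Each printed line is one combination that the documentation would need to
cover, and that QE could retry during verification.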

QE team would need to retry the scenarios during verification.

As of today, we are not aware of any recovery option that covers all use
cases listed above. The only description available[2] so far states:

> In case of failed imports as of today, users should refer Tendrl
> documentation (TODO) for cleaning up Tendrl central store and re-trying
> the Import after that.

The TODO item listed above probably refers to this section in upstream docs:

https://github.com/Tendrl/documentation/wiki/Tendrl-release-v1.5.4-(install-guide)#uninstall-tendrl

which doesn't provide a full list of:

* services to be stopped
* packages to be removed
* directories/files to be deleted

It also notes the possibility of backing up etcd and removing the etcd
database, but doesn't discuss how to restore the backup (or whether that is
possible).

This means that this solution is applicable only to the 1st use case
described above.

Moreover, we should also note that the events log on the task details page
will be cleaned after a few days, as stated in BZ 1514178.

[2] https://github.com/Tendrl/node-agent/issues/662#issuecomment-345279345

Comment 2 Martin Bukatovic 2017-11-22 09:52:12 UTC
bobbGT, feel free to reassign this BZ to the correct component.

Comment 3 Martin Bukatovic 2017-11-22 09:53:29 UTC
Could you provide recovery details for all use cases listed in this BZ?

The work on this BZ is blocked until this information is provided.

Comment 4 Lubos Trilety 2017-11-24 09:26:07 UTC
*** Bug 1513003 has been marked as a duplicate of this bug. ***

Comment 7 Rakesh 2017-12-11 22:00:03 UTC
Moving to ON_QA to follow BZ: 1502877.

Comment 8 Lubos Trilety 2017-12-12 09:10:32 UTC
The steps described in the monitoring guide
https://access.qa.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/monitoring_guide/#troubleshooting
are pretty simple: for all mentioned scenarios, the guide just says to uninstall RHGSWA and install it again. That will work, but if, for example, another cluster is already imported and functional, RHGSWA loses it too, because one of the uninstall RHGSWA/Unmanage Cluster steps removes the data on the RHGSWA server and a second one removes etcd completely. Both are mentioned as optional, but for a failed import I am pretty sure they have to be done, at least the etcd removal.

Other issues are already listed in Bug 1502877.

Comment 18 Nishanth Thomas 2017-12-15 09:21:01 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1526338

Comment 19 Pratik Mulay 2017-12-15 10:14:48 UTC
Hi Team,

I've made the required change. Following is the link to the updated content:

https://doc-stage.usersys.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/monitoring_guide/#unmanaging_cluster

Let me know in case of any concerns.

Comment 21 Lubos Trilety 2017-12-15 11:10:31 UTC
All demands in this BZ were processed properly. There's a note that un-managing one cluster will result in all the clusters currently managed by Web Administration being un-managed.

