Bug 1596655
| Summary: | Unable to fix (rerun) failed cluster expand task |
|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage |
| Component: | web-admin-tendrl-gluster-integration |
| Reporter: | Daniel Horák <dahorak> |
| Assignee: | Shubhendu Tripathi <shtripat> |
| QA Contact: | Daniel Horák <dahorak> |
| Docs Contact: | |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | unspecified |
| Version: | unspecified |
| CC: | apaladug, julim, mbukatov, negupta, nthomas, rhs-bugs, sankarshan |
| Target Milestone: | --- |
| Target Release: | RHGS 3.4.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | tendrl-commons-1.6.3-12.el7rhgs |
| Doc Type: | If docs needed, set a value |
| Doc Text: | |
| Story Points: | --- |
| Clone Of: | |
| Environment: | |
| Last Closed: | 2018-09-04 07:08:24 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | |
| Bug Blocks: | 1503137 |
| Attachments: | |
Description
Daniel Horák
2018-06-29 11:58:52 UTC
This is a bug, now an RFE.

In reviewing the suggested text, I made some minor edits. Try this one:
"If cluster expansion fails, check if tendrl-ansible was executed successfully and ensure the node agents are correctly configured. If cluster expansion failed due to errors, resolve the errors on affected nodes and re-initiate the Expand Cluster action."

The QE team will try to inflict 2 different errors during expand (e.g. breaking yum repos as described in this BZ, and cutting one machine off) and check that it is possible to recover by following the tooltip text (see comment 7). Any problem beyond that would require a separate bugzilla, with a description of the particular expand error.

Created attachment 1474988
Expand Cluster button on Hosts page is disabled when Expansion task failed

Moving back to ASSIGNED, because it is not possible to relaunch a previously failed Expansion task from the "Hosts" page. The "Expand Cluster" button is visible but disabled (see attached screenshot).

Version-Release number of selected component (if applicable):
Red Hat Gluster Web Administration Server:
tendrl-ansible-1.6.3-6.el7rhgs.noarch
tendrl-api-1.6.3-5.el7rhgs.noarch
tendrl-api-httpd-1.6.3-5.el7rhgs.noarch
tendrl-commons-1.6.3-11.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-8.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-8.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-9.el7rhgs.noarch
Red Hat Gluster Storage Server:
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-11.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-9.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch

Note: It is possible to relaunch the failed Expansion from the Clusters page, from the menu under the three dots on the right side of the particular cluster line.

>> ASSIGNED

Tested and Verified on two scenarios:
* disabling the RHGS WA repo(s) on one of the expanded Gluster Storage Servers
* stopping tendrl-node-agent on one of the expanded Gluster Storage Servers
In both cases, it was possible to relaunch the "expand" cluster task, and
once the simulated issue was corrected, the expand job passed.
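
The two scenarios above can be written down as a small dry-run plan. This is only a sketch of how such a check might be organised, not the procedure QE actually ran: the repo id `example-rhgs-wa-repo` is a hypothetical placeholder, the use of `yum-config-manager` (from yum-utils) and `systemctl` is an assumption about how the faults were injected, and the ordering of steps is inferred from the description. The script only prints the steps; it does not execute anything.

```python
import shlex

# Hypothetical placeholder: the real repo id depends on the subscription setup.
WA_REPO_ID = "example-rhgs-wa-repo"

# Assumed commands for injecting and reverting each simulated fault.
SCENARIOS = {
    "repo_disabled": {
        "inject": ["yum-config-manager", "--disable", WA_REPO_ID],
        "revert": ["yum-config-manager", "--enable", WA_REPO_ID],
    },
    "node_agent_stopped": {
        "inject": ["systemctl", "stop", "tendrl-node-agent"],
        "revert": ["systemctl", "start", "tendrl-node-agent"],
    },
}


def render(argv):
    """Render an argv list as a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)


def plan(scenario):
    """Return the ordered, human-readable steps for one scenario."""
    cmds = SCENARIOS[scenario]
    return [
        "on storage node: " + render(cmds["inject"]),
        "in web UI: start Expand Cluster and wait for the task to fail",
        "on storage node: " + render(cmds["revert"]),
        "in web UI: relaunch Expand Cluster and wait for the task to finish",
    ]


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name)
        for step in plan(name):
            print("  " + step)
```

Keeping the plan as data makes it easy to add further failure cases later; per the earlier comment, anything beyond these two would be tracked in a separate bugzilla.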
Version-Release number of selected component (if applicable):
Red Hat Gluster Web Administration Server:
Red Hat Enterprise Linux Server release 7.5 (Maipo)
collectd-5.7.2-3.1.el7rhgs.x86_64
collectd-ping-5.7.2-3.1.el7rhgs.x86_64
etcd-3.2.7-1.el7.x86_64
grafana-4.3.2-3.el7rhgs.x86_64
libcollectdclient-5.7.2-3.1.el7rhgs.x86_64
python-etcd-0.4.5-2.el7rhgs.noarch
rubygem-etcd-0.3.0-2.el7rhgs.noarch
tendrl-ansible-1.6.3-6.el7rhgs.noarch
tendrl-api-1.6.3-5.el7rhgs.noarch
tendrl-api-httpd-1.6.3-5.el7rhgs.noarch
tendrl-commons-1.6.3-12.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-10.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-10.el7rhgs.noarch
tendrl-node-agent-1.6.3-10.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-10.el7rhgs.noarch
Red Hat Gluster Storage Server:
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Red Hat Gluster Storage Server 3.4.0
collectd-5.7.2-3.1.el7rhgs.x86_64
collectd-ping-5.7.2-3.1.el7rhgs.x86_64
glusterfs-3.12.2-16.el7rhgs.x86_64
glusterfs-api-3.12.2-16.el7rhgs.x86_64
glusterfs-cli-3.12.2-16.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-16.el7rhgs.x86_64
glusterfs-events-3.12.2-16.el7rhgs.x86_64
glusterfs-fuse-3.12.2-16.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-16.el7rhgs.x86_64
glusterfs-libs-3.12.2-16.el7rhgs.x86_64
glusterfs-rdma-3.12.2-16.el7rhgs.x86_64
glusterfs-server-3.12.2-16.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libcollectdclient-5.7.2-3.1.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
python2-gluster-3.12.2-16.el7rhgs.x86_64
python-etcd-0.4.5-2.el7rhgs.noarch
tendrl-collectd-selinux-1.5.4-2.el7rhgs.noarch
tendrl-commons-1.6.3-12.el7rhgs.noarch
tendrl-gluster-integration-1.6.3-9.el7rhgs.noarch
tendrl-node-agent-1.6.3-10.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
>> VERIFIED
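
The package lists for the failing and the verified runs differ in only a few builds. As a reading aid, the following sketch compares two such lists and reports which packages changed. The helper names `split_nvr` and `changed` are made up for this illustration, the sample data is a subset copied from the lists in this report, and the parsing assumes every entry has the `name-version-release.arch` shape used there.

```python
def split_nvr(pkg):
    """Split 'name-version-release.arch' into (name, 'version-release')."""
    # The arch is the text after the last dot; version and release are the
    # last two dash-separated fields of what remains.
    nvr, _, _arch = pkg.rpartition(".")
    name, version, release = nvr.rsplit("-", 2)
    return name, f"{version}-{release}"


def changed(before, after):
    """Map each package present in both lists to (old, new) if they differ."""
    old = dict(split_nvr(p) for p in before)
    new = dict(split_nvr(p) for p in after)
    return {
        name: (old[name], new[name])
        for name in sorted(old)
        if name in new and old[name] != new[name]
    }


# Subset of the server-side lists quoted in this report.
FAILING = [
    "tendrl-commons-1.6.3-11.el7rhgs.noarch",
    "tendrl-node-agent-1.6.3-9.el7rhgs.noarch",
    "tendrl-ui-1.6.3-9.el7rhgs.noarch",
    "tendrl-notifier-1.6.3-4.el7rhgs.noarch",
]
VERIFIED = [
    "tendrl-commons-1.6.3-12.el7rhgs.noarch",
    "tendrl-node-agent-1.6.3-10.el7rhgs.noarch",
    "tendrl-ui-1.6.3-10.el7rhgs.noarch",
    "tendrl-notifier-1.6.3-4.el7rhgs.noarch",
]

if __name__ == "__main__":
    for name, (old, new) in changed(FAILING, VERIFIED).items():
        print(f"{name}: {old} -> {new}")
```

For this subset the comparison flags tendrl-commons, tendrl-node-agent and tendrl-ui, and leaves tendrl-notifier out because its build is unchanged. The tendrl-commons change matches the "Fixed In Version" field of this bug.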
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:2616