Bug 1053330 - [RHEVM-RHS] RHSS Node doesn't come up after reinstalling it using RHEVM UI
Summary: [RHEVM-RHS] RHSS Node doesn't come up after reinstalling it using RHEVM UI
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: vdsm
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: Timothy Asir
QA Contact: Sudhir D
URL:
Whiteboard: gluster
Depends On:
Blocks:
 
Reported: 2014-01-15 03:27 UTC by SATHEESARAN
Modified: 2015-05-13 16:31 UTC
CC List: 10 users

Fixed In Version: 4.13.0-24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-16 10:46:43 UTC
Embargoed:


Attachments
ovirt host deploy log from RHEVM (189.94 KB, text/x-log)
2014-01-15 03:34 UTC, SATHEESARAN
RHEVM Screenshot showing "re-install" option (207.47 KB, image/png)
2014-01-15 03:54 UTC, SATHEESARAN

Description SATHEESARAN 2014-01-15 03:27:05 UTC
Description of problem:
-----------------------
Added an RHSS Node to a gluster-enabled cluster and it came online in the RHEVM UI.
After moving it to MAINTENANCE state and re-installing it using the RHEVM UI, the node does not come back online.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHSS  - glusterfs-3.4.0.57rhs-1.el6rhs
RHEVM - IS32 (3.3.0-0.45.el6ev)

How reproducible:
----------------
Happened in 2 out of 2 attempts


Steps to Reproduce:
------------------
1. Add the RHSS Node to a gluster-enabled cluster
2. Once the RHSS Node comes up in the RHEVM UI, put it into Maintenance state
3. Use the re-install option in the RHEVM UI


Actual results:
--------------
The RHSS Node never comes up in the RHEVM UI. It gives the error message: "Host 10.70.37.10 installation failed. Command returned failure code 1 during SSH session 'root.37.10'"


Expected results:
-----------------
The RHSS Node should come back online in the RHEVM UI


Additional info:
----------------
1. I hit this while testing the following use case:
a. Importing an existing gluster configuration doesn't set iptables rules on RHSS Nodes
https://bugzilla.redhat.com/show_bug.cgi?id=1051019
b. So I brought one of the RHSS Nodes to MAINTENANCE state using the RHEVM UI
c. After selecting that RHSS Node, the "General" tab shows the message: "Host is in maintenance mode, you can Activate it by pressing the Activate button. If you wish to upgrade or reinstall it click here."
d. Pressed the "here" link from the previous step, and re-installation began

But in the end the node did not come up and remained in DOWN state.

Comment 1 SATHEESARAN 2014-01-15 03:28:26 UTC
The VDSM host deploy logs show no errors, and I see the iptables rules are in place, though the state of the RHSS Node was shown as down.

This means bootstrapping has taken place again, adding those rules.
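
For reference, a quick way to confirm that bootstrapping re-applied the firewall rules (a minimal sketch; the exact rules to look for depend on the deployed firewall configuration):

<snip>
# List the current INPUT rules; the bootstrap-added gluster/vdsm rules
# should appear here after a successful re-install
iptables -L INPUT -n --line-numbers

# On RHEL 6, confirm the iptables service itself is running
service iptables status
</snip>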

Comment 2 SATHEESARAN 2014-01-15 03:34:45 UTC
Created attachment 850329 [details]
ovirt host deploy log from RHEVM

Comment 3 SATHEESARAN 2014-01-15 03:54:19 UTC
Created attachment 850331 [details]
RHEVM Screenshot showing "re-install" option

Comment 4 SATHEESARAN 2014-01-15 04:59:49 UTC
This bug is a manifestation of bug 1038038, https://bugzilla.redhat.com/show_bug.cgi?id=1038038

Since the gateway was not configured correctly, DNS resolution for the bricks was failing when glusterd was restarted.

This is evident from the glusterd logs:

<snip>
[2014-01-15 09:52:43.444950] I [glusterd.c:140:glusterd_uuid_init] 0-management: retrieved UUID: 1650fc10-5365-40e3-8fea-1e87908a9f55
[2014-01-15 09:52:43.445290] E [glusterd-store.c:2600:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2014-01-15 09:52:43.445318] E [xlator.c:423:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2014-01-15 09:52:43.445335] E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
[2014-01-15 09:52:43.445345] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
[2014-01-15 09:52:43.445768] W [glusterfsd.c:1099:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x6b1) [0x4069c1] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x405177] (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x106) [0x405086]))) 0-: received signum (0), shutting down
[2014-01-15 10:12:06.572837] I [glusterfsd.c:2026:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.0.57rhs (/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)

</snip>
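
A quick way to confirm this failure mode on the node (a minimal sketch; the brick hostname below is a placeholder, not taken from this setup):

<snip>
# Check whether the host has a default route at all; with DEFROUTE=no
# on the only interface, this prints nothing
ip route show | grep default

# Try to resolve a brick hostname the same way glusterd would on restart
getent hosts rhss-node1.example.com
</snip>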

I changed DEFROUTE to yes in '/etc/sysconfig/network-scripts/ifcfg-rhevm' and restarted the network, and that solved the problem.

With that change in place, the RHSS Node comes back online after re-installation via the RHEVM UI.
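
For reference, the relevant change (a minimal sketch of the ifcfg file; every line except DEFROUTE is illustrative, since the rest of the file depends on the local network setup):

<snip>
# /etc/sysconfig/network-scripts/ifcfg-rhevm
DEVICE=rhevm
ONBOOT=yes
BOOTPROTO=dhcp     # illustrative
DEFROUTE=yes       # changed from "no" so this host gets a default gateway

# Apply the change (RHEL 6):
#   service network restart
</snip>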

Comment 6 Timothy Asir 2014-01-15 11:21:24 UTC
Patch sent to downstream: https://code.engineering.redhat.com/gerrit/#/c/18372/

Comment 7 Gowrishankar Rajaiyan 2014-01-15 12:54:25 UTC
Is this not a dupe of bug 1038038? I do not see a fix specifically for this bug; did I miss anything obvious?

Comment 8 Sahina Bose 2014-01-16 06:09:12 UTC
Yes, the fix for bug 1038038 fixes this as well.
Though the fix is the same, the test scenarios of the two bugs are different.

Comment 9 Gowrishankar Rajaiyan 2014-01-16 10:28:50 UTC
Thanks for confirming, Sahina. This test scenario is covered at https://tcms.engineering.redhat.com/run/107332/#caserun_4176587 and will be executed as part of the regression cycle.

Giving qa_ack- since there is no separate fix for this case.

Comment 10 RHEL Program Management 2014-01-16 10:46:43 UTC
Quality Engineering Management has reviewed and declined this request.
You may appeal this decision by reopening this request.
