Bug 1053330 - [RHEVM-RHS] RHSS Node doesn't come up after reinstalling it using RHEVM UI
Summary: [RHEVM-RHS] RHSS Node doesn't come up after reinstalling it using RHEVM UI
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: vdsm
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 2.1.2
Assignee: Timothy Asir
QA Contact: Sudhir D
URL:
Whiteboard: gluster
Depends On:
Blocks:
 
Reported: 2014-01-15 03:27 UTC by SATHEESARAN
Modified: 2015-05-13 16:31 UTC
CC List: 10 users

Fixed In Version: 4.13.0-24
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-16 10:46:43 UTC
Embargoed:


Attachments
ovirt host deploy log from RHEVM (189.94 KB, text/x-log)
2014-01-15 03:34 UTC, SATHEESARAN
RHEVM Screenshot showing "re-install" option (207.47 KB, image/png)
2014-01-15 03:54 UTC, SATHEESARAN

Description SATHEESARAN 2014-01-15 03:27:05 UTC
Description of problem:
-----------------------
Added an RHSS Node to a gluster-enabled cluster and it came online in the RHEVM UI.
After moving it to MAINTENANCE state and re-installing it using the RHEVM UI, the node does not come back online.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHSS  - glusterfs-3.4.0.57rhs-1.el6rhs
RHEVM - IS32 (3.3.0-0.45.el6ev)

How reproducible:
----------------
Happened in 2 out of 2 attempts


Steps to Reproduce:
------------------
1. Add the RHSS Node to a gluster-enabled cluster
2. Once the RHSS Node comes up in the RHEVM UI, put it into Maintenance state
3. Use the re-install option in the RHEVM UI


Actual results:
--------------
The RHSS Node never comes up in the RHEVM UI. It gives the error message: "Host 10.70.37.10 installation failed. Command returned failure code 1 during SSH session 'root.37.10'"


Expected results:
-----------------
The RHSS Node should come back online in the RHEVM UI


Additional info:
----------------
1. I hit this while testing the following use case:
a. Importing an existing gluster configuration doesn't set iptables rules on RHSS Nodes
https://bugzilla.redhat.com/show_bug.cgi?id=1051019
b. So I brought one of the RHSS Nodes to MAINTENANCE state using the RHEVM UI
c. After selecting that RHSS Node, the "General" tab shows the message: "Host is in maintenance mode, you can Activate it by pressing the Activate button. If you wish to upgrade or reinstall it click here."
d. Pressed the "here" link from the previous step, and re-installation began

But in the end the node did not come up and remained in DOWN state.

Comment 1 SATHEESARAN 2014-01-15 03:28:26 UTC
The VDSM host deploy logs show no errors, and I see the iptables rules are in place, though the state of the RHSS Node was shown as down.

This means bootstrapping has taken place again, adding those rules.
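
For reference, a quick way to confirm that bootstrapping re-applied the firewall rules (a minimal sketch; the exact rules to look for depend on the deployed firewall configuration):

<snip>
# List the current INPUT rules; the bootstrap-added gluster/vdsm rules
# should appear here after a successful re-install
iptables -L INPUT -n --line-numbers

# On RHEL 6, confirm the iptables service itself is running
service iptables status
</snip>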

Comment 2 SATHEESARAN 2014-01-15 03:34:45 UTC
Created attachment 850329 [details]
ovirt host deploy log from RHEVM

Comment 3 SATHEESARAN 2014-01-15 03:54:19 UTC
Created attachment 850331 [details]
RHEVM Screenshot showing "re-install" option

Comment 4 SATHEESARAN 2014-01-15 04:59:49 UTC
This bug is a manifestation of bug 1038038, https://bugzilla.redhat.com/show_bug.cgi?id=1038038

Since the gateway was not configured correctly, DNS resolution for the bricks was failing when glusterd was restarted.

This is evident from the glusterd logs:

<snip>
[2014-01-15 09:52:43.444950] I [glusterd.c:140:glusterd_uuid_init] 0-management: retrieved UUID: 1650fc10-5365-40e3-8fea-1e87908a9f55
[2014-01-15 09:52:43.445290] E [glusterd-store.c:2600:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2014-01-15 09:52:43.445318] E [xlator.c:423:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2014-01-15 09:52:43.445335] E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
[2014-01-15 09:52:43.445345] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
[2014-01-15 09:52:43.445768] W [glusterfsd.c:1099:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x6b1) [0x4069c1] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x405177] (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x106) [0x405086]))) 0-: received signum (0), shutting down
[2014-01-15 10:12:06.572837] I [glusterfsd.c:2026:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.0.57rhs (/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)

</snip>
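
A quick way to confirm this failure mode on the node (a minimal sketch; the brick hostname below is a placeholder, not taken from this setup):

<snip>
# Check whether the host has a default route at all; with DEFROUTE=no
# on the only interface, this prints nothing
ip route show | grep default

# Try to resolve a brick hostname the same way glusterd would on restart
getent hosts rhss-node1.example.com
</snip>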

I changed DEFROUTE to yes in '/etc/sysconfig/network-scripts/ifcfg-rhevm' and restarted the network, and that solved the problem.

With that change in place, the RHSS Node comes back online after re-installation via the RHEVM UI.
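
For reference, the relevant change (a minimal sketch of the ifcfg file; every line except DEFROUTE is illustrative, since the rest of the file depends on the local network setup):

<snip>
# /etc/sysconfig/network-scripts/ifcfg-rhevm
DEVICE=rhevm
ONBOOT=yes
BOOTPROTO=dhcp     # illustrative
DEFROUTE=yes       # changed from "no" so this host gets a default gateway

# Apply the change (RHEL 6):
#   service network restart
</snip>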

Comment 6 Timothy Asir 2014-01-15 11:21:24 UTC
Patch sent to downstream: https://code.engineering.redhat.com/gerrit/#/c/18372/

Comment 7 Gowrishankar Rajaiyan 2014-01-15 12:54:25 UTC
Is this not a dupe of bug 1038038? I do not see a fix specifically for this bug; did I miss anything obvious?

Comment 8 Sahina Bose 2014-01-16 06:09:12 UTC
Yes, the fix for bug 1038038 fixes this as well.
Though the fix is the same, the test scenarios of the two bugs are different.

Comment 9 Gowrishankar Rajaiyan 2014-01-16 10:28:50 UTC
Thanks for confirming, Sahina. This test scenario is covered at https://tcms.engineering.redhat.com/run/107332/#caserun_4176587 and will be executed as part of the regression cycle.

Giving qa_ack- since there is no separate fix for this case.

Comment 10 RHEL Program Management 2014-01-16 10:46:43 UTC
Quality Engineering Management has reviewed and declined this request.
You may appeal this decision by reopening this request.
