Description of problem:
I was doing an OCP deployment on RHV using Gluster, and the deployment timed out while deploying OCP with the message:

ERF42-7017 [Foreman::Exception]: You've reached the timeout set for this action. If the action is still ongoing, you can click on the "Resume Deployment" button to continue.

At the end of the deployment.log it had:

I, [2017-01-04T15:46:12.019216 #12105] INFO -- : ====== Leaving OSE Launch run method ======
I, [2017-01-04T15:46:12.131886 #12105] INFO -- : Waiting for the following hosts to begin provisioning...
I, [2017-01-04T15:46:12.137155 #12105] INFO -- : ID: 95, Name: penultimate2-ose-master1.b.b
I, [2017-01-04T15:46:12.141839 #12105] INFO -- : ID: 96, Name: penultimate2-ose-node1.b.b

I checked and both nodes were pingable, but I could not log into them, neither as root, cloudsuite-user, nor cloudsuite-install, using the password I set in the UI. I spent some time troubleshooting this deployment but could not get logged into the nodes. There was a newer compose available, so I opted to install it and see whether this is reproducible there.

Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20161215.t.0

How reproducible:
First time seen

Steps to Reproduce:
1. Do an RHV + OCP deployment

Actual results:
Got the timeout mentioned above.

Expected results:
Did not expect to see the timeout.
Yesterday I installed what was then the latest compose, QCI-1.1-RHEL-7-20170104.t.0, re-ran the test, and it failed in the same way. I'm going to try the deployment without Gluster and see if it still fails the same way.
I just ran the same deployment using NFS rather than Gluster, and it worked just fine, so I think it's a Gluster issue. I'm able to mount all the Gluster volumes (getting through the RHV portion of the deployment implies this), so my setup appears correct, but using Gluster somehow seems to slow things down enough to hit the timeout. That is at least what the failed versus successful deployments indicate.
Dev is unable to reproduce this issue. Please retest, and if the error reported in comment #1 recurs, provide access to the deployment environment.
Couldn't reproduce with QCI-1.1-RHEL-7-20170123.t.0. Deployed OCP (1 master, 1 node) on RHV, with the engine plus 3 hypervisors on hardware machines, using Gluster for both RHV and OCP storage. The deployment completed successfully with no errors.
Not seeing this with QCI-1.1-RHEL-7-20170203.t.0. Marking as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:0335