Bug 1410543 - OCP deployment with gluster storage, times out waiting on nodes
Summary: OCP deployment with gluster storage, times out waiting on nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Quickstart Cloud Installer
Classification: Red Hat
Component: Installation - OpenShift
Version: 1.1
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 1.1
Assignee: dgao
QA Contact: Antonin Pagac
Docs Contact: Derek
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-01-05 17:17 UTC by James Olin Oden
Modified: 2017-02-28 01:43 UTC
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-28 01:43:14 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID: Red Hat Product Errata RHEA-2017:0335
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Quickstart Installer 1.1
Last Updated: 2017-02-28 06:36:13 UTC

Description James Olin Oden 2017-01-05 17:17:41 UTC
Description of problem:
I was doing an OCP deployment on RHV with Gluster storage, and the OCP portion of the deployment timed out with the message:

   ERF42-7017 [Foreman::Exception]: You've reached the timeout set for this action. If the action is still ongoing, you can click on the "Resume Deployment" button to continue.

The end of deployment.log read:

I, [2017-01-04T15:46:12.019216 #12105]  INFO -- : ====== Leaving OSE Launch run method ======
I, [2017-01-04T15:46:12.131886 #12105]  INFO -- : Waiting for the following hosts to begin provisioning...
I, [2017-01-04T15:46:12.137155 #12105]  INFO -- : ID: 95, Name: penultimate2-ose-master1.b.b
I, [2017-01-04T15:46:12.141839 #12105]  INFO -- : ID: 96, Name: penultimate2-ose-node1.b.b

I checked and both nodes were pingable, but I could not log into them as root, cloudsuite-user, or cloudsuite-install using the password I set in the UI.
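
For reference, the checks amounted to roughly the following (hostnames taken from the log above; the actual names and password vary per deployment):

   # Both nodes respond to ping:
   ping -c 3 penultimate2-ose-master1.b.b
   ping -c 3 penultimate2-ose-node1.b.b

   # ...but SSH login fails for every account tried:
   ssh root@penultimate2-ose-master1.b.b
   ssh cloudsuite-user@penultimate2-ose-master1.b.b
   ssh cloudsuite-install@penultimate2-ose-master1.b.b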

I spent some time troubleshooting this deployment but could not log into the nodes. There was a newer compose, so I opted to install it and see whether the problem is reproducible there.


Version-Release number of selected component (if applicable):
QCI-1.1-RHEL-7-20161215.t.0

How reproducible:
First time seen

Steps to Reproduce:
1.   Do a RHV + OCP deployment

Actual results:
Got the timeout mentioned above.

Expected results:
Did not expect to see the timeout.

Comment 2 James Olin Oden 2017-01-06 14:07:54 UTC
Yesterday I installed what was then the latest compose:

   QCI-1.1-RHEL-7-20170104.t.0

Re-ran the test and it failed in the same way. I'm going to try the same deployment without Gluster and see whether it still fails.

Comment 4 James Olin Oden 2017-01-06 18:35:51 UTC
I just did the same deployment using NFS rather than Gluster, and it worked just fine, so I think it's a Gluster issue. I'm able to mount all the Gluster volumes (getting through the RHV portion of the deployment implies this), so I believe my setup is correct, but using Gluster somehow seems to slow things down enough to hit the timeout. That is at least what the failed versus successful deployments indicate.
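
A manual mount is a quick way to confirm the volumes are reachable; a rough sketch (the server and volume names below are placeholders, not the ones from this deployment):

   # Mount one of the Gluster volumes by hand (requires glusterfs-fuse):
   mkdir -p /mnt/gluster-test
   mount -t glusterfs gluster.example.com:/ocp-volume /mnt/gluster-test
   # Verify the mount is writable, then clean up:
   touch /mnt/gluster-test/.write-test && rm /mnt/gluster-test/.write-test
   umount /mnt/gluster-test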

Comment 5 dgao 2017-01-16 16:39:21 UTC
Dev is not able to reproduce this issue. Please retest and provide access to the deployment environment if the error reported in comment #1 occurs again.

Comment 6 Antonin Pagac 2017-01-26 09:40:13 UTC
Couldn't reproduce with QCI-1.1-RHEL-7-20170123.t.0.

Deployed OCP with 1 master and 1 node on RHV (engine plus 3 hypervisors on hardware machines), using Gluster for both RHV and OCP storage. The deployment completed successfully with no errors.

Comment 7 Antonin Pagac 2017-02-07 16:24:25 UTC
Not seeing this with QCI-1.1-RHEL-7-20170203.t.0.

Marking as verified.

Comment 9 errata-xmlrpc 2017-02-28 01:43:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:0335

