Bug 1505424 - [Splitstack] Overcloud is not functional after the deployment due
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: Unspecified Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 12.0 (Pike)
Assigned To: Martin André
QA Contact: Gurenko Alex
Keywords: Triaged
Duplicates: 1505495
Depends On: 1501852
Blocks:
Reported: 2017-10-23 10:36 EDT by Gurenko Alex
Modified: 2018-02-05 14:15 EST
CC List: 11 users

See Also:
Fixed In Version: openstack-tripleo-heat-templates-7.0.3-6.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-13 17:18:18 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
overcloudrc (905 bytes, text/plain)
2017-10-23 10:36 EDT, Gurenko Alex


External Trackers
Tracker ID Priority Status Summary Last Updated
OpenStack gerrit 511509 None None None 2017-10-27 09:55 EDT
OpenStack gerrit 517022 None None None 2017-11-01 19:11 EDT
Red Hat Product Errata RHEA-2017:3462 normal SHIPPED_LIVE Red Hat OpenStack Platform 12.0 Enhancement Advisory 2018-02-15 20:43:25 EST

Description Gurenko Alex 2017-10-23 10:36:52 EDT
Created attachment 1342220
overcloudrc

Description of problem: After a split stack deployment of 1 compute and 1 controller completed with CREATE_COMPLETE, overcloud commands return an error.


Version-Release number of selected component (if applicable): build 2017-10-17.2


How reproducible:


Steps to Reproduce:
1. Deploy split stack with 1 compute, 1 controller
2. source overcloudrc
3. Run openstack catalog list

Actual results:

[stack@undercloud-0 ~]$ openstack catalog list
Failed to discover available identity versions when contacting http://192.168.25.28:5000/v2.0. Attempting to parse version from URL.
Unable to establish connection to http://192.168.25.28:5000/v2.0/tokens: HTTPConnectionPool(host='192.168.25.28', port=5000): Max retries exceeded with url: /v2.0/tokens (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x38e2890>: Failed to establish a new connection: [Errno 113] No route to host',))


Expected results:

The catalog is printed for the overcloud.


Additional info:
overcloudrc contains the following line:

export OS_AUTH_URL=http://192.168.25.28:5000/v2.0

but the overcloud controller has an IP of 192.168.25.25 and the compute node has 192.168.25.23.
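
The comparison above can be done mechanically. This is a minimal sketch, assuming the overcloudrc contents are available as a string and the node addresses are already known; auth_url_host and auth_host_matches_node are hypothetical helper names, not part of any OpenStack tool. A mismatch is only a hint, since the auth URL may legitimately be a virtual IP that is not any node's own address.

```python
from urllib.parse import urlparse


def auth_url_host(overcloudrc_text):
    """Return the host from the OS_AUTH_URL export line, or None if absent."""
    for line in overcloudrc_text.splitlines():
        line = line.strip()
        if line.startswith("export OS_AUTH_URL="):
            # Drop the variable name and any surrounding quotes.
            url = line.split("=", 1)[1].strip().strip("'\"")
            return urlparse(url).hostname
    return None


def auth_host_matches_node(overcloudrc_text, node_ips):
    """True if the auth URL host is one of the given node addresses."""
    return auth_url_host(overcloudrc_text) in set(node_ips)


# Values taken from this report.
rc = "export OS_AUTH_URL=http://192.168.25.28:5000/v2.0\n"
nodes = ["192.168.25.25", "192.168.25.23"]
print(auth_url_host(rc))
print(auth_host_matches_node(rc, nodes))
```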
Comment 2 James Slagle 2017-10-23 12:28:14 EDT
I see that on controller-0 the local docker daemon is not running:

Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]: "2017-10-23 14:53:24,268 WARNING: 15741 -- retrying pulling image: 192.168.24.1:8787/rhosp12/openstack-memcached-docker:20171017.1",
Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]: "2017-10-23 14:53:24,282 WARNING: 15740 -- docker pull failed: Cannot connect to the Docker daemon. Is the docker daemon running on this host?",

What is the expected state of the docker service prior to the deployment? Should it be running or not? We recommended disabling it first because of:

https://bugzilla.redhat.com/show_bug.cgi?id=1503021
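
Whatever the expected state turns out to be, it can be checked before deploying. The sketch below is an illustration only, assuming a systemd host where the service unit is named docker; docker_is_active is a hypothetical helper, and the run parameter exists only so the check can be exercised without systemd.

```python
import subprocess


def docker_is_active(run=subprocess.run):
    """Return True if systemd reports the docker unit as active."""
    try:
        # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
        result = run(["systemctl", "is-active", "--quiet", "docker"])
    except FileNotFoundError:
        # systemctl is not installed on this host, so the state is unknown;
        # treat that as "not active" for the purposes of this check.
        return False
    return result.returncode == 0
```

Calling docker_is_active() on each overcloud node before starting the deployment would show whether the daemon state matches the expectation.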

Also note that the stack went to CREATE_COMPLETE even though nothing got deployed on the overcloud. It seems paunch and/or heat-config-ansible is not properly signaling a failed deployment back to Heat (probably a wrong exit code being used somewhere). The deployment definitely should have failed, since nothing got deployed.
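
The failure mode described here, a step failing while the overall run still reports success, can be illustrated with a small sketch. It is not paunch or heat-config-ansible code; run_steps and the sample commands are hypothetical, and it only shows the general idea of propagating a non-zero exit code instead of discarding it.

```python
import subprocess
import sys


def run_steps(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop and report the failure rather than continuing silently.
            return result.returncode
    return 0


# Stand-in commands: one that succeeds and one that exits with status 3.
ok = [sys.executable, "-c", "pass"]
bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
print(run_steps([ok, ok]))
print(run_steps([ok, bad, ok]))
```

A caller would pass the returned value to sys.exit so that whatever launched the steps sees the failure.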
Comment 3 James Slagle 2017-10-23 12:29:12 EDT
> Also note that the stack went to create_complete even though nothing got
> deployed on the overcloud. It seems paunch and/or heat-config-ansible is not
> properly signaling a failed deployment back to Heat (wrong exit code getting
> used somewhere probably). The deployment definitely should have been failed
> since nothing got deployed.

Alex, can you file a new bug for this issue? I think it needs to be tracked separately. It's also for DFG:Containers.
Comment 4 James Slagle 2017-10-23 12:30:59 EDT
(In reply to James Slagle from comment #2)
> i see that on controller-0, the local docker daemon is not running:
> 
> Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]:
> "2017-10-23 14:53:24,268 WARNING: 15741 -- retrying pulling image:
> 192.168.24.1:8787/rhosp12/openstack-memcached-docker:20171017.1",
> Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]:
> "2017-10-23 14:53:24,282 WARNING: 15740 -- docker pull failed: Cannot
> connect to the Docker daemon. Is the docker daemon running on this host?",
> 
> What is the expectation of the docker service prior to the deployment?
> Should it be running or not? We recommended to disable it first due to:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1503021

We need input from DFG:Containers on how to bootstrap the docker service appropriately, taking into consideration this bug and bug 1503021.
Comment 5 Gurenko Alex 2017-10-24 11:41:27 EDT
(In reply to James Slagle from comment #3)
> > Also note that the stack went to create_complete even though nothing got
> > deployed on the overcloud. It seems paunch and/or heat-config-ansible is not
> > properly signaling a failed deployment back to Heat (wrong exit code getting
> > used somewhere probably). The deployment definitely should have been failed
> > since nothing got deployed.
> 
> Alex, can you file a new bug for this issue? I think it needs to be tracked
> separately. It's also for DFG:Containers.

Here is the BZ opened for that issue, with logs attached: https://bugzilla.redhat.com/show_bug.cgi?id=1505495
Comment 6 Dan Prince 2017-10-25 16:40:36 EDT
It sounds like we could be missing a signal in the case where paunch fails to configure a service correctly. I will sync with Steve Baker and see if we have any ideas on this.
Comment 7 Steve Baker 2017-10-25 17:27:27 EDT
I've commented on bug 1505495. I think the cause is docker-puppet.py not handling puppet exit codes correctly.
Comment 8 Dan Prince 2017-10-27 09:53:50 EDT
Marking this as depending on bug 1501852. I think the real issue being described here is that the deployment finished when in fact it should not have, because some of the containers (keystone in this example) were not deployed.

There is an actual deployment issue here as well, but my suspicion is that we are fixing it in the bug about docker bootstrapping with split stack. It is perhaps related to bug 1503021.
Comment 9 Dan Prince 2017-10-27 09:55:28 EDT
Marking as ON_DEV, as the upstream --detailed-exitcodes patch has been proposed:

https://review.openstack.org/#/c/511509/
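
For context, puppet's --detailed-exitcodes option changes what the exit status means: 0 is a run with no changes, 2 means changes were applied, 4 means failures occurred, 6 means both changes and failures, and 1 means the run itself failed. A wrapper that treats only 0 as success would wrongly fail on 2, and one that ignores the code would wrongly pass on 4 or 6. The sketch below is not the code from the proposed patch; puppet_run_succeeded is a hypothetical helper showing one way to interpret these codes.

```python
def puppet_run_succeeded(exit_code):
    """Interpret a `puppet apply --detailed-exitcodes` exit status."""
    # 0: no changes, 2: changes applied successfully.
    # 1: run failed, 4: failures, 6: changes and failures.
    return exit_code in (0, 2)


for code in (0, 1, 2, 4, 6):
    print(code, puppet_run_succeeded(code))
```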
Comment 10 Omri Hochman 2017-10-27 09:59:51 EDT
*** Bug 1505495 has been marked as a duplicate of this bug. ***
Comment 12 Martin André 2017-11-10 05:07:40 EST
https://review.openstack.org/#/c/517022/ merged in stable/pike.
Comment 13 Gurenko Alex 2017-11-15 10:14:25 EST
The 1+1 topology is now deployable with split stack.
Comment 17 errata-xmlrpc 2017-12-13 17:18:18 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
