Bug 1505424
| Summary: | [Splitstack] Overcloud is not functional after the deployment due |
|---|---|
| Product: | Red Hat OpenStack |
| Component: | openstack-tripleo-heat-templates |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | high |
| Version: | 12.0 (Pike) |
| Target Milestone: | beta |
| Target Release: | 12.0 (Pike) |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Reporter: | Gurenko Alex <agurenko> |
| Assignee: | Martin André <maandre> |
| QA Contact: | Gurenko Alex <agurenko> |
| CC: | agurenko, dprince, jcoufal, jjoyce, jschluet, jslagle, m.andre, mburns, ohochman, rhel-osp-director-maint, sbaker |
| Keywords: | Triaged |
| Fixed In Version: | openstack-tripleo-heat-templates-7.0.3-6.el7ost |
| Doc Type: | If docs needed, set a value |
| Type: | Bug |
| Last Closed: | 2017-12-13 22:18:18 UTC |
| Bug Depends On: | 1501852 |
Description
Gurenko Alex
2017-10-23 14:36:52 UTC
I see that on controller-0 the local docker daemon is not running:

    Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]: "2017-10-23 14:53:24,268 WARNING: 15741 -- retrying pulling image: 192.168.24.1:8787/rhosp12/openstack-memcached-docker:20171017.1",
    Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]: "2017-10-23 14:53:24,282 WARNING: 15740 -- docker pull failed: Cannot connect to the Docker daemon. Is the docker daemon running on this host?",

What is the expectation for the docker service prior to the deployment? Should it be running or not? We recommended disabling it first due to:

https://bugzilla.redhat.com/show_bug.cgi?id=1503021

Also note that the stack went to CREATE_COMPLETE even though nothing got deployed on the overcloud. It seems paunch and/or heat-config-ansible is not properly signaling a failed deployment back to Heat (a wrong exit code is probably being used somewhere). The deployment definitely should have failed, since nothing got deployed.
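The log pattern above can be read as a retry loop around image pulls: when the docker daemon is down, every attempt fails with "Cannot connect to the Docker daemon", and the loop only delays the failure. The sketch below is illustrative only, assuming hypothetical names (`pull_with_retries`, `pull_image`); it is not the actual paunch or os-collect-config code. The point it demonstrates is that the final `False` must be propagated as a deployment failure rather than swallowed.

```python
def pull_with_retries(pull_image, image, retries=3):
    """Attempt to pull an image, retrying on failure.

    Returns True on success, False once all retries are exhausted.
    `pull_image` is a hypothetical callable standing in for the real
    `docker pull` invocation.
    """
    for attempt in range(1, retries + 1):
        if pull_image(image):
            return True
        print(f"WARNING: retrying pulling image: {image} (attempt {attempt})")
    return False


# With a dead daemon, every attempt fails, so the result is False.
# The bug described here is that a False result like this was not
# surfaced to Heat, so the stack still reached CREATE_COMPLETE.
always_fail = lambda image: False
ok = pull_with_retries(
    always_fail,
    "192.168.24.1:8787/rhosp12/openstack-memcached-docker:20171017.1",
)
print("pull succeeded" if ok else "pull failed")
```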
> Also note that the stack went to create_complete even though nothing got
> deployed on the overcloud. It seems paunch and/or heat-config-ansible is not
> properly signaling a failed deployment back to Heat (wrong exit code getting
> used somewhere probably). The deployment definitely should have been failed
> since nothing got deployed.
Alex, can you file a new bug for this issue? I think it needs to be tracked separately. It's also for DFG:Containers.
(In reply to James Slagle from comment #2)
> i see that on controller-0, the local docker daemon is not running:
>
> Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]:
> "2017-10-23 14:53:24,268 WARNING: 15741 -- retrying pulling image:
> 192.168.24.1:8787/rhosp12/openstack-memcached-docker:20171017.1",
> Oct 23 14:53:52 controller-0.redhat.local os-collect-config[10452]:
> "2017-10-23 14:53:24,282 WARNING: 15740 -- docker pull failed: Cannot
> connect to the Docker daemon. Is the docker daemon running on this host?",
>
> What is the expectation of the docker service prior to the deployment?
> Should it be running or not? We recommended to disable it first due to:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1503021

Need input from DFG:Containers on what/how to bootstrap the docker service appropriately, taking into consideration this bug and bug 1503021.

(In reply to James Slagle from comment #3)
> > Also note that the stack went to create_complete even though nothing got
> > deployed on the overcloud. It seems paunch and/or heat-config-ansible is not
> > properly signaling a failed deployment back to Heat (wrong exit code getting
> > used somewhere probably). The deployment definitely should have been failed
> > since nothing got deployed.
>
> Alex, can you file a new bug for this issue? I think it needs to be tracked
> separately. It's also for DFG:Containers.

Here is the BZ opened for that issue, with logs attached: https://bugzilla.redhat.com/show_bug.cgi?id=1505495

It sounds like we could be missing a signal in the case where paunch fails to configure a service correctly. I will sync with Steve Baker and see if we have any ideas on this.

I've commented on bug 1505495; I think it is docker-puppet.py not handling puppet exit codes correctly.

Marking this as depends on for bug 1501852.
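The suspected docker-puppet.py problem above concerns puppet's exit-code convention: with `puppet apply --detailed-exitcodes`, 0 means no changes and 2 means changes were applied (both successes), while 1, 4 (failures), and 6 (changes with failures) indicate errors. A wrapper that treats any non-zero code, or only a non-2 code, as its single success/failure signal gets this wrong. The sketch below is a minimal illustration of the correct mapping, not the actual docker-puppet.py code; the function name is hypothetical.

```python
# Exit codes from `puppet apply --detailed-exitcodes`:
#   0 = run succeeded, no changes
#   2 = run succeeded, changes were applied
#   1 = general failure
#   4 = run succeeded, but some resources failed
#   6 = changes were applied AND some resources failed
PUPPET_SUCCESS_CODES = {0, 2}


def puppet_run_failed(exit_code: int) -> bool:
    """True if a --detailed-exitcodes run should fail the deployment.

    Naively checking `exit_code != 0` would wrongly fail on 2, and
    naively checking `exit_code == 1` would wrongly pass on 4 and 6;
    only 0 and 2 are successes.
    """
    return exit_code not in PUPPET_SUCCESS_CODES
```

If the wrapper returns success for codes 4 or 6, the failure never propagates to Heat, and the stack can reach CREATE_COMPLETE with nothing actually deployed, which matches the behavior reported in this bug.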
I think the real issue being described here is that the deployment finished but in fact should not have, because some of the containers (keystone in this example) were not deployed. There is an actual issue here with deployment, but my suspicion is that we are fixing it in the docker bootstrapping work for split stack. Perhaps related to bug 1503021.

Marking as ON_DEV, as the --detailed-exitcodes patch has been proposed upstream: https://review.openstack.org/#/c/511509/

*** Bug 1505495 has been marked as a duplicate of this bug. ***

https://review.openstack.org/#/c/517022/ merged in stable/pike. The 1+1 topology is now deployable with split stack.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462