Bug 1545085 - Virtual setup does not host on RHEL 7.5 as a result OSP13 deployment failed
Summary: Virtual setup does not host on RHEL 7.5 as a result OSP13 deployment failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Dan Macpherson
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-02-14 09:19 UTC by Eran Kuris
Modified: 2019-08-27 07:47 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-27 07:47:04 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (18.29 MB, application/zip)
2018-02-14 09:19 UTC, Eran Kuris

Description Eran Kuris 2018-02-14 09:19:09 UTC
Created attachment 1395815
logs

Description of problem:
Tried to deploy an OSP13 OVN HA setup (3 controllers, 2 computes) using puddle 2018-01-26.3, and it failed:

with Stdlib::Compat::Numeric. There is further documentation for validate_legacy function in the README. at [\"/etc/puppet/modules/ntp/manifests/init.pp\", 76]:[\"/etc/puppet/modules/tripleo/manifests/profile/base/time/ntp.pp\", 29]", 
            "                    with Stdlib::Compat::Hash. There is further documentation for validate_legacy function in the README. at [\"/etc/puppet/modules/ssh/manifests/server.pp\", 12]:[\"/var/lib/tripleo-config/puppet_step_config.pp\", 41]", 
            "Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, Duplicate declaration: Tripleo::Firewall::Rule[118 neutron vxlan networks] is already declared in file /etc/puppet/modules/tripleo/manifests/firewall/service_rules.pp:41; cannot redeclare at /etc/puppet/modules/tripleo/manifests/firewall/service_rules.pp:41 at /etc/puppet/modules/tripleo/manifests/firewall/service_rules.pp:41:3  at /etc/puppet/modules/tripleo/manifests/firewall.pp:105 on node controller-2.localdomain"

Version-Release number of selected component (if applicable):
osp13-ovn, puddle 2018-01-26.3
rpm -qa |grep triple
openstack-tripleo-heat-templates-8.0.0-0.20180122224016.el7ost.noarch
openstack-tripleo-ui-8.1.1-0.20180122135122.aef02d8.el7ost.noarch
openstack-tripleo-common-containers-8.3.1-0.20180123050218.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-0.20180117092204.120eca8.el7ost.noarch
python-tripleoclient-9.0.1-0.20180119233147.el7ost.noarch
ansible-tripleo-ipsec-0.0.1-0.20180119094817.5e80d4f.el7ost.noarch
openstack-tripleo-image-elements-8.0.0-0.20180117094122.02d0985.el7ost.noarch
puppet-tripleo-8.2.0-0.20180122224519.9fd3379.el7ost.noarch
openstack-tripleo-common-8.3.1-0.20180123050218.el7ost.noarch
openstack-tripleo-validations-8.1.1-0.20180119231917.2ff3c79.el7ost.noarch
[stack@undercloud-0 ~]$ rpm -qa |grep ovn
puppet-ovn-12.2.0-0.20180119084039.2a9b3ad.el7ost.noarch
[stack@undercloud-0 ~]$ rpm -qa |grep neutron
openstack-neutron-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
puppet-neutron-12.2.0-0.20180123133228.7428f68.el7ost.noarch
python2-neutron-lib-1.12.0-0.20180112024322.cd07c7b.el7ost.noarch
openstack-neutron-common-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-ml2-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python-neutron-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-neutron-openvswitch-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python2-neutronclient-6.6.0-0.20171215103134.50b5b29.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Run a deployment of an OSP13 OVN HA setup with 3 controllers and 2 computes.

Actual results:
The deployment fails.

Expected results:
The deployment passes.

Additional info:
Attaching the full logs and setup details.

Comment 12 Eran Kuris 2018-03-07 13:45:15 UTC
Used the Infrared master branch with puddle 2018-03-02.2.

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| 11aa36ed-e76b-4f2c-898c-87102f80a2b2 | compute-0    | 54df5598-fd3e-406f-9a24-718ccf4461ca | power on    | active             | False       |
| e6ab1878-0077-4165-9d97-92806d72c99d | compute-1    | None                                 | power off   | available          | False       |
| ba1a34a1-6b43-4902-9b76-61d9bd798f2d | controller-0 | 9c23a2fe-a43c-4022-a7a0-2390202a0a89 | power on    | active             | False       |
| 6a266f1e-76aa-426e-b71f-3895863db9eb | controller-1 | None                                 | power off   | available          | False       |
| 335c35c8-e26d-47bd-91d8-9461f2df562c | controller-2 | None                                 | power off   | available          | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
(undercloud) [stack@undercloud-0 ~]$ nova list 
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 54df5598-fd3e-406f-9a24-718ccf4461ca | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| c18b9191-a53b-47db-90e0-9dc896b01ca0 | compute-1    | BUILD  | scheduling | NOSTATE     |                        |
| 92db40a9-baee-450c-b49a-8494d1c92a57 | controller-0 | BUILD  | scheduling | NOSTATE     |                        |
| cf1ddeae-6817-4daa-b436-7298fdb442b9 | controller-1 | ERROR  | -          | NOSTATE     |                        |
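| 9c23a2fe-a43c-4022-a7a0-2390202a0a89 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+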

overcloud_install.log:
2018-03-07 13:26:17Z [overcloud.Controller.2.ControllerConfig]: CREATE_IN_PROGRESS  state changed
2018-03-07 13:26:17Z [overcloud.Controller.2.ControllerConfig]: CREATE_COMPLETE  state changed
2018-03-07 13:26:21Z [overcloud.Controller.1.Controller]: CREATE_FAILED  ResourceInError: resources.Controller: Went to status ERROR due to "Message: Build of instance cf1ddeae-6817-4daa-b436-7298fdb442b9 aborted: Failure prepping block device., Code: 500"
2018-03-07 13:26:21Z [overcloud.Controller.1]: CREATE_FAILED  Resource CREATE failed: ResourceInError: resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 30be2483-44c9-466a-9491-4619fda32970., Code: 50
2018-03-07 13:26:22Z [overcloud.Controller.1]: CREATE_FAILED  ResourceInError: resources[1].resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 30be2483-44c9-466a-9491-4619fda32970., Code: 500"
Heat Stack create failed.
Heat Stack create failed.
2018-03-07 13:26:22Z [overcloud.Controller]: UPDATE_FAILED  Resource CREATE failed: ResourceInError: resources[1].resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 30be2483-44c9-466a-9491-4619fda329
2018-03-07 13:26:23Z [overcloud.Controller]: CREATE_FAILED  resources.Controller: Resource CREATE failed: ResourceInError: resources[1].resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 30be2483-44c
2018-03-07 13:26:23Z [overcloud]: CREATE_FAILED  Resource CREATE failed: resources.Controller: Resource CREATE failed: ResourceInError: resources[1].resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures f

 Stack overcloud CREATE_FAILED

overcloud.Controller.1.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: cf1ddeae-6817-4daa-b436-7298fdb442b9
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: Build of instance cf1ddeae-6817-4daa-b436-7298fdb442b9 aborted: Failure prepping block device., Code: 500"

Comment 13 Numan Siddique 2018-03-08 07:40:06 UTC
I don't think this bug is caused by OVN. It looks like the problem is in other components, as the error messages show:

****
2018-03-07 13:26:21Z [overcloud.Controller.1.Controller]: CREATE_FAILED  ResourceInError: resources.Controller: Went to status ERROR due to "Message: Build of instance cf1ddeae-6817-4daa-b436-7298fdb442b9 aborted: Failure prepping block device., Code: 500"
...
****

Comment 19 Roee Agiman 2018-04-11 13:36:53 UTC
Hey.
Re-opening because related issues have come up again recently.
This is probably not a duplicate, as was suggested earlier.

https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/OSPD-Customized-Deployment-virt/3514/

Take a look at this customized job. It fails in the overcloud step for the same reasons that should have been fixed here.

Console Output:

13:42:58 TASK [get ironic info for the node] ********************************************
13:42:58 task path: /home/rhos-ci/jenkins/workspace/OSPD-Customized-Deployment-virt@2/infrared/plugins/tripleo-overcloud/tasks/add_overcloud_host.yml:2
13:42:58 fatal: [undercloud-0]: FAILED! => {
13:42:58     "changed": true, 
13:42:58     "cmd": "source ~/stackrc\n openstack baremetal node show  -c name -f value", 
13:42:58     "delta": "0:00:02.120858", 
13:42:58     "end": "2018-04-09 09:43:11.794658", 
13:42:58     "failed": true, 
13:42:58     "rc": 2, 
13:42:58     "start": "2018-04-09 09:43:09.673800"
13:42:58 }
13:42:58 
13:42:58 STDERR:
13:42:58 
13:42:58 usage: openstack baremetal node show [-h] [-f {json,shell,table,value,yaml}]
13:42:58                                      [-c COLUMN] [--max-width <integer>]
13:42:58                                      [--fit-width] [--print-empty]
13:42:58                                      [--noindent] [--prefix PREFIX]
13:42:58                                      [--instance]
13:42:58                                      [--fields <field> [<field> ...]]
13:42:58                                      <node>
13:42:58 openstack baremetal node show: error: too few arguments
13:42:58 
13:42:58 
13:42:58 MSG:
13:42:58 
13:42:58 non-zero return code

Please contact me if any further information is needed.

Comment 20 Bob Fournier 2018-04-12 16:29:34 UTC
The command "openstack baremetal node show" requires a node (name or UUID). It looks like the script is not supplying one, so the error is expected.

I'm moving this bug back to its previous state, as this is unrelated to the issues discussed earlier. If there is still a problem after changing the command to provide a node, please open a separate bug.

*** This bug has been marked as a duplicate of bug 1551603 ***
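
For illustration, a minimal sketch of the guard described in this comment: build the command only when a node name or UUID is actually present. The function name and the fail-fast behaviour are assumptions made for this example and are not taken from the Infrared task; the `-c name -f value` options are copied from the failing command in the console output above.

```python
import shlex


def build_node_show_command(node: str) -> str:
    """Build the `openstack baremetal node show` command for one node.

    `node` is a node name or UUID. This helper is hypothetical and only
    illustrates the check; it is not Infrared code.
    """
    node = node.strip()
    if not node:
        # An empty value would reproduce the "too few arguments" failure,
        # so stop before the command is ever run.
        raise ValueError("node name or UUID is required")
    # shlex.quote keeps an unusual node name from being split by the shell.
    return f"openstack baremetal node show {shlex.quote(node)} -c name -f value"


print(build_node_show_command("compute-0"))
```

Calling the helper with an empty or whitespace-only string raises ValueError instead of producing a command that the CLI rejects.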

Comment 24 Alex Schultz 2018-11-26 21:36:22 UTC
docker.yaml and docker-ha.yaml shouldn't need to be provided anymore, as they are the defaults. It appears that including them elsewhere can cause issues. I don't think this constitutes a blocker anymore, however.
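
As a rough sketch of what this means for a deployment script, the snippet below drops the two default files from a list of environment files before they are passed to the deploy command. The helper name and the example paths are hypothetical, and matching by base name only is an assumption made to keep the example short.

```python
import os

# Assumption: files are matched by base name only; their directory is not checked.
DEFAULT_ENV_FILES = {"docker.yaml", "docker-ha.yaml"}


def drop_default_env_files(env_files):
    """Return env_files in the same order, without files that are already defaults."""
    return [
        path for path in env_files
        if os.path.basename(path) not in DEFAULT_ENV_FILES
    ]


# Hypothetical paths, for illustration only.
requested = [
    "environments/docker.yaml",
    "environments/docker-ha.yaml",
    "custom/my-network.yaml",
    "custom/my-overrides.yaml",
]
kept = drop_default_env_files(requested)
print(kept)
```

The relative order of the remaining files is preserved, which matters because the order of environment files affects the result.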

Comment 26 Dan Macpherson 2019-03-18 23:14:45 UTC
Hi folks,

Is there anything required from a documentation perspective? Judging by Alex's comment #24, the issue appears to be the inclusion of the docker.yaml and docker-ha.yaml files, so this doesn't seem to be a docs issue. Can anyone confirm?

Eran -- were there any docs requirements you had?

Miguel -- when the BZ was changed to a docs BZ, were there any specific docs fixes you had in mind?

Comment 27 Eran Kuris 2019-03-19 06:24:33 UTC
(In reply to Dan Macpherson from comment #26)
> Hi folks,
> 
> Is there anything required from a documentation perspective. By the looks of
> Alex's comment in comment #24 it appears this issue is with the inclusion of
> the docker/yaml and docker-ha.yaml file. So it seems like this isn't a docs
> issue. Can any one confirm?
> 
> Eran -- were there any docs requirements you had?
> 
> Miguel -- when the BZ was changed to a docs BZ, were there any specific docs
> fixes you had in mind?

From my understanding, we just need to document that the order of the YAML files is important and, of course, that docker.yaml and docker-ha.yaml shouldn't need to be provided anymore, as they are the defaults.
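
To show why the order matters, here is a simplified model in which each environment file is reduced to a dictionary of parameters and later files override earlier ones. This is only a sketch: the real merge done by the director also covers other sections of an environment file, and the parameter values below are hypothetical.

```python
def merge_parameter_defaults(env_files):
    """Merge parameter dictionaries in order; later files override earlier ones."""
    merged = {}
    for params in env_files:
        merged.update(params)
    return merged


# Hypothetical values standing in for two environment files.
base = {"ControllerCount": 3, "ComputeCount": 1}
override = {"ComputeCount": 2}

print(merge_parameter_defaults([base, override]))
print(merge_parameter_defaults([override, base]))
```

Swapping the two files changes the resulting ComputeCount, which is the kind of difference the documentation on file order needs to warn about.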

Comment 29 Eran Kuris 2019-03-19 12:57:49 UTC
(In reply to Dan Macpherson from comment #28)
> I think we've got a section on environment file order already:
> 
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/director_installation_and_usage/chap-configuring_basic_overcloud_requirements_with_the_cli_tools#sect-Customizing_the_Overcloud
> 
> How does that look?

It looks OK, but I think we need to add that docker.yaml and docker-ha.yaml shouldn't need to be provided anymore, as they are the defaults.

