Bug 1394466

Summary: overcloud deployment failed
Product: Red Hat OpenStack
Reporter: Pratik Pravin Bandarkar <pbandark>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED NOTABUG
QA Contact: Omri Hochman <ohochman>
Severity: urgent
Docs Contact:
Priority: high
Version: 10.0 (Newton)
CC: aschultz, dbecker, jraju, mburns, mcornea, michele, morazi, rhel-osp-director-maint
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-14 16:01:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Pratik Pravin Bandarkar 2016-11-12 10:50:46 UTC
Description of problem:

The overcloud deployment command (with bonding + VLANs) reported success, but the actual deployment failed: no OpenStack service is configured on the overcloud nodes.

[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 83b24465-1d8a-4076-bdd6-17d10ff770d7 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.17 |
| 45a523c1-c19c-4f79-a73c-d4da24a0bb27 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.18 |
| 6b918408-9759-41ac-9d06-81723fc164c7 | overcloud-controller-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
| d6be5420-752a-4f2a-9031-e065b4e1df37 | overcloud-controller-2 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
 

 
[stack@instack ~]$ openstack overcloud deploy --compute-scale 1 --control-scale 3  --templates  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/nic-config/network-environment.yaml  --control-flavor control --compute-flavor compute --ntp-server  clock.redhat.com
 
Creating Swift container to store the plan
Creating plan from template files in: /tmp/tripleoclient-5Kw_qD/tripleo-heat-templates
Started Mistral Workflow. Execution ID: b892f9b9-f0b6-4a70-a666-1d1bcc1f0b26
Plan created
Deploying templates in the directory /tmp/tripleoclient-5Kw_qD/tripleo-heat-templates
Started Mistral Workflow. Execution ID: e27bab3b-f221-47da-a05b-f9e68218b62e
2016-11-12 11:34:54Z [overcloud]: CREATE_IN_PROGRESS  Stack CREATE started
 
 
[...]
 
2016-11-12 12:12:05Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2016-11-12 12:12:05Z [overcloud.AllNodesDeploySteps.BlockStorageExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2016-11-12 12:12:06Z [overcloud.AllNodesDeploySteps.ObjectStorageExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2016-11-12 12:12:06Z [overcloud.AllNodesDeploySteps.ComputeExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2016-11-12 12:12:06Z [overcloud.AllNodesDeploySteps.ControllerExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_COMPLETE  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps.BlockStorageExtraConfigPost]: CREATE_COMPLETE  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps.ObjectStorageExtraConfigPost]: CREATE_COMPLETE  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps.ComputeExtraConfigPost]: CREATE_COMPLETE  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps.ControllerExtraConfigPost]: CREATE_COMPLETE  state changed
2016-11-12 12:12:07Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  Stack CREATE completed successfully
2016-11-12 12:12:08Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  state changed
2016-11-12 12:12:08Z [overcloud]: CREATE_COMPLETE  Stack CREATE completed successfully
 
 Stack overcloud CREATE_COMPLETE
 
Started Mistral Workflow. Execution ID: 935fcfb5-e01f-43de-85f2-3576d19e429f
/home/stack/.ssh/known_hosts updated.
Original contents retained as /home/stack/.ssh/known_hosts.old
Overcloud Endpoint: http://10.11.48.22:5000/v2.0
Overcloud Deployed
[stack@instack ~]$
 
[stack@instack ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
+--------------------------------------+------------+-----------------+----------------------+--------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time |
+--------------------------------------+------------+-----------------+----------------------+--------------+
| 55c920e4-6a15-463f-a3a5-c6a10acb03bc | overcloud  | CREATE_COMPLETE | 2016-11-12T11:34:54Z | None         |
+--------------------------------------+------------+-----------------+----------------------+--------------+
 
 
 
[stack@instack ~]$ openstack stack resource list overcloud |grep -i fail
 
+++++++++++++++++++++++++++++++++++++
 
[stack@instack ~]$ ssh heat-admin@192.0.2.18
Last login: Sat Nov 12 12:22:37 2016 from 192.0.2.1
[heat-admin@overcloud-controller-0 ~]$ sudo su
[root@overcloud-controller-0 heat-admin]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: overcloud-controller-1 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Sat Nov 12 12:40:24 2016          Last change: Sat Nov 12 12:05:32 2016 by root via cibadmin on overcloud-controller-0
 
3 nodes and 19 resources configured
 
Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 
Full list of resources:
 
 ip-192.168.125.23      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-10.11.48.22 (ocf::heartbeat:IPaddr2):       Stopped
 ip-192.168.124.30      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
 Clone Set: haproxy-clone [haproxy]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-192.0.2.7   (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
 Master/Slave Set: galera-master [galera]
     Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 ip-192.168.124.29      (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
 ip-192.168.128.21      (ocf::heartbeat:IPaddr2):       Stopped
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ overcloud-controller-2 ]
     Slaves: [ overcloud-controller-0 overcloud-controller-1 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started overcloud-controller-1
 
Failed Actions:
* ip-10.11.48.22_start_0 on overcloud-controller-2 'unknown error' (1): call=94, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:24 2016', queued=0ms, exec=127ms
* ip-192.168.128.21_start_0 on overcloud-controller-2 'unknown error' (1): call=125, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:30 2016', queued=0ms, exec=77ms
* ip-10.11.48.22_start_0 on overcloud-controller-0 'unknown error' (1): call=102, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:24 2016', queued=0ms, exec=45ms
* ip-192.168.128.21_start_0 on overcloud-controller-0 'unknown error' (1): call=133, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:30 2016', queued=0ms, exec=47ms
* ip-10.11.48.22_start_0 on overcloud-controller-1 'unknown error' (1): call=103, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:22 2016', queued=1ms, exec=105ms
* ip-192.168.128.21_start_0 on overcloud-controller-1 'unknown error' (1): call=134, status=complete, exitreason='Unable to find nic or netmask.',
    last-rc-change='Sat Nov 12 12:24:29 2016', queued=0ms, exec=84ms
 
 
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Version-Release number of selected component (if applicable):
RHOS10

How reproducible:
100%

Steps to Reproduce:
1. Deploy an RHOS10 overcloud with bonding + VLANs


Actual results:
The overcloud deployment command reports success, but the actual overcloud deployment failed: no OpenStack service is configured on the overcloud nodes.


Expected results:
Overcloud deployment should be successful. 

Additional info:

Comment 8 Michele Baldessari 2017-11-14 16:01:04 UTC
The VIPs did not come up because the underlying network configuration does not have the right NICs/networks assigned.
For example on controller-0:
 more ip_addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
1: lo    inet6 ::1/128 scope host \       valid_lft forever preferred_lft forever
2: eth0    inet 192.0.2.18/24 brd 192.0.2.255 scope global eth0\       valid_lft forever preferred_lft forever
2: eth0    inet6 fe80::20a:83ff:fef8:8b57/64 scope link \       valid_lft forever preferred_lft forever
3: eth1    inet6 fe80::5054:ff:fe49:517e/64 scope link \       valid_lft forever preferred_lft forever
4: eth2    inet6 fe80::5054:ff:fea9:b171/64 scope link \       valid_lft forever preferred_lft forever
6: br-ex    inet6 fe80::882e:b9ff:fe86:144c/64 scope link \       valid_lft forever preferred_lft forever
7: vlan20    inet 192.168.125.24/24 brd 192.168.125.255 scope global vlan20\       valid_lft forever preferred_lft forever
7: vlan20    inet 192.168.125.23/32 brd 192.168.125.255 scope global vlan20\       valid_lft forever preferred_lft forever
7: vlan20    inet6 fe80::6c4f:fdff:fe6f:d4b0/64 scope link \       valid_lft forever preferred_lft forever
8: vlan30    inet 192.168.124.28/24 brd 192.168.124.255 scope global vlan30\       valid_lft forever preferred_lft forever
8: vlan30    inet 192.168.124.29/32 brd 192.168.124.255 scope global vlan30\       valid_lft forever preferred_lft forever
8: vlan30    inet6 fe80::30c3:ecff:fe63:e7da/64 scope link \       valid_lft forever preferred_lft forever
9: vlan50    inet 192.168.123.31/24 brd 192.168.123.255 scope global vlan50\       valid_lft forever preferred_lft forever
9: vlan50    inet6 fe80::8cda:9eff:fe2c:c502/64 scope link \       valid_lft forever preferred_lft forever


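There is no way that a VIP like 10.11.48.22 (failed action ip-10.11.48.22_start_0) can start on such a node: no interface carries an address in that VIP's network, so the IPaddr2 resource agent fails with 'Unable to find nic or netmask.' The same applies to 192.168.128.21.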
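
The check behind that conclusion can be sketched in a few lines of Python. This is only an illustration, not the IPaddr2 agent's actual lookup logic: it assumes the agent needs a local interface whose subnet contains the VIP, and it uses the IPv4 lines from the controller-0 listing above together with the VIPs from the pcs status output. The names local_networks and find_nic_for_vip are hypothetical helpers introduced here.

```python
import ipaddress

# IPv4 lines copied from the controller-0 `ip_addr` listing above
# (trailing "valid_lft ..." parts dropped for brevity).
IP_ADDR_OUTPUT = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.0.2.18/24 brd 192.0.2.255 scope global eth0
7: vlan20    inet 192.168.125.24/24 brd 192.168.125.255 scope global vlan20
7: vlan20    inet 192.168.125.23/32 brd 192.168.125.255 scope global vlan20
8: vlan30    inet 192.168.124.28/24 brd 192.168.124.255 scope global vlan30
8: vlan30    inet 192.168.124.29/32 brd 192.168.124.255 scope global vlan30
9: vlan50    inet 192.168.123.31/24 brd 192.168.123.255 scope global vlan50
"""

# VIPs listed as IPaddr2 resources in the pcs status output.
VIPS = [
    "192.168.125.23",
    "10.11.48.22",
    "192.168.124.30",
    "192.0.2.7",
    "192.168.124.29",
    "192.168.128.21",
]


def local_networks(ip_addr_output):
    """Return (interface, network) pairs for each IPv4 subnet in the listing."""
    networks = []
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Expected layout: "<index>: <ifname> inet <address>/<prefix> ..."
        if len(fields) < 4 or fields[2] != "inet":
            continue
        iface = ipaddress.ip_interface(fields[3])
        if iface.network.prefixlen == 32:
            # /32 entries are single host addresses (for example VIPs that
            # are already up); they do not describe a subnet.
            continue
        networks.append((fields[1], iface.network))
    return networks


def find_nic_for_vip(vip, networks):
    """Return the first interface whose subnet contains vip, or None."""
    address = ipaddress.ip_address(vip)
    for nic, network in networks:
        if address in network:
            return nic
    return None


if __name__ == "__main__":
    nets = local_networks(IP_ADDR_OUTPUT)
    for vip in VIPS:
        nic = find_nic_for_vip(vip, nets)
        print(f"{vip}: {nic if nic else 'no matching interface'}")
```

Under these assumptions, the only VIPs without a matching interface are 10.11.48.22 and 192.168.128.21, the same two resources that pcs status reports as Stopped.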

Marking as invalid as it is more of a misconfiguration.