Bug 1288550

Summary: IPs in external-lb.yaml do not match what is actually deployed in overcloud
Product: Red Hat OpenStack
Reporter: John Fulton <johfulto>
Component: rhosp-director
Assignee: Giulio Fidente <gfidente>
Status: CLOSED NOTABUG
QA Contact: yeylon <yeylon>
Severity: high
Priority: urgent
Version: 7.0 (Kilo)
CC: calfonso, clincoln, gfidente, jcoufal, johfulto, mburns, mcornea, rhel-osp-director-maint, srevivo
Target Milestone: y2
Target Release: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-12-07 21:53:03 UTC
Type: Bug

Description John Fulton 2015-12-04 14:37:44 UTC
I. Description of problem:

Heat on the undercloud reports that the overcloud was deployed
successfully with an external load balancer, as described in our
documentation [0]. However, the IPs deployed to the overcloud differ
from those in external-lb.yaml, and the resulting deployment does not
function correctly.

The effective external-lb.yaml [1] (taking its comments into account)
has the following mappings, with nothing for storage_mgmt or tenant:
 
    external:
    - 172.24.232.11
    - 172.24.232.12
    - 172.24.232.13
    internal_api:
    - 172.24.236.13
    - 172.24.236.14
    - 172.24.236.12
    storage:
    - 172.24.244.11
    - 172.24.244.13
    - 172.24.244.12

The sosreports, however, indicate that Director deployed the
following IPs for internal_api and storage:

    internal_api:
    - 172.24.236.13
    - 172.24.236.15  <--- should be 14
    - 172.24.236.12

    storage:
    - 172.24.244.11
    - 172.24.244.14  <--- should be 13
    - 172.24.244.12

The external IPs were set up correctly [2], but the internal_api [3]
and storage [4] IPs were wrong. These incorrect internal_api and
storage settings were then applied to other OpenStack configurations;
for example, they appear in neutron.conf [5] and cinder.conf [6].
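
The mismatch can be expressed as a per-network set comparison. The following is a minimal sketch, not part of the original report: the IP lists are copied from this description and its footnotes, while the function name diff_networks is made up for illustration.

```python
# IPs listed in external-lb.yaml, copied from this report.
expected = {
    "external": ["172.24.232.11", "172.24.232.12", "172.24.232.13"],
    "internal_api": ["172.24.236.13", "172.24.236.14", "172.24.236.12"],
    "storage": ["172.24.244.11", "172.24.244.13", "172.24.244.12"],
}

# IPs found in the sosreports, copied from this report.
deployed = {
    "external": ["172.24.232.13", "172.24.232.11", "172.24.232.12"],
    "internal_api": ["172.24.236.13", "172.24.236.15", "172.24.236.12"],
    "storage": ["172.24.244.11", "172.24.244.14", "172.24.244.12"],
}


def diff_networks(expected, deployed):
    """Return {network: (missing, unexpected)} for networks whose IP sets differ.

    Hypothetical helper: order within a network is ignored, only the
    set of addresses is compared.
    """
    mismatches = {}
    for network, want in expected.items():
        have = deployed.get(network, [])
        missing = sorted(set(want) - set(have))
        unexpected = sorted(set(have) - set(want))
        if missing or unexpected:
            mismatches[network] = (missing, unexpected)
    return mismatches


for network, (missing, unexpected) in diff_networks(expected, deployed).items():
    print(f"{network}: missing {missing}, unexpected {unexpected}")
```

Comparing sets rather than ordered lists matches how the report treats the external network as correct even though the controllers received the three addresses in a different order.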

II. Version-Release number of selected component (if applicable):

python-rdomanager-oscplugin-0.0.10-8.el7ost.noarch

III. How reproducible:

Deterministic

IV. Steps to Reproduce:

Follow the documentation [0], taking note of which IPs were specified in external-lb.yaml and what the actual results are.

V. Actual results:

The IPs in external-lb.yaml do not match what is actually deployed on the overcloud. 

VI. Expected results:

The IPs in external-lb.yaml should match what is actually deployed on the overcloud.

VII. Additional info:

The footnotes under "Description of problem" are below. Note also that the partner is using an F5 external load balancer.

[0] 
https://access.redhat.com/documentation/en/red-hat-enterprise-linux-openstack-platform/version-7/external-load-balancing-for-the-overcloud/   

[1] 
[johfulto@collab-shell ~]$ md5sum /cases/01476906/external-lb.yaml 
b9ab02d316c74629d3bc74b4d8e6c471  /cases/01476906/external-lb.yaml
[johfulto@collab-shell ~]$ 

[2] external was set up correctly

[johfulto@collab-shell ~]$ cd /cases/01476906/2015-12-01-15xxxx
[johfulto@collab-shell 2015-12-01-15xxxx]$ grep IPADDR overcloud-controller-*/etc/sysconfig/network-scripts/ifcfg-vlan* | grep  172.24.232 
overcloud-controller-0.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1520:IPADDR=172.24.232.13
overcloud-controller-1.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1520:IPADDR=172.24.232.11
overcloud-controller-2.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1520:IPADDR=172.24.232.12
[johfulto@collab-shell 2015-12-01-15xxxx]$ 

[3] internal_api was set up incorrectly

[johfulto@collab-shell ~]$ cd /cases/01476906/2015-12-01-15xxxx
[johfulto@collab-shell 2015-12-01-15xxxx]$ grep IPADDR overcloud-controller-*/etc/sysconfig/network-scripts/ifcfg-vlan* | grep  172.24.236
overcloud-controller-0.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1517:IPADDR=172.24.236.15
overcloud-controller-1.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1517:IPADDR=172.24.236.12
overcloud-controller-2.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1517:IPADDR=172.24.236.13
[johfulto@collab-shell 2015-12-01-15xxxx]$ 

[4] storage was set up incorrectly

[johfulto@collab-shell ~]$ cd /cases/01476906/2015-12-01-15xxxx
[johfulto@collab-shell 2015-12-01-15xxxx]$ grep IPADDR overcloud-controller-*/etc/sysconfig/network-scripts/ifcfg-vlan* | grep  172.24.244 
overcloud-controller-0.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.14
overcloud-controller-1.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.11
overcloud-controller-2.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.12
[johfulto@collab-shell 2015-12-01-15xxxx]$ 
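
The grep output in footnotes [2] to [4] can be turned into a host-to-IP mapping for further comparison. This is a rough sketch under the assumption that every line has the shape shown above; the regular expression and the name parse_ipaddr_lines are illustrative and were not used in the original investigation.

```python
import re

# Matches lines such as:
#   <host>/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.14
LINE = re.compile(r"^(?P<host>[^/]+)/.*ifcfg-(?P<iface>[^:]+):IPADDR=(?P<ip>\S+)$")


def parse_ipaddr_lines(text):
    """Return {host: ip} for every grep line that matches LINE."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            result[match.group("host")] = match.group("ip")
    return result


# Sample input copied from footnote [4].
sample = """\
overcloud-controller-0.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.14
overcloud-controller-1.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.11
overcloud-controller-2.localdomain/etc/sysconfig/network-scripts/ifcfg-vlan1518:IPADDR=172.24.244.12
"""

for host, ip in parse_ipaddr_lines(sample).items():
    print(host, ip)
```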

[5] neutron.conf then used the incorrect internal_api IPs 

[johfulto@collab-shell ~]$ cd /cases/01476906/2015-12-01-15xxxx
[johfulto@collab-shell 2015-12-01-15xxxx]$ grep rabbit_hosts overcloud-controller-*.localdomain/etc/neutron/neutron.conf | grep -v \# 
overcloud-controller-0.localdomain/etc/neutron/neutron.conf:rabbit_hosts = 172.24.236.15,172.24.236.12,172.24.236.13
overcloud-controller-1.localdomain/etc/neutron/neutron.conf:rabbit_hosts = 172.24.236.15,172.24.236.12,172.24.236.13
overcloud-controller-2.localdomain/etc/neutron/neutron.conf:rabbit_hosts = 172.24.236.15,172.24.236.12,172.24.236.13
[johfulto@collab-shell 2015-12-01-15xxxx]$ 

[6] cinder.conf then used incorrect storage IPs 

[johfulto@collab-shell ~]$ cd /cases/01476906/2015-12-01-15xxxx
[johfulto@collab-shell 2015-12-01-15xxxx]$ grep 172.24.244 overcloud-controller-*.localdomain/etc/cinder/cinder.conf
overcloud-controller-0.localdomain/etc/cinder/cinder.conf:iscsi_ip_address=172.24.244.14
overcloud-controller-1.localdomain/etc/cinder/cinder.conf:iscsi_ip_address=172.24.244.11
overcloud-controller-2.localdomain/etc/cinder/cinder.conf:iscsi_ip_address=172.24.244.12
[johfulto@collab-shell 2015-12-01-15xxxx]$

Comment 4 Giulio Fidente 2015-12-04 18:17:02 UTC
hi, thanks for the report

can you point out which version of openstack-tripleo-heat-templates is installed on the undercloud?

can you paste the deploy cmdline you are using?

can you attach any customized yaml you're using when deploying?

Comment 8 Giulio Fidente 2015-12-07 08:56:11 UTC
hi John, the problem seems to be in the order in which the environment files are passed to the deployment command.


The following two arguments:

-e /usr/share/openstack-tripleo-heat-templates/environments/external-loadbalancer-vip.yaml  -e ~/templates/external-lb.yaml

*must* go after:

-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml  -e /home/stack/templates/network-environment.yaml

otherwise some of the config settings they are supposed to apply will get overridden by the other two. The external lb configuration is a 'specialization' of the network isolation configuration.
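
The ordering rule above can be checked mechanically before deploying. The following is only a sketch: the helper names env_files and external_lb_after_network are hypothetical, the file paths are the ones quoted in this comment, and the rest of the deploy command line is deliberately left out because it is not shown in this report.

```python
NETWORK_FILES = [
    "/usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml",
    "/home/stack/templates/network-environment.yaml",
]
LB_FILES = [
    "/usr/share/openstack-tripleo-heat-templates/environments/external-loadbalancer-vip.yaml",
    "~/templates/external-lb.yaml",
]


def env_files(argv):
    """Return the value of every -e option, in command-line order."""
    return [argv[i + 1] for i, arg in enumerate(argv[:-1]) if arg == "-e"]


def external_lb_after_network(argv, network_files=NETWORK_FILES, lb_files=LB_FILES):
    """True if every external LB file comes after every network file.

    Raises ValueError if one of the expected files is not passed at all.
    """
    files = env_files(argv)
    last_network = max(files.index(f) for f in network_files)
    first_lb = min(files.index(f) for f in lb_files)
    return first_lb > last_network


# Only the -e arguments are modelled here.
wrong_order = ["-e", LB_FILES[0], "-e", LB_FILES[1],
               "-e", NETWORK_FILES[0], "-e", NETWORK_FILES[1]]
right_order = ["-e", NETWORK_FILES[0], "-e", NETWORK_FILES[1],
               "-e", LB_FILES[0], "-e", LB_FILES[1]]

print(external_lb_after_network(wrong_order))
print(external_lb_after_network(right_order))
```

The check reflects the point made above: later environment files take precedence, so the more specific external load balancer settings need to be applied last.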

You should also remove from the allocation pools (in network-environment.yaml) the IPs which are statically assigned to the balancer or the controllers (that is, the IPs from external-lb.yaml).


In your current setup the VIPs and the controller IPs are not assigned from the static lists; instead, they are taken from the allocation pool. The reason why some of them seem to match is that the allocation starts precisely from those same IPs. Could you please try again, updating the allocation pools in network-environment.yaml and passing the environment files in the mentioned order?
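
A quick way to confirm that the static IPs have been taken out of an allocation pool is to test each one against the pool range. This is a minimal sketch using Python's standard ipaddress module; the pool boundaries below are hypothetical (the actual network-environment.yaml is not included in this report) and the function name is made up for illustration.

```python
import ipaddress


def static_ips_inside_pool(static_ips, pool_start, pool_end):
    """Return the static IPs that fall inside the inclusive range pool_start..pool_end."""
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    return [ip for ip in static_ips if start <= ipaddress.ip_address(ip) <= end]


# Controller IPs for internal_api, copied from external-lb.yaml in this report.
internal_api_static = ["172.24.236.13", "172.24.236.14", "172.24.236.12"]

# Hypothetical pool that still overlaps the static IPs.
print(static_ips_inside_pool(internal_api_static, "172.24.236.11", "172.24.236.200"))

# Hypothetical pool adjusted to start above the static IPs.
print(static_ips_inside_pool(internal_api_static, "172.24.236.20", "172.24.236.200"))
```

An empty result for every network suggests the pools no longer overlap the addresses reserved in external-lb.yaml; the VIPs from that file would need the same check.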

Comment 9 Jaromir Coufal 2015-12-07 14:26:02 UTC
John, can you please verify whether Giulio's comment helps to resolve your issues? We might not consider this a bug then. Thanks, --J

Comment 10 John Fulton 2015-12-07 16:01:54 UTC
Jarda and Giulio,

Thanks! We are testing these and will let you know how it goes. 

  John

Comment 11 John Fulton 2015-12-07 21:53:03 UTC
(In reply to Giulio Fidente from comment #8)
> The following two arguments:
> 
> -e /usr/share/openstack-tripleo-heat-templates/environments/external-loadbalancer-vip.yaml  -e ~/templates/external-lb.yaml
> 
> *must* go after:
> 
> -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml  -e /home/stack/templates/network-environment.yaml
> 
> otherwise some of the config settings they are supposed to apply will get
> overridden by the other two. 

Thank you Giulio. The partner changed the order as you described above and the problem was resolved. 

Also, I apologize for opening this bug when our documentation does in fact say:

 "Note that you should include this environment file after the network configuration files." 

The above is from section 4.3, "Creating the Overcloud", of:

https://access.redhat.com/documentation/en/red-hat-enterprise-linux-openstack-platform/7/external-load-balancing-for-the-overcloud/chapter-4-configuring-the-overcloud 

  John