Description of problem:

When using Ironic in the overcloud in conjunction with a custom network
created in network_data.yaml, it was found that the VIP was created
successfully but was not added to the interface on the node.

This is the VIP that was created for the OcProvisioning network:

(undercloud) [stack@host01 ~]$ openstack port show oc_provisioning_virtual_ip -c fixed_ips
+-----------+----------------------------------------------------------------------------+
| Field     | Value                                                                      |
+-----------+----------------------------------------------------------------------------+
| fixed_ips | ip_address='172.21.2.10', subnet_id='30fac020-2702-41ad-b478-37c3d6d0b580' |
+-----------+----------------------------------------------------------------------------+

On the controller node that uses this network, only a single IP associated
with the network is brought up, not the VIP:

11: vlan205: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether ee:be:ca:e2:1c:39 brd ff:ff:ff:ff:ff:ff
    inet 172.21.2.18/24 brd 172.21.2.255 scope global vlan205
       valid_lft forever preferred_lft forever
    inet6 fe80::ecbe:caff:fee2:1c39/64 scope link
       valid_lft forever preferred_lft forever

i.e. VIP 172.21.2.10 is not on this interface.

Compare this to a non-custom network, which does have the VIP; 172.23.3.19 is
the VIP for the StorageMgmt network:

13: vlan2001: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether 7a:1a:e3:30:26:a9 brd ff:ff:ff:ff:ff:ff
    inet 172.23.3.18/24 brd 172.23.3.255 scope global vlan2001
       valid_lft forever preferred_lft forever
    inet 172.23.3.19/32 brd 172.23.3.255 scope global vlan2001
       valid_lft forever preferred_lft forever
    inet6 fe80::781a:e3ff:fe30:26a9/64 scope link
       valid_lft forever preferred_lft forever

This configuration uses haproxy, and this is how the VIP (again for
StorageMgmt) is assigned to the interface:

16:47:53 localhost journal: #033[mNotice: /Stage[main]/Tripleo::Profile::Pacemaker::Haproxy_bundle/Tripleo::Pacemaker::Haproxy_with_vip[haproxy_and_storage_mgmt_vip]/Pacemaker::Resource::Ip[storage_mgmt_vip]/Pcmk_resource[ip-172.23.3.19]/ensure: created#033[0m
Dec 11 16:47:53 localhost journal: #033[0;32mInfo: Pacemaker::Resource::Ip[storage_mgmt_vip]: Unscheduling all events on Pacemaker::Resource::Ip[storage_mgmt_vip]#033[0m
Dec 11 16:47:53 localhost IPaddr2(ip-172.23.3.19)[78806]: INFO: Adding inet address 172.23.3.19/32 with broadcast address 172.23.3.255 to device vlan2001
Dec 11 16:47:53 localhost IPaddr2(ip-172.23.3.19)[78806]: INFO: Bringing device vlan2001 up

The haproxy code in puppet-tripleo only uses the standard isolated networks
and does not have a mechanism for custom networks:
https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/haproxy.pp#L140

Version-Release number of selected component (if applicable):
puddle 12.0-20171129.1
puppet-tripleo-7.4.3-11.el7ost.noarch
openstack-tripleo-heat-templates-7.0.3-17.el7ost.noarch

How reproducible:
Every time

Steps to Reproduce:

New network in network_data.yaml:

# custom network for Overcloud provisioning
- name: OcProvisioning
  name_lower: oc_provisioning
  vip: true
  ip_subnet: '172.21.2.0/24'
  allocation_pools: [{'start': '172.21.2.10', 'end': '172.21.2.200'}]
  ipv6_subnet: 'fd00:fd00:fd00:7000::/64'
  ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:7000::10', 'end': 'fd00:fd00:fd00:7000:ffff:ffff:ffff:fffe'}]

It uses VLAN 205:

OcProvisioningNetworkVlanID: 205

It is added for the Controller in roles_data.yaml:

  networks:
    <snip>
    - OcProvisioning

It is added to ServiceNetMap:

ServiceNetMap:
  IronicApiNetwork: oc_provisioning # changed from ctlplane
  IronicNetwork: oc_provisioning # changed from ctlplane

After overcloud deployment the network was created fine and an IP was added
to the overcloud-controller node:

11: vlan205: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether ee:be:ca:e2:1c:39 brd ff:ff:ff:ff:ff:ff
    inet 172.21.2.18/24 brd 172.21.2.255 scope global vlan205
       valid_lft forever preferred_lft forever
    inet6 fe80::ecbe:caff:fee2:1c39/64 scope link
       valid_lft forever preferred_lft forever

Actual results:
The VIP, 172.21.2.10 in this case, should be added to the vlan205 interface
on the controller, but it is not:

11: vlan205: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether ee:be:ca:e2:1c:39 brd ff:ff:ff:ff:ff:ff
    inet 172.21.2.18/24 brd 172.21.2.255 scope global vlan205
       valid_lft forever preferred_lft forever
    inet6 fe80::ecbe:caff:fee2:1c39/64 scope link
       valid_lft forever preferred_lft forever

Expected results:
VIP added to the vlan205 interface on the controller.

Additional info:
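The failing check above can be expressed mechanically. The sketch below is only an illustration: it greps an `ip -o addr`-style dump for the VIP, using a sample line and the VIP value taken from the output in this report. On a live controller you would feed it the real output of `ip -o addr show dev vlan205` instead of the canned string.

```shell
#!/bin/sh
# Sketch: given `ip addr`-style output, confirm whether the neutron VIP is
# actually plumbed on the interface. Sample data is from this report.
ip_dump='11: vlan205    inet 172.21.2.18/24 brd 172.21.2.255 scope global vlan205'
vip='172.21.2.10'

if printf '%s\n' "$ip_dump" | grep -qw "$vip"; then
    echo "VIP $vip is configured"
else
    echo "VIP $vip is NOT configured"
fi
```

With this sample data it prints `VIP 172.21.2.10 is NOT configured`, matching the failure reported here.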
Upstream patches are here:

https://review.openstack.org/#/c/531037/
https://review.openstack.org/#/c/531036/

When merged, they must be backported to OSP-12.
Installed latest OSP 12 (2018-03-10.1).

Env:
[stack@undercloud-0 ~]$ rpm -qa | grep puppet-tripleo
puppet-tripleo-7.4.8-4.el7ost.noarch

To verify https://bugzilla.redhat.com/show_bug.cgi?id=1525550:

Verified that puppet creates a table for network_virtual_ips on the controller:

[heat-admin@controller-0 ~]$ sudo cat /etc/puppet/hieradata/vip_data.json
...
    "network_virtual_ips": {
        "internal_api": {
            "index": 1,
            "ip_address": "172.17.1.12"
        },
        "storage": {
            "index": 2,
            "ip_address": "172.17.3.11"
        },
        "storage_mgmt": {
            "index": 3,
            "ip_address": "172.17.4.15"
        }
    },

Verified that the VIP is correctly configured on the controller:

[heat-admin@controller-0 ~]$ ip a | grep -B 4 172.17.1.12
9: vlan20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether b2:d1:3c:78:99:eb brd ff:ff:ff:ff:ff:ff
    inet 172.17.1.20/24 brd 172.17.1.255 scope global vlan20
       valid_lft forever preferred_lft forever
    inet 172.17.1.12/32 brd 172.17.1.255 scope global vlan20

Verified that the rpm exceeds the Fixed In Version:

(undercloud) [stack@undercloud-0 ~]$ rpm -qa | grep puppet-tripleo
puppet-tripleo-7.4.8-4.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent
advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow
the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0607
I am hitting this issue even though I seem to have the required rpms:

(undercloud) [stack@undercloud ~]$ rpm -qa | grep puppet-tripleo
puppet-tripleo-7.4.8-5.el7ost.noarch

sudo cat /etc/puppet/hieradata/vip_data.json
    "network_virtual_ips": {
        "custombm": {
            "index": 4,
            "ip_address": "172.31.10.14"
        },
        "internal_api": {
            "index": 1,
            "ip_address": "172.31.1.14"
        },
        "storage": {
            "index": 2,
            "ip_address": "172.31.3.14"
        },
        "storage_mgmt": {
            "index": 3,
            "ip_address": "172.31.4.14"
        }
    },

pcs status doesn't show that custom VIP:

[root@chrisj-controller-0 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: chrisj-controller-0 (version 1.1.16-12.el7_4.7-94ff4df) - partition with quorum
Last updated: Thu May  3 04:26:21 2018
Last change: Thu May  3 04:11:20 2018 by root via cibadmin on chrisj-controller-0

4 nodes configured
17 resources configured

Online: [ chrisj-controller-0 ]
GuestOnline: [ galera-bundle-0@chrisj-controller-0 rabbitmq-bundle-0@chrisj-controller-0 redis-bundle-0@chrisj-controller-0 ]

Full list of resources:

 Docker container: rabbitmq-bundle [172.31.0.10:8787/rhosp12/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0    (ocf::heartbeat:rabbitmq-cluster):      Started chrisj-controller-0
 Docker container: galera-bundle [172.31.0.10:8787/rhosp12/openstack-mariadb:pcmklatest]
   galera-bundle-0      (ocf::heartbeat:galera):        Master chrisj-controller-0
 Docker container: redis-bundle [172.31.0.10:8787/rhosp12/openstack-redis:pcmklatest]
   redis-bundle-0       (ocf::heartbeat:redis): Master chrisj-controller-0
 ip-172.31.0.40 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.8.20 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.1.15 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.1.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.3.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.4.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 Docker container: haproxy-bundle [172.31.0.10:8787/rhosp12/openstack-haproxy:pcmklatest]
   haproxy-bundle-docker-0      (ocf::heartbeat:docker):        Started chrisj-controller-0
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started chrisj-controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

I am also unable to ping it:

[root@chrisj-controller-0 ~]# ping -c 1 172.31.10.14
PING 172.31.10.14 (172.31.10.14) 56(84) bytes of data.
From 172.31.10.28 icmp_seq=1 Destination Host Unreachable

--- 172.31.10.14 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

The only difference I can see is I have defined the custom VIP in here:

CustomBMVirtualFixedIPs: [{'ip_address':'172.31.10.14'}]

Any ideas?
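The mismatch reported above (a VIP present in hieradata but absent from pacemaker) can be spotted with a quick cross-check. This is only a sketch: the two variables below hold trimmed copies of the vip_data.json and `pcs status` output from this report, and on a real controller you would substitute the actual file contents and live command output.

```shell
#!/bin/sh
# Cross-check hieradata VIPs against pacemaker IPaddr2 resources.
# Sample inputs are trimmed copies of the output shown in this report.
vip_json='{"network_virtual_ips":{"custombm":{"ip_address":"172.31.10.14"},"internal_api":{"ip_address":"172.31.1.14"}}}'
pcs_out='ip-172.31.1.14 (ocf::heartbeat:IPaddr2): Started chrisj-controller-0'

# Pull every dotted-quad out of the hieradata and look for a matching
# ip-<VIP> pacemaker resource in the pcs status output.
for vip in $(printf '%s\n' "$vip_json" | grep -oE '[0-9]+(\.[0-9]+){3}'); do
    if printf '%s\n' "$pcs_out" | grep -q "ip-$vip "; then
        echo "$vip: managed by pacemaker"
    else
        echo "$vip: MISSING from pacemaker"
    fi
done
```

With this sample data the loop reports 172.31.10.14 as missing and 172.31.1.14 as managed, which is exactly the asymmetry described in the comment above.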
Created attachment 1430421 [details] sos from controller
Created attachment 1430422 [details] templates used for deployment
Created attachment 1430423 [details] sosreport-undercloud-parta
Created attachment 1430424 [details] sosreport-undercloud-partb
Chris - have you tried it without the CustomBMVirtualFixedIPs setting? In our
testing this worked fine with the custom network in network_data.yaml and
"vip: true", although in that case the VIP came from the allocation range.

Looking through the controller sosreport, it looks like the VIP was not
created:

8: vlan320    inet 172.31.10.28/24 brd 172.31.10.255 scope global vlan320\       valid_lft forever preferred_lft forever
8: vlan320    inet6 fe80::984f:17ff:feb7:eda6/64 scope link \       valid_lft forever preferred_lft forever

resulting in:

containers/nova/nova-compute.log:2018-05-03 04:15:04.530 1 ERROR ironicclient.common.http [req-22a4ade9-d6a7-418a-bdf8-abef5e876740 - - - - -] Error contacting Ironic server: Unable to establish connection to http://172.31.10.14:6385/v1/nodes/detail: HTTPConnectionPool(host='172.31.10.14', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fa44d1f5590>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH',)). Attempt 61 of 61: ConnectFailure: Unable to establish connection to http://172.31.10.14:6385/v1/nodes/detail: HTTPConnectionPool(host='172.31.10.14', port=6385): Max retries exceeded with url: /v1/nodes/detail (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fa44d1f5590>: Failed to establish a new connection: [Errno 113] EHOSTUNREACH',))
(In reply to Chris Janiszewski from comment #9)
> I am hitting this issue even though I seems to have required rpms:

Chris, looking through your network templates, it appears that you are not
instantiating the CustomBM network correctly. Looking at
network-isolation.yaml, I see the other networks, but not CustomBM:

OS::TripleO::Network::Ports::ExternalVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/external.yaml
OS::TripleO::Network::Ports::InternalApiVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/internal_api.yaml
OS::TripleO::Network::Ports::StorageVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/storage.yaml
OS::TripleO::Network::Ports::StorageMgmtVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/storage_mgmt.yaml
OS::TripleO::Network::Ports::RedisVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/vip.yaml

You will need to instantiate the CustomBM network, ports, and VIP in
network-isolation.yaml, otherwise the network won't be included in the
deployment:

OS::TripleO::Network::CustomBM: /usr/share/openstack-tripleo-heat-templates/network/custombm.yaml
OS::TripleO::Network::Ports::CustomBMVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/custombm.yaml
OS::TripleO::Controller::Ports::CustomBMPort: /usr/share/openstack-tripleo-heat-templates/network/ports/custombm.yaml
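Putting the entries above together, a custom-network environment file might look like the sketch below. This is an illustrative assumption rather than a verified template: the custombm.yaml paths are the ones named in the comment above (rendered from the j2 templates when the network is defined in network_data.yaml), and the CustomBMVipPort resource name follows the usual <NetworkName>VipPort naming convention.

```yaml
# Hypothetical environment file wiring in the CustomBM network.
# Paths assume the plan rendered network/custombm.yaml and
# network/ports/custombm.yaml from the network_data.yaml definition.
resource_registry:
  OS::TripleO::Network::CustomBM: /usr/share/openstack-tripleo-heat-templates/network/custombm.yaml
  OS::TripleO::Network::Ports::CustomBMVipPort: /usr/share/openstack-tripleo-heat-templates/network/ports/custombm.yaml
  OS::TripleO::Controller::Ports::CustomBMPort: /usr/share/openstack-tripleo-heat-templates/network/ports/custombm.yaml
```

This file would then be passed to `openstack overcloud deploy` with `-e`, alongside the standard network-isolation environment.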
Thanks guys, I have worked around this issue by creating the VIP directly
with pcs:

pcs resource create ip-172.31.10.14 ocf:heartbeat:IPaddr2 \
    ip=172.31.10.14 cidr_netmask=32 nic=vlan320 \
    op monitor interval=30s

I need to demo this environment tomorrow, but after that I'll attempt this
again with the steps provided by Dan in comment #15. If that doesn't work
I'll try Bob's suggestion.
(In reply to Dan Sneddon from comment #15)
> Chris, looking through your network templates, it appears that you are not
> instantiating the CustomBM network correctly. Looking at
> network-isolation.yaml, I see the other networks, but not CustomBM:

I forgot to include my deploy script.

(undercloud) [stack@undercloud ~]$ cat deploy.sh
#!/bin/bash
source ~/stackrc
cd ~/
time openstack overcloud deploy --templates --stack chrisj \
-r /home/stack/templates/roles_data.yaml \
-n /home/stack/templates/network_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-dns.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/ironic.yaml \
-e /home/stack/templates/network-environment.yaml \
-e /home/stack/templates/ceph-custom-config.yaml \
-e /home/stack/templates/enable-tls.yaml \
-e /home/stack/templates/ExtraConfig.yaml \
-e /home/stack/templates/inject-trust-anchor.yaml \
-e /home/stack/templates/inject-trust-anchor-hiera.yaml \
-e /home/stack/templates/fernet.yaml \
-e /home/stack/templates/deployment-artifacts.yaml \
-e /home/stack/templates/docker-registry.yaml \
-e /home/stack/templates/logging-environment.yaml \
-e /home/stack/templates/monitoring-environment.yaml \
-e /home/stack/templates/collectd-environment.yaml

As you can see, I am not using the network-isolation.yaml from my templates
directory, but the generic one in
/usr/share/openstack-tripleo-heat-templates/environments/ (which, by the way,
doesn't exist). I was hoping the deployment would generate one with the right
ports, but it hasn't. Nevertheless, the assignment of the VLANs and subnets
to physical ports worked properly. Sorry for the confusion of leaving this
wrong network-isolation.yaml file in my custom templates.
(In reply to Chris Janiszewski from comment #17)
> As you can see I am not using the network-isolation.yaml from my templates
> directory .. but the generic one in
> /usr/share/openstack-tripleo-heat-templates/environments/ (which btw doesn't
> exist).

Chris, it appears to me from looking at the SOS reports that a VIP of
172.31.10.14 is getting assigned via the tripleo-heat-templates. I even see
some evidence that HAProxy is trying to host a listener on that IP, although
the IP doesn't show up in `ip addr`:

sos_commands/process/lsof_-b_M_-n_-l:
haproxy   76700  42454   24u  IPv4  462191  0t0  TCP 172.31.10.14:6385 (LISTEN)

sos_commands/networking/netstat_-W_-neopa:
tcp   0   0 172.31.10.14:6385   0.0.0.0:*   LISTEN   0   462191   76700/haproxy   off (0.00/0/0)

Looking at the output of sos_commands/pacemaker/pcs_status, I see the other
VIPs, but not the custom VIP:

 ip-172.31.0.40 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.8.20 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.1.15 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.1.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.3.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0
 ip-172.31.4.14 (ocf::heartbeat:IPaddr2):       Started chrisj-controller-0

So it looks to me like the pacemaker config is not working correctly for
custom networks. There isn't enough data in the sos report to fully diagnose
the issue, so I'm going to take your templates and deploy a similar config in
a test lab.
I could not reproduce these symptoms. I used the same network_data.yaml, and copied the settings from the network-environment.yaml. The VIP deployed correctly for me.
Just to conclude this issue: in order for this functionality to work, it's
not just puppet-tripleo-7.4.8-4.el7ost.noarch that is required; the overcloud
images also need to be updated to the latest version. That was the missing
piece on my end.

I can also confirm that assigning the VIP in parameter_defaults worked. In my
case:

CustomBMVirtualFixedIPs: [{'ip_address':'172.31.10.14'}]

I no longer have this issue with the latest images. Thanks for the help.
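For reference, the working VirtualFixedIPs setting mentioned above lives under parameter_defaults in an environment file. A minimal sketch (the file name itself is arbitrary; the parameter name follows the <network name>VirtualFixedIPs pattern for the CustomBM network defined in network_data.yaml):

```yaml
# Pin the CustomBM VIP explicitly instead of taking one from the
# allocation pool. Confirmed working in this thread with updated images.
parameter_defaults:
  CustomBMVirtualFixedIPs: [{'ip_address': '172.31.10.14'}]
```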