Bug 1176423 - HA | HA deployment fails because of an IP address conflict.
Summary: HA | HA deployment fails because of an IP address conflict.
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: foreman-proxy
Version: 6.0 (Juno)
Hardware: Unspecified
OS: Unspecified
urgent
high
Target Milestone: ---
: Installer
Assignee: Scott Seago
QA Contact: Omri Hochman
URL:
Whiteboard: n1kv
Duplicates: 1177033 1185107 1190825
Depends On:
Blocks: 743661 1174326 1177026 1198800
 
Reported: 2014-12-21 16:45 UTC by Leonid Natapov
Modified: 2023-02-22 23:02 UTC
19 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
DHCP does not honor IP address reservations correctly, which can cause conflicts between the IP addresses assigned to virtual IPs and newly discovered hosts. The issue only occurs when the Management, Public API, or Admin API network traffic exists on the PXE/Provisioning subnet. As a workaround, if you are planning to do a POC, discover all hosts for your environment before creating the deployment in the Red Hat Enterprise Linux OpenStack Platform installer user interface. If you are planning to do a longer-term deployment, separate the Management, Public API, and Admin API network traffic onto different subnets. Discovering the hosts before creating the deployment will correctly create leases in DHCP, preventing conflicts from occurring when the virtual IPs are generated. Having those traffic types separated from the provisioning network removes the interaction that causes the conflicts.
Clone Of:
Environment:
Last Closed: 2015-04-29 14:55:29 UTC
Target Upstream Version:
Embargoed:


Attachments
logs (155.74 KB, application/octet-stream)
2014-12-21 16:45 UTC, Leonid Natapov

Description Leonid Natapov 2014-12-21 16:45:33 UTC
Created attachment 971758 [details]
logs

Description of problem:
During HA deployment, Foreman suggests which IP addresses will be used for VIPs and passes them to Puppet, so Puppet can run pcs commands to create the VIPs. I have hit a situation (twice already) where the IP addresses suggested for VIPs are already in use as controllers' provisioning IP addresses. For example, I have one controller with IP address 192.168.0.2 and another controller with IP address 192.168.0.3. During the Puppet run I see that the same IP addresses are used for VIPs:

ip-192.168.0.3	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3
ip-192.168.0.2	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3

As a result, the HA deployment fails: the host that actually holds IP X loses it and can't get it back from DHCP because that IP is now used for a VIP, so the cluster members can't communicate with each other.

I am attaching the foreman log from the staypuft machine and the messages logs from the controllers.

ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el7ost.noarch
openstack-puppet-modules-2014.2.7-2.el7ost.noarch
openstack-foreman-installer-3.0.8-1.el7ost.noarch

Here is pcs status output:
-------------------------------
[root@macf04da2732fb1 ~]# pcs status
Cluster name: openstack
Last updated: Sun Dec 21 18:31:04 2014
Last change: Sun Dec 21 18:05:20 2014 via cibadmin on pcmk-macf04da2732fb1
Stack: corosync
Current DC: pcmk-macf04da2732fb1 (3) - partition with quorum
Version: 1.1.10-32.el7_0.1-368c726
3 Nodes configured
16 Resources configured


Online: [ pcmk-mac848f69fbc4c3 pcmk-macf04da2732fb1 ]
OFFLINE: [ pcmk-mac848f69fbc643 ]

Full list of resources:

 stonith-ipmilan-10.35.160.172	(stonith:fence_ipmilan):	Started pcmk-mac848f69fbc4c3 
 stonith-ipmilan-10.35.160.174	(stonith:fence_ipmilan):	Started pcmk-mac848f69fbc4c3 
 stonith-ipmilan-10.35.160.170	(stonith:fence_ipmilan):	Started pcmk-macf04da2732fb1 
 ip-192.168.0.3	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3 
 ip-10.35.173.157	(ocf::heartbeat:IPaddr2):	Started pcmk-macf04da2732fb1 
 ip-10.35.173.158	(ocf::heartbeat:IPaddr2):	Started pcmk-macf04da2732fb1 
 ip-192.168.0.2	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3 
 ip-192.168.0.18	(ocf::heartbeat:IPaddr2):	Started pcmk-macf04da2732fb1 
 ip-192.168.0.13	(ocf::heartbeat:IPaddr2):	Started pcmk-macf04da2732fb1 
 ip-192.168.0.14	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3 
 ip-192.168.0.21	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3 
 ip-10.35.173.150	(ocf::heartbeat:IPaddr2):	Started pcmk-macf04da2732fb1 
 ip-10.35.173.155	(ocf::heartbeat:IPaddr2):	Started pcmk-mac848f69fbc4c3 
 Clone Set: memcached-clone [memcached]
     Started: [ pcmk-mac848f69fbc4c3 pcmk-macf04da2732fb1 ]
     Stopped: [ pcmk-mac848f69fbc643 ]

PCSD Status:
  pcmk-mac848f69fbc4c3: Online
  pcmk-mac848f69fbc643: Unable to authenticate
  pcmk-macf04da2732fb1: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
-------------------------------
pcs config output:
-------------------------------
[root@mac848f69fbc4c3 ~]# pcs config
Cluster Name: openstack
Corosync Nodes:
 pcmk-mac848f69fbc4c3 pcmk-mac848f69fbc643 pcmk-macf04da2732fb1 
Pacemaker Nodes:
 pcmk-mac848f69fbc4c3 pcmk-mac848f69fbc643 pcmk-macf04da2732fb1 

Resources: 
 Resource: ip-192.168.0.3 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.3 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.3-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.3-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.3-monitor-interval-30s)
 Resource: ip-10.35.173.157 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.35.173.157 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-10.35.173.157-start-timeout-20s)
              stop interval=0s timeout=20s (ip-10.35.173.157-stop-timeout-20s)
              monitor interval=30s (ip-10.35.173.157-monitor-interval-30s)
 Resource: ip-10.35.173.158 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.35.173.158 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-10.35.173.158-start-timeout-20s)
              stop interval=0s timeout=20s (ip-10.35.173.158-stop-timeout-20s)
              monitor interval=30s (ip-10.35.173.158-monitor-interval-30s)
 Resource: ip-192.168.0.2 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.2 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.2-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.2-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.2-monitor-interval-30s)
 Resource: ip-192.168.0.18 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.18 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.18-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.18-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.18-monitor-interval-30s)
 Resource: ip-192.168.0.13 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.13 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.13-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.13-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.13-monitor-interval-30s)
 Resource: ip-192.168.0.14 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.14 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.14-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.14-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.14-monitor-interval-30s)
 Resource: ip-192.168.0.21 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.0.21 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-192.168.0.21-start-timeout-20s)
              stop interval=0s timeout=20s (ip-192.168.0.21-stop-timeout-20s)
              monitor interval=30s (ip-192.168.0.21-monitor-interval-30s)
 Resource: ip-10.35.173.150 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.35.173.150 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-10.35.173.150-start-timeout-20s)
              stop interval=0s timeout=20s (ip-10.35.173.150-stop-timeout-20s)
              monitor interval=30s (ip-10.35.173.150-monitor-interval-30s)
 Resource: ip-10.35.173.155 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.35.173.155 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-10.35.173.155-start-timeout-20s)
              stop interval=0s timeout=20s (ip-10.35.173.155-stop-timeout-20s)
              monitor interval=30s (ip-10.35.173.155-monitor-interval-30s)
 Clone: memcached-clone
  Resource: memcached (class=systemd type=memcached)
   Attributes: start-delay=10s 
   Operations: monitor interval=30s (memcached-monitor-interval-30s)

Stonith Devices: 
 Resource: stonith-ipmilan-10.35.160.172 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=pcmk-mac848f69fbc643 ipaddr=10.35.160.172 login=root passwd=calvin 
  Operations: monitor interval=60s (stonith-ipmilan-10.35.160.172-monitor-interval-60s)
 Resource: stonith-ipmilan-10.35.160.174 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=pcmk-macf04da2732fb1 ipaddr=10.35.160.174 login=root passwd=calvin 
  Operations: monitor interval=60s (stonith-ipmilan-10.35.160.174-monitor-interval-60s)
 Resource: stonith-ipmilan-10.35.160.170 (class=stonith type=fence_ipmilan)
  Attributes: pcmk_host_list=pcmk-mac848f69fbc4c3 ipaddr=10.35.160.170 login=root passwd=calvin 
  Operations: monitor interval=60s (stonith-ipmilan-10.35.160.170-monitor-interval-60s)
Fencing Levels: 

Location Constraints:
  Resource: stonith-ipmilan-10.35.160.170
    Disabled on: pcmk-mac848f69fbc4c3 (score:-INFINITY) (id:location-stonith-ipmilan-10.35.160.170-pcmk-mac848f69fbc4c3--INFINITY)
  Resource: stonith-ipmilan-10.35.160.172
    Disabled on: pcmk-mac848f69fbc643 (score:-INFINITY) (id:location-stonith-ipmilan-10.35.160.172-pcmk-mac848f69fbc643--INFINITY)
  Resource: stonith-ipmilan-10.35.160.174
    Disabled on: pcmk-macf04da2732fb1 (score:-INFINITY) (id:location-stonith-ipmilan-10.35.160.174-pcmk-macf04da2732fb1--INFINITY)
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.10-32.el7_0.1-368c726
 pcmk-mac848f69fbc4c3: memcached,rabbitmq
 pcmk-mac848f69fbc643: memcached
 pcmk-macf04da2732fb1: memcached,rabbitmq,haproxy
 rabbitmq: running

Comment 2 Jason Guiditta 2015-01-05 13:57:32 UTC
This is unrelated to the puppet configuration; the problem is somewhere in staypuft, which passes the IP addresses to be used for VIPs to the puppet manifests.

Comment 3 Mike Burns 2015-01-05 16:39:32 UTC
*** Bug 1177033 has been marked as a duplicate of this bug. ***

Comment 4 Lukas Zapletal 2015-01-06 13:37:06 UTC
For the record - let's take a look at dhcpd.leases and proxy.log to see what IP addresses Foreman and the proxy return.

Comment 5 Mike Burns 2015-01-07 13:55:55 UTC
Worked with Lukas and Leonid to get to the root cause:

The installer uses an API call to unused_ip to generate the VIPs for a deployment.  The unused_ip API call does not create leases in DHCP.  This means that a new host in the environment will use DHCP and may get a lease that conflicts with the VIPs.

Some notes:

This only affects subnets using a foreman-proxy DHCP server (the provisioning network currently).  

It can be avoided by discovering *all* hosts prior to creating the deployment.

Proposed fix:

In the case that the subnet is using DHCP, we should do:

* unused_ip to get an IP address
* immediately call into DHCP and create a reservation for that IP address
** This is done by calling POST /dhcp/network?mac=xyz&name=abc


A feature request for foreman has been filed related to this issue to allow doing this in one step:

http://projects.theforeman.org/issues/8854

Comment 6 Lukas Zapletal 2015-01-07 14:00:15 UTC
Correction: to create a reservation one needs to provide the MAC, name, and IP:

POST /dhcp/network?mac=xyz&name=abc&ip=def

When doing the code change, leave a note that this can later be refactored into a single call once our feature request is implemented.
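
For illustration only, a minimal sketch (in Ruby) of the two-step flow described above: take the address returned by unused_ip and immediately reserve it on the smart proxy. The proxy URL, subnet, MAC, name, and IP below are placeholders, and the endpoint and parameters are taken as described in comments 5 and 6 rather than from the proxy documentation.
-------------------------------
require 'net/http'
require 'uri'

# Placeholders -- adjust for your environment.
proxy   = 'http://foreman-proxy.example.com:8000'
network = '192.168.0.0'

# Step 1: the address suggested by Foreman's unused_ip call
# (obtained through Foreman; hard-coded here for brevity).
vip_ip = '192.168.0.50'

# Step 2: immediately create a DHCP reservation on the proxy so a newly
# discovered host cannot be leased the same address. Per comment 6, MAC,
# name and IP are all required.
response = Net::HTTP.post_form(URI("#{proxy}/dhcp/#{network}"),
                               'mac'  => '00:1a:4a:00:00:01',
                               'name' => 'vip-public-api',
                               'ip'   => vip_ip)
puts response.code
-------------------------------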

Comment 7 Scott Seago 2015-01-07 15:29:16 UTC
What about on deployment deletion? I imagine we also need to clear the reservations for all those MACs? What's the POST API call for that?

Comment 8 Lukas Zapletal 2015-01-08 10:45:51 UTC
I just realized that this is not that easy. If you add the reservation yourself, Foreman's orchestration code will try to perform it once again, which will fail. You also need to make sure that orchestration does not happen.

In this code:

  app/models/concerns/orchestration/dhcp.rb

you need to disable the after_validation hooks:

  after_validation :dhcp_conflict_detected?, :queue_dhcp

for the particular Host instance on which you want to do the pre-reservation.

Instead of commenting that out, I think you want to introduce a flag and use it when you want to skip the DHCP validations (which trigger the orchestration code) on a particular instance.

For the deletion - you don't need to care about this, because our orchestration code will make sure the DHCP record gets deleted automatically (that's actually the before_destroy :queue_dhcp_destroy line in this file). But when you do the pre-reservation, make sure to store the IP address that was returned in the Host record (field name "ip"), otherwise the deletion code will not have enough information to perform the deletion.
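
As a rough illustration of the flag Lukas suggests (the module body and flag name below are hypothetical, not the actual Foreman concern), the callbacks could be guarded per instance along these lines:
-------------------------------
require 'active_support/concern'

# Hypothetical sketch only -- not the real app/models/concerns/orchestration/dhcp.rb.
# Idea: an instance-level flag lets staypuft pre-reserve the address itself and
# skip Foreman's own DHCP validation/orchestration for that one record.
module Orchestration
  module DHCP
    extend ActiveSupport::Concern

    included do
      attr_accessor :skip_dhcp_orchestration   # illustrative flag name

      after_validation :dhcp_conflict_detected?, :queue_dhcp,
                       :unless => :skip_dhcp_orchestration
    end
  end
end

# Usage (illustrative): set the flag on the instance before saving, e.g.
#   host.skip_dhcp_orchestration = true
#   host.save!
-------------------------------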

Comment 9 Mike Burns 2015-01-08 14:57:07 UTC
Lukas,

Does this still apply if we're using NICs that are not attached to a host?  The NICs this is an issue for are all virtual and not attached to a host.

Same question for the deletion. Is the cleanup done when the NIC is removed, or when the host is removed? If we don't have a host, does that cleanup work?

Comment 10 Lukas Zapletal 2015-01-08 15:42:50 UTC
Right, if you pre-create a reservation on a proxy/dhcp and save the IP address with a host in Foreman, the moment you try to save it, the orchestration code sends the very same proxy/dhcp request again (MAC/IP/hostname), which will likely fail and roll the whole transaction back. This paragraph applies only to the primary (provisioning) interface.

Now, for all the other interfaces, you have the option to create "unmanaged" NICs. Those interfaces are not registered against DHCP/DNS. Although there was no user interface for this until the current nightly versions (will be 1.8), you were able to set the flag. It's called "managed", and if you set it to false you can do the DHCP/DNS reservations yourself, but in that case you also need to make sure it gets removed upon host deletion.

Can't you just keep those interfaces simply unmanaged (set the flag) so they will get a "random" IP address and have no hostname?

Comment 11 Mike Burns 2015-01-08 17:01:48 UTC
The issue that we were facing is that we have virtual IP addresses that need to be set. They're logical IP addresses that move from host to host in a cluster using Pacemaker. There is no physical NIC on a host with that IP.

The way we've modeled this in staypuft (and Scott can correct me if I say it incorrectly) is that we have a set of NICs that are not associated with any physical host. We use those NICs to get an IP address, which gets passed into puppet and used to configure the Pacemaker VIPs.

The issue in this BZ is that the IPs of these virtual NICs aren't reserved, so new DHCP requests can get the same addresses.

The proposed fix is as follows (see the sketch after this list):

* During deployment creation, the virtual NICs are created with no host association.
* A call to unused_ip is made to get an IP address for each NIC.
* An immediate call is made to DHCP to reserve that IP address.
* On deployment deletion, for each NIC, a call is made to DHCP to delete the reservation for that IP address.
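
To make the ordering concrete, here is a purely illustrative lifecycle sketch; none of these class or method names exist in staypuft or Foreman. The reservation step corresponds to the POST /dhcp call sketched after comment 6, and the deletion call is left as a stub because the exact proxy endpoint for it is not spelled out in this bug.
-------------------------------
# Illustrative only -- hypothetical names throughout.
class VipAllocator
  def initialize(network)
    @network = network                   # e.g. '192.168.0.0'
  end

  # Deployment creation: suggest a free IP, then reserve it right away so a
  # newly discovered host cannot be leased the same address.
  def allocate(mac, name)
    ip = suggest_unused_ip
    reserve_dhcp_record(mac, name, ip)   # e.g. the POST /dhcp call shown earlier
    ip                                   # stored with the virtual NIC for cleanup
  end

  # Deployment deletion: drop the reservation again for each virtual NIC.
  def deallocate(mac, ip)
    release_dhcp_record(mac, ip)
  end

  private

  # Stubs standing in for the real Foreman / smart proxy calls.
  def suggest_unused_ip
    '192.168.0.50'
  end

  def reserve_dhcp_record(mac, name, ip)
    puts "reserve #{ip} (#{name}) for #{mac} on #{@network}"
  end

  def release_dhcp_record(mac, ip)
    puts "release #{ip} for #{mac} on #{@network}"
  end
end
-------------------------------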

Comment 12 Scott Seago 2015-01-13 05:19:26 UTC
https://github.com/theforeman/staypuft/pull/402

Pull request is here.

Comment 15 Mike Burns 2015-01-23 12:30:07 UTC
*** Bug 1185107 has been marked as a duplicate of this bug. ***

Comment 16 Hugh Brock 2015-01-27 15:28:28 UTC
Ohad, is there a remote chance we can get a quick fix for this, or should we push it to async?

Comment 18 Scott Seago 2015-01-27 18:40:57 UTC
This is going to be much more involved than initially thought. For the PXE network, we essentially need a pool of addresses for VIPs that's separate from the DHCP pool. The challenges include how to split the allocation such that we don't prematurely run out of either VIP IPs or normal host IPs, and coming up with a staypuft-specific IPAM scheme for PXE-network VIPs since we won't use the foreman 'suggest IP' here.

The workaround is to put all VIP API network traffic types on networks *other than* the PXE provisioning network.

Comment 19 Mike Burns 2015-01-27 18:53:22 UTC
(In reply to Scott Seago from comment #18)
> This is going to be much more involved than initially thought. For the PXE
> network, we essentially need a pool of addresses for VIPs that's separate
> from the DHCP pool. The challenges include how to split the allocation such
> that we don't prematurely run out of either VIP IPs or normal host IPs, and
> coming up with a staypuft-specific IPAM scheme for PXE-network VIPs since we
> won't use the foreman 'suggest IP' here.
> 
> The workaround is to put all VIP API network traffic types on networks
> *other than* the PXE provisioning network.

Or pre-discover all hosts to be used in your environment.

Comment 21 Mike Burns 2015-02-03 16:56:28 UTC
Scott,

Is this something we could add a deployment validation for, without doing all of the general deployment validation?

What we're looking for is a comparison of the VIPs and the IPs on the hosts, to catch conflicts and show a message if there is a conflict.
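
A rough sketch of what such a check could look like (all names and addresses here are hypothetical; this is not staypuft code): collect the generated VIPs and the hosts' IPs and flag any overlap before the deployment is allowed to proceed.
-------------------------------
# Hypothetical validation sketch -- illustrative names and data only.
def vip_conflicts(vip_ips, host_ips)
  vip_ips & host_ips                     # set intersection: addresses used twice
end

conflicts = vip_conflicts(
  ['192.168.0.2', '192.168.0.14'],       # VIPs suggested for the deployment
  ['192.168.0.2', '192.168.0.3']         # provisioning IPs of the hosts
)

unless conflicts.empty?
  warn "VIP conflicts with a host IP: #{conflicts.join(', ')}"
end
-------------------------------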

Comment 23 Mike Burns 2015-02-20 18:56:58 UTC
*** Bug 1190825 has been marked as a duplicate of this bug. ***

Comment 26 Mike Burns 2015-04-29 14:55:29 UTC
This would require significant re-architecture to resolve.  The workaround listed in the release notes avoids the issue completely.

Comment 27 Keith Schincke 2015-05-02 00:18:20 UTC
The Doc Text for this issue is not completely correct. 
I have had the IP address theft/conflict occur when the Public API network is completely separate. 

If the Public API network is 192.168.141.0/21 with a pool range of 25 through 225, I have had the auto-assigned IP for the host interface also be assigned to a VIP.

My solution has been to manually assign the interface an address high in the pool range (>200).

Are the VIPs picked from a specific range of the pool?

