Bug 1804793 - [IPI baremetal]: during bootstrap, two dhcp servers could be active on the provisioning network
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.4.0
Assignee: Stephen Benjamin
QA Contact: Nataf Sharabi
URL:
Whiteboard:
Depends On: 1800746
Blocks:
 
Reported: 2020-02-19 16:09 UTC by Stephen Benjamin
Modified: 2020-05-04 11:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1800746
Environment:
Last Closed: 2020-05-04 11:37:45 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 3138 | 0 | None | closed | [release-4.4] Bug 1804793: baremetal: only respond to dhcp for control plane mac's | 2020-05-27 11:34:27 UTC
Red Hat Product Errata | RHBA-2020:0581 | 0 | None | None | None | 2020-05-04 11:38:06 UTC

Description Stephen Benjamin 2020-02-19 16:09:13 UTC
+++ This bug was initially created as a clone of Bug #1800746 +++

The bootstrap VM can now coexist with a running machine-api, so instances of Ironic, dnsmasq, and related services may be running in both the cluster and the bootstrap VM at the same time. This causes problems because it is not deterministic which dnsmasq instance will answer a worker provisioned by the machine-api. If the bootstrap instance responds first, the worker will not come online, because it is pointed at the wrong place.

This is causing a percentage of baremetal installs to fail: the worker stays offline, and ingress and other operators never come up.
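The race described above can be illustrated with a toy model. This is a minimal sketch, not installer code: the function name, the server names, and all MAC addresses are hypothetical, and `None` stands for a DHCP server that answers every client.

```python
def dhcp_responders(client_mac, servers):
    """Return the names of the servers that would answer client_mac.

    servers maps a server name to a set of allowed MACs, or to None
    when the server answers every client (no filtering).
    """
    client = client_mac.lower()
    return sorted(
        name
        for name, allowed in servers.items()
        if allowed is None or client in {mac.lower() for mac in allowed}
    )


# Hypothetical addresses, for illustration only.
CONTROL_PLANE_MACS = {"52:54:00:00:00:01", "52:54:00:00:00:02", "52:54:00:00:00:03"}
WORKER_MAC = "52:54:00:00:00:99"

# Without filtering, both dnsmasq instances may answer the worker.
unfiltered = {"bootstrap": None, "cluster": None}
# With the bootstrap limited to control plane MACs, only the cluster answers.
filtered = {"bootstrap": CONTROL_PLANE_MACS, "cluster": None}

print(dhcp_responders(WORKER_MAC, unfiltered))  # ['bootstrap', 'cluster']
print(dhcp_responders(WORKER_MAC, filtered))    # ['cluster']
```

With two possible responders the outcome depends on which reply arrives first; with one, the worker's boot path is deterministic.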

Comment 3 Nataf Sharabi 2020-03-24 16:18:06 UTC
In order to verify: 

1. During installation, check that the bootstrap machine has been created:
  virsh list --all
  Id    Name                               State
  ----------------------------------------------------
   219   provisionhost-0                    running
   220   ocp-edge-cluster-77jtp-bootstrap   running

2. From the baremetal host, run:
   virsh console ocp-edge-cluster-77jtp-bootstrap

3. The console should show:
   ens3: 192.168.123.126 fe80::9337:ec5a:fc32:16c1
   ens4: fd00:1101::2

4. From the baremetal host, run:
   ssh kni@provisionhost

5. From provisionhost, run:
   ssh core@192.168.123.126

6. From the bootstrap VM, run:
   sudo ip6tables -t raw -L


Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DHCP       udp      anywhere             anywhere             udp dpt:bootps
DHCP       udp      anywhere             anywhere             udp dpt:dhcpv6-server

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain DHCP (2 references)
target     prot opt source               destination
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:2B:C2:2A
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:07:5C:BA
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:47:48:CB
DROP       all      anywhere             anywhere
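
The check in step 6 could be automated by parsing the listing. The following is a minimal sketch that assumes the `-L` output format shown above; the function name `parse_dhcp_chain` and the `SAMPLE` constant are hypothetical helpers, not part of any tool.

```python
import re

# Matches the "MAC xx:xx:xx:xx:xx:xx" column printed by "-L" for mac-match rules.
MAC_RE = re.compile(r"MAC\s+([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})")

SAMPLE = """\
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DHCP       udp      anywhere             anywhere             udp dpt:bootps
DHCP       udp      anywhere             anywhere             udp dpt:dhcpv6-server

Chain DHCP (2 references)
target     prot opt source               destination
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:2B:C2:2A
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:07:5C:BA
ACCEPT     all      anywhere             anywhere             MAC 52:54:00:47:48:CB
DROP       all      anywhere             anywhere
"""


def parse_dhcp_chain(listing):
    """Return (allowed_macs, ends_with_drop) for the DHCP chain in a listing."""
    allowed = []
    targets = []
    in_dhcp_chain = False
    for raw_line in listing.splitlines():
        line = raw_line.strip()
        if line.startswith("Chain "):
            # Track whether the following rules belong to the DHCP chain.
            in_dhcp_chain = line.startswith("Chain DHCP ")
            continue
        if not in_dhcp_chain or not line or line.startswith("target"):
            continue
        target = line.split()[0]
        targets.append(target)
        match = MAC_RE.search(line)
        if target == "ACCEPT" and match:
            allowed.append(match.group(1).upper())
    return allowed, bool(targets) and targets[-1] == "DROP"


macs, ends_with_drop = parse_dhcp_chain(SAMPLE)
print(macs)
print(ends_with_drop)
```

A passing check would be one where the returned MACs equal the control plane hosts' provisioning MACs and the chain ends with DROP.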


The rules match the code in:
  https://github.com/openshift/installer/pull/3079/files
  https://github.com/openshift/installer/pull/3243/files
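
The linked pull requests contain the actual implementation. As an illustration only, a chain equivalent to the listing in step 6 could be built with commands like the ones generated below. This sketch is reconstructed from the listing, not taken from the installer: the function name and the way the MAC list is supplied are assumptions. Ports 67 and 547 are the numeric forms of bootps and dhcpv6-server.

```python
def build_dhcp_filter_commands(control_plane_macs, tool="ip6tables"):
    """Return command strings that create a DHCP allow-list in the raw table.

    The commands are only generated, never executed, so the sketch is
    safe to run anywhere.
    """
    base = [tool, "-t", "raw"]
    commands = [
        # Create the custom chain and send DHCPv4/DHCPv6 server traffic to it.
        base + ["-N", "DHCP"],
        base + ["-A", "PREROUTING", "-p", "udp", "--dport", "67", "-j", "DHCP"],
        base + ["-A", "PREROUTING", "-p", "udp", "--dport", "547", "-j", "DHCP"],
    ]
    for mac in control_plane_macs:
        # Accept requests coming from a known control plane MAC.
        commands.append(
            base + ["-A", "DHCP", "-m", "mac", "--mac-source", mac, "-j", "ACCEPT"]
        )
    # Everything else never reaches the bootstrap DHCP server.
    commands.append(base + ["-A", "DHCP", "-j", "DROP"])
    return [" ".join(command) for command in commands]


for command in build_dhcp_filter_commands(
    ["52:54:00:2B:C2:2A", "52:54:00:07:5C:BA", "52:54:00:47:48:CB"]
):
    print(command)
```

Rule order matters here: the ACCEPT rules must precede the final DROP, which is why the DROP is appended last.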

Comment 5 errata-xmlrpc 2020-05-04 11:37:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

