Bug 2048887

Summary: [dhcpd] DHCPD not handing out infinite leases
Product: Red Hat Enterprise Linux 8 Reporter: aygarg
Component: dhcp Assignee: Martin Osvald 🛹 <mosvald>
Status: CLOSED NOTABUG QA Contact: rhel-cs-infra-services-qe <rhel-cs-infra-services-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.5 CC: akaris, augol, bnemec, dmoessne, dosmith, ealcaniz, jmalde, oarribas, vvoronko
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-02-06 12:16:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description aygarg 2022-02-01 05:11:56 UTC
Description of problem:
OCP nodes do not retain their IP addresses after a reboot because they always rely on the DHCP server. The expectation is that the DHCP-provided IPs are configured as static IPs during installation so that DHCP can be turned off post-installation.

According to the following document, DHCP-assigned IPs are supposed to be configured as static IPs on the nodes during OCP cluster deployment, which does not appear to be the case here. With the DHCP service turned off post-installation, the IPs assigned to the nodes should persist as static IPs without requiring DHCP to be available every time the nodes are rebooted. As the document instructs, the IP addresses have been reserved with an infinite lease, yet after a reboot the nodes do not come up with the designated IPs unless the DHCP server is reachable.

Please find the necessary DHCP & OCP cluster configuration details furnished below for further analysis.

https://docs.openshift.com/container-platform/4.8/installing/installing_bare_metal_ipi/ipi-install-prerequisites.html 

**IMPORTANT:**
"Reserving IP addresses so they become static IP addresses
Some administrators prefer to use static IP addresses so that each node’s IP address remains constant in the absence of a DHCP server. To use static IP addresses in the OpenShift Container Platform cluster, reserve the IP addresses with an infinite lease. During deployment, the installer will reconfigure the NICs from DHCP assigned addresses to static IP addresses. NICs with DHCP leases that are not infinite will remain configured to use DHCP."


Version-Release number of selected component (if applicable):
OCP 4.8.25 (IPI)/Baremetal Platform


How reproducible:
Every time in the customer's environment.



Actual results:
The node's IP address changes after reboot.

Expected results:
IP address must remain persistent after reboot when DHCP is turned off.

Additional info:
DHCP is used on the provisioning node, along with dnsmasq, to allow the nodes to get their IP addresses.

The reported behavior is seen right after a successful installation of the cluster, when the nodes are rebooted either for testing or as a result of a change made on the cluster. The customer deliberately turns off DHCP once the installation has succeeded, to verify that the IPs persist across reboots and that the nodes no longer rely on the DHCP server. This does not work as expected: on every reboot the nodes still look for the DHCP server to get their IPs.

Comment 21 Martin Osvald 🛹 2022-02-06 12:16:58 UTC
I was able to reproduce this on RHEL 8.5 and can confirm that it works on RHEL 7.9.

This is not a bug but an intended change in behavior introduced by the upstream patch below (present since upstream 4.3.5; RHEL 8.x is based on 4.3.6):

https://github.com/isc-projects/dhcp/commit/68507137e1cbd783b6cb7ce84a4805c96af6eb5e
~~~
- Altered DHCPv4 lease time calculation to avoid roll over errors on 64-bit
  OS systems when using -1 or large values for default-lease-time.  Rollover
  values will be replaced with 0x7FFFFFFF - 1.  This alleviates unintentionally
  short expiration times being handed out when infinite lease times (-1) in
  conjuction with failover.
  [ISC-Bugs #41976]
~~~
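To illustrate the effect described in the changelog entry above, here is a minimal sketch. It is not the code from ack_lease(); the function names, the `MAX_TIME` constant and the exact arithmetic are assumptions made only for illustration. The one value taken from a specification is 0xffffffff, which RFC 2131 reserves to mean an infinite lease.

```python
# Hypothetical illustration of the capping behavior; NOT the ISC dhcpd source.
INFINITE_LEASE = 0xFFFFFFFF  # RFC 2131: lease time value reserved for "infinity"
MAX_TIME = 0x7FFFFFFF        # assumed: largest signed 32-bit timestamp


def offered_lease_seconds(configured: int, now: int) -> int:
    """Return the lease duration a server capping at MAX_TIME - 1 would offer.

    `configured` is the configured lease time (-1 meaning "infinite"),
    `now` is the current time in seconds since the epoch.
    """
    # Treat the configured value as an unsigned 32-bit number, so -1 -> 0xFFFFFFFF.
    requested = configured & 0xFFFFFFFF
    expiry = now + requested
    # An expiry that would roll over is replaced with 0x7FFFFFFF - 1.
    if expiry > MAX_TIME - 1:
        expiry = MAX_TIME - 1
    return expiry - now


def is_infinite(lease_seconds: int) -> bool:
    """A client only sees an infinite lease if the value is exactly 0xffffffff."""
    return lease_seconds == INFINITE_LEASE


now = 1644000000  # early February 2022
offer = offered_lease_seconds(-1, now)
print(offer, is_infinite(offer))
```

Under these assumptions, a configured lease time of -1 results in a long but finite lease, so a client checking for 0xffffffff no longer treats it as infinite. Ordinary lease times such as 3600 pass through unchanged.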

This means dhcpd no longer hands out infinite leases; the largest value it offers is capped as described above. This may sound wrong to you, but RFC 2131 itself allows this behavior:

https://datatracker.ietf.org/doc/html/rfc2131#section-2.2
~~~
                                        The client may ask for a
   permanent assignment by asking for an infinite lease.  Even when
   assigning "permanent" addresses, a server may choose to give out
   lengthy but non-infinite leases to allow detection of the fact that
   the client has been retired.
~~~
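The practical consequence for a client that converts its address to static only on an infinite lease can be sketched as follows. This is a hedged illustration, not the logic of the OpenShift installer or of any real DHCP client: the function names are hypothetical, and the parser handles only the plain code/length/value option layout. Option code 51 (IP Address Lease Time, a 4-byte big-endian value) and the 0xffffffff "infinity" value come from the DHCP RFCs.

```python
import struct

# Hypothetical client-side check; names are illustrative only.
INFINITE_LEASE = 0xFFFFFFFF  # RFC 2131 value meaning "infinite lease"
OPT_PAD = 0                  # single-byte padding option
OPT_LEASE_TIME = 51          # IP Address Lease Time option
OPT_END = 255                # end-of-options marker


def lease_time_from_options(options: bytes):
    """Return the lease time in seconds from raw DHCP options, or None."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option, stop parsing
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == OPT_LEASE_TIME and len(value) == 4:
            return struct.unpack("!I", value)[0]
        i += 2 + length
    return None


def stays_on_dhcp(options: bytes) -> bool:
    """True if the lease is not infinite, so the NIC would keep using DHCP."""
    return lease_time_from_options(options) != INFINITE_LEASE


# Option 53 (message type ACK), then option 51 with a capped, finite lease time.
capped = bytes([53, 1, 5, 51, 4, 0x7F, 0xFF, 0xFF, 0xFE, 255])
print(stays_on_dhcp(capped))
```

In this sketch a lengthy lease such as 0x7FFFFFFE is treated exactly like any other finite lease, which matches the reported symptom of nodes continuing to depend on the DHCP server.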

There is no client-side or server-side configuration directive that could work around the responsible code behavior in ack_lease().