Bug 1258578

Summary: Discovered hosts fail to move to 'built' due to DHCP conflict
Product: Red Hat Satellite
Component: DHCP & DNS
Version: 6.1.0
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Target Milestone: Unspecified
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Reporter: David Critch <dcritch>
Assignee: Lukas Zapletal <lzap>
QA Contact: Sachin Ghai <sghai>
CC: bbuckingham, bkearney, cwelton, lzap, sghai
Keywords: Triaged
URL: http://projects.theforeman.org/issues/8727
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-12-15 09:19:25 UTC

Description David Critch 2015-08-31 16:50:09 UTC
Description of problem:
Hosts are failing to move from discovered to built due to a 409 DHCP reservation error:
2015-08-23 23:49:48 [I] Found pod1-controller1.cloud.practice.redhat.com
2015-08-23 23:49:48 [I] unattended: pod1-controller1.cloud.practice.redhat.com is Built!
2015-08-23 23:49:48 [W] DHCP records -b8:2a:72:d3:63:f0/10.12.32.106 already exists
2015-08-23 23:49:48 [W] Failed to set Build on pod1-controller1.cloud.practice.redhat.com: Validation failed: Conflict DHCP records -b8:2a:72:d3:63:f0/10.12.32.106 already exists
2015-08-23 23:49:48 [I] Completed 409 Conflict in 587ms (ActiveRecord: 27.9ms)

Version-Release number of selected component (if applicable):
katello-2.2.0.14-1.el7sat.noarch
foreman-1.7.2.33-1.el7sat.noarch

How reproducible:
Always

Steps to Reproduce:
1. Boot host to discovery mode
2. Click provision, and change IP from DHCP issued lease to desired static IP
3. Click Submit
4. Acknowledge DHCP conflict and click ACK
5. Host builds successfully through Satellite
6. The host then attempts to "inform Foreman we are built"


Actual results:
Host is not moved to 'built' and the following error occurs in the log:

2015-08-23 23:49:48 [I] Found pod1-controller1.cloud.practice.redhat.com
2015-08-23 23:49:48 [I] unattended: pod1-controller1.cloud.practice.redhat.com is Built!
2015-08-23 23:49:48 [W] DHCP records -b8:2a:72:d3:63:f0/10.12.32.106 already exists
2015-08-23 23:49:48 [W] Failed to set Build on pod1-controller1.cloud.practice.redhat.com: Validation failed: Conflict DHCP records -b8:2a:72:d3:63:f0/10.12.32.106 already exists
2015-08-23 23:49:48 [I] Completed 409 Conflict in 587ms (ActiveRecord: 27.9ms)

Expected results:
Host is moved to 'built'

Additional info:
Same error message occurs if you 'cancel build' after the OS is provisioned. Cleaning up dhcpd.leases and restarting dhcpd allows 'cancel build' to proceed.

Similarly, removing the bad dhcpd leases and restarting dhcpd before the host checks in to Foreman allows the host to move to 'built' properly.
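
For anyone applying the same workaround by hand, here is a minimal sketch of the "remove the bad leases" step. It assumes the standard ISC dhcpd.leases block syntax; the function name drop_leases_for_mac is hypothetical and not part of Foreman or dhcpd. It only transforms text, so stopping dhcpd, backing up the real leases file, and restarting dhcpd are still manual steps. This illustrates the workaround only and is not the fix from the pull request above.

```python
import re

# One "lease <ip> { ... }" block, assuming ISC dhcpd.leases syntax where the
# closing brace sits at the start of its own line.
LEASE_BLOCK = re.compile(r"^lease\s+(\S+)\s*\{.*?^\}\n?", re.DOTALL | re.MULTILINE)


def drop_leases_for_mac(leases_text, mac):
    """Return (new_text, removed_ips) with every lease block for `mac` removed.

    Hypothetical helper for illustration; host { ... } blocks are left alone.
    """
    needle = "hardware ethernet %s;" % mac.lower()
    removed = []

    def keep_or_drop(match):
        block = match.group(0)
        if needle in block.lower():
            removed.append(match.group(1))  # remember the IP of the dropped lease
            return ""
        return block

    return LEASE_BLOCK.sub(keep_or_drop, leases_text), removed


if __name__ == "__main__":
    sample = (
        "lease 192.168.100.10 {\n"
        "  binding state free;\n"
        "  hardware ethernet 52:54:00:6c:82:44;\n"
        "}\n"
        "lease 192.168.100.12 {\n"
        "  binding state active;\n"
        "  hardware ethernet 52:54:00:aa:bb:cc;\n"
        "}\n"
    )
    cleaned, removed = drop_leases_for_mac(sample, "52:54:00:6C:82:44")
    print(removed)  # ['192.168.100.10']
    print(cleaned)
```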

Thanks to Lukas, the following patch fixes the issue: https://github.com/theforeman/foreman/pull/2022/files

Comment 7 Bryan Kearney 2015-11-13 19:02:54 UTC
Upstream bug assigned to gsutclif

Comment 8 Lukas Zapletal 2015-11-16 13:04:49 UTC
Next errata please.

Comment 9 Sachin Ghai 2015-11-27 14:11:17 UTC
Ok. Verified with Sat 6.1.5 compose2 (Satellite-6.1.0-RHEL-7-20151125.0),
using a scratch build of the discovery image from brew, version foreman-discovery-image-3.0.5-2.iso.

Here is what I tried: I discovered a host whose MAC address already had an entry with some IP in the dhcpd.leases file.

The host was discovered with the same MAC and DHCP gave it the same IP, but while provisioning I changed the IP and submitted the edit_host form. Provisioning started successfully.

Here is a snippet from my dhcpd.leases file:

lease 192.168.100.10 {
  starts 4 2015/11/26 12:03:21;
  ends 5 2015/11/27 00:03:21;
  tstp 5 2015/11/27 00:03:21;
  cltt 4 2015/11/26 12:03:21;
  binding state free;
  hardware ethernet 52:54:00:6c:82:44;
  uid "\001RT\000l\202D";
}
lease 192.168.100.11 {
  starts 5 2015/11/27 03:26:22;
  ends 5 2015/11/27 15:26:22;
  tstp 5 2015/11/27 15:26:22;
  cltt 5 2015/11/27 03:26:22;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 52:54:00:6c:82:44;
}

lease 192.168.100.11 {
  starts 5 2015/11/27 14:00:30;
  ends 6 2015/11/28 02:00:30;
  cltt 5 2015/11/27 14:00:30;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 52:54:00:6c:82:44;
}
host mac5254006c8244.xxxxxcom {
  dynamic;
  hardware ethernet 52:54:00:6c:82:44;
  fixed-address 192.168.100.18;
        supersede server.filename = "pxelinux.0";
        supersede server.next-server = 0a:10:60:64;
        supersede host-name = "mac5254006c8244.xxx.xxxcom";


Lukas, could you please confirm whether this is the right way to reproduce the original issue?

Comment 10 Lukas Zapletal 2015-11-30 09:33:04 UTC
Yes, this is exactly it. You've correctly used the "uid" flag, which is sent by some BIOS systems. The other option to simulate this is "client-id", but it's essentially the same thing. We see one MAC address sharing two IPs, and this should not fail or issue any "Overwrite conflict" warning dialog.
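
To make the "one MAC address, two IPs" condition concrete, here is a rough sketch that groups lease blocks by MAC and decodes the escaped uid string from the snippet in comment 9. It assumes ISC dhcpd.leases syntax with octal escapes in quoted strings; ips_by_mac and decode_uid are hypothetical helper names, not Foreman or smart-proxy APIs. The decoded uid appears to be a one-byte hardware type (01, Ethernet) followed by the same MAC address.

```python
import re
from collections import defaultdict

# Lease blocks and their MAC line, assuming ISC dhcpd.leases syntax.
LEASE_BLOCK = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.DOTALL | re.MULTILINE)
MAC_LINE = re.compile(r"hardware ethernet\s+([0-9A-Fa-f:]+);")
# A uid token is either a \ooo octal escape or a (possibly escaped) literal character.
UID_TOKEN = re.compile(r"\\([0-7]{3})|\\?(.)", re.DOTALL)


def ips_by_mac(leases_text):
    """Map each MAC address to the set of IPs it has lease blocks for."""
    result = defaultdict(set)
    for ip, body in LEASE_BLOCK.findall(leases_text):
        mac = MAC_LINE.search(body)
        if mac:
            result[mac.group(1).lower()].add(ip)
    return dict(result)


def decode_uid(uid):
    """Turn an escaped uid string into colon-separated hex bytes."""
    data = [int(octal, 8) if octal else ord(char)
            for octal, char in UID_TOKEN.findall(uid)]
    return ":".join("%02x" % b for b in data)


if __name__ == "__main__":
    sample = r'''lease 192.168.100.10 {
  binding state free;
  hardware ethernet 52:54:00:6c:82:44;
  uid "\001RT\000l\202D";
}
lease 192.168.100.11 {
  binding state active;
  hardware ethernet 52:54:00:6c:82:44;
}
'''
    for mac, ips in ips_by_mac(sample).items():
        if len(ips) > 1:
            print(mac, "holds leases for", sorted(ips))
    print(decode_uid(r"\001RT\000l\202D"))  # 01:52:54:00:6c:82:44
```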

Comment 11 Sachin Ghai 2015-11-30 09:48:06 UTC
Thank you, Lukas. Based on comment 9 and comment 10, moving this to verified. Thanks.

Comment 13 errata-xmlrpc 2015-12-15 09:19:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2622