Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Race condition with NetworkManager on discovery image|
|Product:||Red Hat Satellite 6||Reporter:||David Critch <dcritch>|
|Component:||Discovery Image||Assignee:||Lukas Zapletal <lzap>|
|Status:||CLOSED ERRATA||QA Contact:||Sachin Ghai <sghai>|
|Version:||6.1.1||CC:||bbuckingham, chrobert, dcritch, hartsjc, kdixon, lzap, meeveret, mmccune, sghai, sthirugn|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2016-07-27 07:06:43 EDT||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description David Critch 2015-09-14 13:52:08 EDT
Created attachment 1073364 [details]
screenshot of discovery

Description of problem:
Essentially the same as https://bugzilla.redhat.com/show_bug.cgi?id=1227017 but that was specific to the OSP installer. This is in a lab within Red Hat. Note that this is on some older gear. I don't see the issue in another lab w/ more modern kit. It seems that DHCP is still coming up when the discoveryd process starts, so the process fails to resolve the foreman host. If I -HUP the discovery process in another terminal, the host properly resolves the foreman IP, reports in and is discovered.

Version-Release number of selected component (if applicable):
foreman-discovery-image-2.1.0-36.el7sat.noarch

How reproducible:
Always

Steps to Reproduce:
1. configure Satellite or capsule for discovery, with fdi.rootpw=PASSWD
2. boot host over PXE
3. host gets DHCP lease and gets TFTP file

Actual results:
4. host fails to resolve foreman IP. must kill -HUP discovery process in separate terminal

Expected results:
4. host checks in to foreman for discovery

Additional info:
this is in a lab within Red Hat and I can share some access/logs as needed.
Comment 2 David Critch 2015-10-05 14:08 EDT
Created attachment 1080015 [details]
journal from discovery boot

This is a dump of journalctl from the host once booted in discovery mode. You can see that the first couple of times, the host fails to send to foreman due to DNS issues. After a kill -HUP of the discovery process, you can see it register properly. Discovery is starting before DHCP is fully up, and can't resolve the foreman URL at that time. Eventually the host is on the network and can resolve foreman, but the discovery process never seems to learn that foreman is resolvable until the discovery daemon is restarted.
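For illustration only, the failure mode described above (a single early lookup that fails and is never retried) can be contrasted with a lookup that retries until the name resolves. This is a minimal sketch in Python, not the code of the discovery image; the function name `wait_for_resolution`, its parameters, and the injectable `resolver` and `sleep` arguments are all hypothetical.

```python
import socket
import time


def wait_for_resolution(hostname, attempts=10, delay=2.0,
                        resolver=socket.getaddrinfo, sleep=time.sleep):
    """Return the first resolved address for hostname, retrying on DNS failure.

    Hypothetical helper: raises socket.gaierror if the name still does not
    resolve after `attempts` tries.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # Retry the lookup; the name may only become resolvable
            # once DHCP has finished configuring the interface.
            results = resolver(hostname, None)
            # Each result is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return results[0][4][0]
        except socket.gaierror as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

Whether a retry alone is enough depends on the resolver picking up the DNS configuration that arrives with the lease, which this sketch does not address. The fix discussed later in this bug orders service startup after the network is online instead.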
Comment 4 Lukas Zapletal 2015-10-20 07:06:02 EDT
Hello David, I can confirm we encountered this kind of behaviour and it has been fixed upstream already. We are planning a discovery erratum in one or two months that will rebase the image and include this fix as well.
Comment 18 Lukas Zapletal 2015-12-08 10:25:40 EST
I can make you another build if you want and upload it for you. Just let me know on IRC.
Comment 20 David Critch 2015-12-11 13:24:57 EST
with the latest build, the discovery-register service doesn't start. attaching latest journalctl
Comment 21 David Critch 2015-12-11 13:25 EST
Created attachment 1104757 [details] journal output from foreman-discovery-image-2.1.1-1.el7sat.noarch.rpm
Comment 24 Lukas Zapletal 2015-12-17 06:14:35 EST
Yes, Brad, there is a pending patch we need to include. David, does the above link from comment 22 work? Anyway, the 6.1.5 errata is out and it contains a completely rebased image. It won't be compatible with OSP anymore, but this bug was filed against Satellite 6, so use it. We are tracking one additional race condition whose fix hasn't been merged upstream yet. I am attaching it to this BZ; the symptoms are similar (this time foreman-proxy is not started properly): http://projects.theforeman.org/issues/12429
Comment 26 Bryan Kearney 2016-02-24 18:10:31 EST
Moving to POST since upstream bug http://projects.theforeman.org/issues/12429 has been closed
-------------
Kamil Madac
Solution (workaround?) is to start foreman-proxy after NetworkManager-wait-online.service is ready. More in https://github.com/theforeman/foreman-discovery-image/pull/48
-------------
Kamil Madac
My fault. I forgot to replace both lines (Wants= and After=) in foreman-proxy.service (https://github.com/lzap/foreman-discovery-image/commit/c72e80902b4cd34c4b8369f3ec118b8ef7ac9bf6). Once I did it, provisioning works as expected.
-------------
Lukas Zapletal
Great, can you please confirm in the PR itself that the build I made works as expected? Or at least show us the patch you made on your own build. Thanks. It's the https://github.com/theforeman/foreman-discovery-image/pull/50
-------------
Anonymous
Applied in changeset commit:foreman-discovery-image|0c18ba2a6d04e5105db1e2085fe69f091b6922c7.
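Since the fix only took effect once both the Wants= and After= lines referenced NetworkManager-wait-online.service, a small check of a unit file's text can catch the half-applied change. The following is a sketch in Python under stated assumptions: the function name `missing_ordering` is hypothetical, and the unit text in the usage example is invented for illustration and is not the actual foreman-proxy.service from the image.

```python
def missing_ordering(unit_text, target="NetworkManager-wait-online.service",
                     directives=("Wants", "After")):
    """Return the directives from `directives` that do not list `target`."""
    found = set()
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers such as [Unit].
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # systemd dependency lists are whitespace-separated unit names.
        if key in directives and target in value.split():
            found.add(key)
    return [d for d in directives if d not in found]


# Hypothetical unit text where only After= was updated.
half_patched = """\
[Unit]
Description=Example proxy service
Wants=basic.target
After=basic.target NetworkManager-wait-online.service
"""

print(missing_ordering(half_patched))
```

For the half-patched text above, the function reports that Wants= still lacks the target, which is the situation described in the second upstream comment.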
Comment 28 Sachin Ghai 2016-04-15 02:23:23 EDT
@Lukas, please provide verification steps. I assume the host should have multiple interfaces to reproduce this? Please advise.
Comment 29 Lukas Zapletal 2016-04-15 03:45:37 EDT
QA steps: Simply verify that discovery works with one or multiple NICs in various environments. Also, if possible, simulate slow DHCP and verify that it starts correctly as well. You can easily simulate this by turning off the DHCP server on the network, waiting until the Welcome screen appears and then turning it back on. The background process should send the discovery request, and after a few seconds you should be able to refresh the screen. The status will likely be UNKNOWN - Use Refresh button to update info; this is expected.
Comment 30 Sachin Ghai 2016-04-18 06:17:23 EDT
Verified with sat6.2 beta snap8.2. I discovered a host with two NICs and tried to simulate the slow DHCP as suggested in comment 29. However, I'm not able to reproduce the reported issue. The host is discovered successfully and I can see it in the webUI.
Comment 31 Bryan Kearney 2016-07-27 07:06:43 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1501