Bug 1468557 - Discovery KExec does not work with Atomic Host 7 [NEEDINFO]
Status: NEW
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Discovery Image
Hardware: x86_64 Linux
Priority: high, Severity: high
Assigned To: Lukas Zapletal
Keywords: Triaged
Reported: 2017-07-07 08:06 EDT by Mihir Lele
Modified: 2018-06-15 18:37 EDT (History)
Type: Bug
lzap: needinfo? (gkonda)

Attachments
screengrab1 (12.57 KB, image/png)
2017-07-07 08:14 EDT, Mihir Lele
Description Mihir Lele 2017-07-07 08:06:24 EDT
Description of problem:

PXE-less provisioning through the Foreman Discovery Image (FDI) fails for "Red Hat Enterprise Linux Atomic Host 7".

Version-Release number of selected component (if applicable): 6.2.10

How reproducible: Always

Steps to Reproduce:
1. Discover a host using FDI. Image used by me and the Customer: foreman-discovery-image-3.1.1-22.iso
2. Go to discovered host, fill the host profile and hit "Submit".

Actual results:

Provisioning is not initiated for the host, and the host displays the "Discovery Status" page again, which is normally displayed after the host sends its facts to the Satellite.

Discovery Status page says:

"Status: N/A (use Status to update)"

Expected results:

The host should be put in build mode when we hit "Submit".

Additional info:

1) This issue is not observed for the RHEL base OS (it worked for me with RHEL 7.3).
2) I am attaching a screenshot of the "Discovery Status" page from the host.
Comment 1 Mihir Lele 2017-07-07 08:14 EDT
Created attachment 1295279
Comment 5 Lukas Zapletal 2017-07-18 12:06:04 EDT
Mihir, are you using the ONDEMAND download policy? There is a bug in Katello (not sure if it was fixed in Satellite) where kickstarts are not downloaded correctly with the ONDEMAND policy.
Comment 6 Mihir Lele 2017-07-18 13:45:20 EDT

I just logged in and cross-checked the download policy; it is "Immediate". (I am not sure what the download policy is at the customer end. I can check that and confirm.)
And like I said, I am able to provision the same kickstart using PXE. The issue is only observed when provisioning through FDI.
Comment 7 Lukas Zapletal 2017-07-21 11:25:38 EDT
I am able to reproduce this. Inspecting the kexec JSON shows that it has the correct info:

==> /var/log/foreman/production.log <==
2017-07-21 11:22:41 049a3bbd [app] [I] KEXEC JSON: {
 |   "kernel": "http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/vmlinuz",
 |   "initram": "http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/initrd.img",
 |   "append": "ks=http://older.home.lan:8000/unattended/provision?token=b51cbe25-8a2f-4ecf-adc4-389a900e63ed&static=yes inst.ks.sendmac ip= nameserver= ksdevice=bootif BOOTIF=00-52-54-00-e6-13-01 "
 | }
Comment 8 Lukas Zapletal 2017-07-21 11:38:21 EDT
I am able to kexec into the Anaconda Atomic installer manually:

wget http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/vmlinuz

wget http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/initrd.img

kexec -d --initrd=initrd.img --append="ks=http://older.home.lan:8000/unattended/provision?token=b51cbe25-8a2f-4ecf-adc4-389a900e63ed&static=yes inst.ks.sendmac ip= nameserver= ksdevice=bootif BOOTIF=00-52-54-00-e6-13-01" vmlinuz

It then fails to download the kickstart, but that is perhaps a misconfiguration in my environment.

Anyway, modify:


file and put this line

    Foreman::Logging.logger('app').info "KEXEC JSON: #{json}"

just before this one (line 57)

    old.becomes(Host::Discovered).kexec json

Restart httpd; you will then see the KEXEC JSON entry in the production log. There you can check the URLs and the append line. Please paste it here.
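
For anyone checking a pasted KEXEC JSON by hand, the review described above (are the kernel and initram URLs absolute, is there a kickstart argument, which arguments are empty) could be sketched in Python roughly as follows. This is only an illustration: the function name check_kexec_payload and the specific checks are assumptions, not part of Foreman or the discovery plugin; the sample payload is the one from comment 7.

```python
import json
from urllib.parse import urlparse

# Payload copied from the production.log excerpt in comment 7.
SAMPLE = """
{
  "kernel": "http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/vmlinuz",
  "initram": "http://older.home.lan/pulp/repos/MyOrg/Library/content/dist/rhel/atomic/7/7Server/x86_64/kickstart//images/pxeboot/initrd.img",
  "append": "ks=http://older.home.lan:8000/unattended/provision?token=b51cbe25-8a2f-4ecf-adc4-389a900e63ed&static=yes inst.ks.sendmac ip= nameserver= ksdevice=bootif BOOTIF=00-52-54-00-e6-13-01 "
}
"""


def check_kexec_payload(raw):
    """Hypothetical helper: report obvious problems in a KEXEC JSON string."""
    payload = json.loads(raw)
    problems = []

    # Both images must be absolute http(s) URLs for the FDI to fetch them.
    for key in ("kernel", "initram"):
        url = payload.get(key, "")
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{key}: not an absolute http(s) URL: {url!r}")

    # The append line is a whitespace-separated kernel command line.
    args = payload.get("append", "").split()
    if not any(arg.startswith(("ks=", "inst.ks=")) for arg in args):
        problems.append("append: no ks= or inst.ks= argument")

    # Arguments such as "ip=" carry no value; list them for manual review.
    empty_args = [arg for arg in args if arg.endswith("=")]
    return {"problems": problems, "empty_args": empty_args}


if __name__ == "__main__":
    print(check_kexec_payload(SAMPLE))
```

Empty arguments are reported separately rather than as errors, because whether an empty ip= or nameserver= is acceptable depends on the template and the network setup.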
Comment 10 Lukas Zapletal 2017-08-09 04:09:43 EDT
Need more info: I need to see the KEXEC JSON, see comment 8.
Comment 12 Lukas Zapletal 2017-08-10 08:56:28 EDT
The "ampersand" bug mentioned in the case was found in the Anaconda installer in RHEL 7.4 beta; it should be fixed in anaconda- When Atomic Anaconda boots up, check its version. Here is the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1443485

Anyway, with FDI 3.1.1 I see the same behavior: a reboot.

With FDI 3.4.1 (latest) I see it just freeze in a libvirt VM.

But when I add "nomodeset" to the Red Hat Kexec template (this was fixed in 6.3 but not in 6.2), FDI 3.4.1 works fine.


That's the first problem. Second, the DHCP boot mode of the network must be set to Static for kexec to work correctly there. Alternatively, the template must be edited to provide static NIC configuration.

I am now able to boot Anaconda correctly, though it errors out with "dracut timeout - starting timeout scripts".
Comment 13 Lukas Zapletal 2017-08-10 09:28:12 EDT
Scratch my note about DHCP boot mode, there is a static flag in the KS URL.

Anyway, I can get past the stuck kexec with the "nomodeset" option, but Anaconda still won't continue and errors out with a timeout error. Not sure why. Can you try to reproduce this?
Comment 14 Mihir Lele 2017-11-12 11:00:12 EST
My apologies for the delayed response.

I tried fdi-3.4.2 with the default kexec template as well as with the one in https://github.com/theforeman/foreman_discovery/pull/272/files.

But it didn't work for me.
Comment 20 Lukas Zapletal 2018-01-16 11:49:55 EST
Please locate the smart_proxy_discovery_image/power_api.rb file on the FDI image (you need a shell and GNU find) and patch it:


Then restart the proxy:

systemctl restart foreman-proxy

Then switch to console tty2 or run journalctl -f, and then perform the KExec from Satellite. It should now print the MD5 sums of the kernel and initial RAM disk and wait 90 seconds before performing the actual kexec command.

This is to rule out transmission errors.

Also, please try the 3.4.3-1 discovery image, which will be part of the 6.3 and 6.2.14 versions; it includes some improvements around logging as well.

Try to record the case with a screen recorder, as I am struggling to reproduce it. I have seen this once, but now I am unable to. Thanks.
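
To compare against the MD5 sums printed on the FDI console, the same files can be hashed on the Satellite side (or on any machine that downloaded them with wget). A minimal Python sketch, assuming the files are saved locally as vmlinuz and initrd.img; the helper name md5_of_file is hypothetical and this is not the patch referred to above:

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage (file names are assumptions): python md5check.py vmlinuz initrd.img
    for name in sys.argv[1:]:
        print(md5_of_file(name), name)
```

If the sums differ from what the FDI prints, the download on the discovered host is the likely culprit; if they match, transmission errors can be ruled out.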
Comment 37 Dan Stock 2018-05-25 04:06:11 EDT
I've been following this bug for a while, because we have the same problem.

Last week we updated to Satellite 6.3.1. One of the problems that we hoped the update would solve is this one.

I've double-checked all the configuration and recommendations for the installation.
The current Version in use is: 

I can say that the installation still stops with the same status error:
  Status: N/A (use Status to update)

Could anyone give me information about what is coming?
Are there any other solutions? I've thought about setting up a new segment that uses PXE, even though we use FDI for all our other installations and it would be a big change in our infrastructure.
Comment 38 Lukas Zapletal 2018-05-25 04:47:42 EDT
Dan, there was a lot of private conversation. In short, kickstart repositories need to have a .treeinfo file, which specifies a few things, most importantly:

mainimage = LiveOS/squashfs.img

This is expected to be present; this example is from the RHEL Server repo. The thing is, it is missing in the Atomic 7 repos at the moment, and therefore Anaconda won't load stage 2.

It looks like the Atomic team requires users to put this on the kernel command line:


But at some point, the downstream documentation was merged with the RHEL documentation, where this is no longer required. We are still investigating this.




to your KExec template (or PXE template). You can use templating ERB to find the path in a generic way (untested):

inst.stage2=<%= @host.operatingsystem.medium_uri(@host) %>

The reason this is taking so long is that several teams are involved: RCM (release engineering), Satellite engineering, Atomic engineering and docs. I will update as soon as I know where we are going to fix this problem (RCM/CDN, Satellite template, RHEL docs or a combination).
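
For readers who want to check their own synced kickstart repository for the missing .treeinfo entry described in this comment, a rough Python sketch follows. It only looks for a mainimage key in any section of the file. The helper name find_mainimage is hypothetical, and the two sample files are illustrative (the [stage2] section name and the [general] keys are assumptions, not copies of real repository content).

```python
import configparser

# Illustrative samples only; real .treeinfo files contain more sections.
WITH_STAGE2 = """
[general]
family = Red Hat Enterprise Linux
version = 7.3

[stage2]
mainimage = LiveOS/squashfs.img
"""

WITHOUT_STAGE2 = """
[general]
family = Red Hat Enterprise Linux Atomic Host
version = 7
"""


def find_mainimage(treeinfo_text):
    """Return (section, value) of the first mainimage key, or None."""
    # interpolation=None keeps literal '%' characters from raising errors;
    # strict=False tolerates duplicate keys or sections.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(treeinfo_text)
    for section in parser.sections():
        if parser.has_option(section, "mainimage"):
            return section, parser.get(section, "mainimage")
    return None


if __name__ == "__main__":
    print(find_mainimage(WITH_STAGE2))
    print(find_mainimage(WITHOUT_STAGE2))
```

A result of None for the Atomic repository's .treeinfo would be consistent with the explanation above; in that case the stage 2 location has to be supplied some other way, such as on the kernel command line.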
