On an overloaded machine, Puppet+Foreman's node.rb script hits its timeout, so puppet does not start running. Execution of '/etc/puppet/node.rb' returns 1 because of the timeout. As a result, puppet does not run automatically on the controller, which causes the HA deployment to fail. The default timeout is 10 seconds, which was very likely exceeded on the overloaded machine. Jiri was able to reproduce this bug by manually setting the timeout to 1 second locally. Workaround: start puppet manually.
A workaround that can prevent this from happening is to edit /etc/puppet/foreman.yaml and change the timeout from 10 to a higher number, e.g. 45 (the unit is seconds). The fix will need to happen upstream in puppet-foreman, to make the timeout value configurable from the installer; in rhel-osp-installer we'll then set a higher value.
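For anyone who wants to script the workaround rather than edit the file by hand, here is a minimal sketch. It assumes the timeout sits on a single line of the form `:timeout: 10` (a plain `timeout: 10` is accepted too); the exact layout of foreman.yaml on a given host may differ, and the helper name `set_timeout` is made up for illustration.

```python
import re

# Assumption: the timeout is stored on one line such as ":timeout: 10".
# A key without the leading colon ("timeout: 10") is accepted as well.
_TIMEOUT_LINE = re.compile(r"^([ \t]*:?timeout:[ \t]*)\d+([ \t]*)$", re.MULTILINE)


def set_timeout(yaml_text: str, seconds: int) -> str:
    """Return yaml_text with the timeout value replaced by `seconds`.

    Hypothetical helper: it edits the text only and never touches the
    file system, so the caller decides how to read and write the file.
    """
    new_text, count = _TIMEOUT_LINE.subn(
        lambda m: f"{m.group(1)}{seconds}{m.group(2)}", yaml_text
    )
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError(f"expected exactly one timeout line, found {count}")
    return new_text
```

To apply it on a host, read /etc/puppet/foreman.yaml, pass the contents through `set_timeout(contents, 45)`, and write the result back (keeping a backup copy first is a sensible precaution).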
3 pull requests submitted, addressing reviews now: https://github.com/theforeman/foreman-installer-staypuft/pull/134 https://github.com/theforeman/puppet-puppet/pull/236 https://github.com/theforeman/puppet-foreman/pull/289
Merged upstream. Foreman has adopted an increased default timeout of 60 seconds, so the first patch is not needed (we'll go with the new default). Only the second and third patches are necessary.
The actual issue is intermittent, but we can validate that the fix is in by checking /etc/puppet/foreman.yaml and seeing timeout set to 60.
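The check above can be automated with a small sketch like the following. It assumes the value appears on a line such as `:timeout: 60` (or `timeout: 60`); the function names `read_timeout` and `fix_is_present` are hypothetical and not part of any Foreman tooling.

```python
import re
from typing import Optional


def read_timeout(yaml_text: str) -> Optional[int]:
    """Return the timeout (seconds) found in the text, or None if absent.

    Assumption: the setting is written on its own line as ":timeout: N"
    or "timeout: N".
    """
    match = re.search(
        r"^[ \t]*:?timeout:[ \t]*(\d+)[ \t]*$", yaml_text, re.MULTILINE
    )
    return int(match.group(1)) if match else None


def fix_is_present(yaml_text: str, expected: int = 60) -> bool:
    """True when the configured timeout equals the expected new default."""
    return read_timeout(yaml_text) == expected
```

On a deployed host, the input would be the contents of /etc/puppet/foreman.yaml; a `False` result means the file still carries the old value or has no timeout line at all.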
Tested with foreman-installer-1.6.0-0.3.RC1.el7ost.noarch. The issue does not reproduce.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0641.html