Description of problem:
I was doing a RHV (self-hosted, 4 nodes) + CFME + OSE (1 controller + 2 workers) deployment. It failed while deploying RHV with the message:

Puppet run for heuristic-algo4.b.b puppet run reported as out of sync for the last 10 polls - something may have gone wrong

However, when I looked in /var/log/messages* on the host that failed, there were no puppet errors or failures. Note that I was using the custom naming scheme for the hosts. This probably did not contribute to the problem, but it was something I did differently.

Version-Release number of selected component (if applicable):
QCI-1.0-RHEL-7-20160819.t.0

How reproducible:
Don't know.

Steps to Reproduce:
1. Do a RHV (self-hosted, 4 nodes) + CFME + OSE (1 controller + 2 workers) deployment using the custom naming scheme.

Actual results:
Deploying a host failed with:
Puppet run for heuristic-algo4.b.b puppet run reported as out of sync for the last 10 polls - something may have gone wrong

Expected results:
The deployment succeeds.

Additional info:
We tried to resume the task, but it later timed out with the following error: ERF42-7017 [Foreman::Exception]: You've reached the timeout set for this action. If the action is still ongoing, you can click on the "Resume Deployment" button to continue.
I am seeing this repeatedly on my RHV self-hosted deployments. In my environment, the puppet run on the hypervisor system often takes an hour or more to complete. Once the puppet run has finished, resuming the Deploy Red Hat Virtualization task in dynflow and then resuming the Deploy task completes the deployment. I see three possible fixes: change the poll_intervals and attempts_before_next_interval values in server/app/lib/actions/fusor/host/wait_for_puppet.rb; tweak the Out of sync interval and Puppet interval settings in Administer > Settings > Puppet; or, at the very least, provide a more user-friendly way to resume the deployment than going into multiple dynflow tasks and resuming them in the right order.
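To get a feel for how those two values interact with an hour-long puppet run, here is a rough sketch of a stepped polling schedule. It assumes the poller waits poll_intervals[i] seconds between checks, moves to the next interval after attempts_before_next_interval checks, and then stays on the last interval. The function names and the numbers are made up for illustration. They are not the values in wait_for_puppet.rb, and the real action may behave differently.

```python
from itertools import islice


def poll_waits(poll_intervals, attempts_before_next_interval):
    """Yield the wait in seconds before each successive poll, without end.

    Assumed model: step to the next interval every
    attempts_before_next_interval polls, then stay on the last interval.
    """
    if not poll_intervals or poll_intervals[-1] <= 0:
        raise ValueError("need at least one interval, and a positive last one")
    if attempts_before_next_interval < 1:
        raise ValueError("attempts_before_next_interval must be >= 1")
    last = len(poll_intervals) - 1
    attempt = 0
    while True:
        step = min(attempt // attempts_before_next_interval, last)
        yield poll_intervals[step]
        attempt += 1


def total_wait(poll_intervals, attempts_before_next_interval, max_attempts):
    """Total seconds covered by the first max_attempts polls."""
    waits = poll_waits(poll_intervals, attempts_before_next_interval)
    return sum(islice(waits, max_attempts))


def attempts_needed(poll_intervals, attempts_before_next_interval, run_seconds):
    """Smallest number of polls whose combined wait covers run_seconds."""
    elapsed = 0
    waits = poll_waits(poll_intervals, attempts_before_next_interval)
    for count, wait in enumerate(waits, start=1):
        elapsed += wait
        if elapsed >= run_seconds:
            return count


if __name__ == "__main__":
    # Hypothetical schedule, chosen only to show the arithmetic.
    intervals = [30, 60, 120]
    per_step = 5
    print(total_wait(intervals, per_step, 10))
    print(attempts_needed(intervals, per_step, 3600))
```

Under this model, a short schedule with a low attempt limit gives up long before a one-hour run finishes, so either the intervals or the number of allowed attempts would have to grow to cover it.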
Fixed during 1.1 development work
This bug is now invalid because puppet is no longer part of our deployment. We use ansible instead, so I'm going to mark this as verified without any testing. If the ansible code shows similar behavior, I'll file a new bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:0335