Created attachment 923596 [details]
var log messages of Controller and compute

Description of problem:
Deployment stuck at 87%; the Neutron compute node hangs at 60%. See the attached screenshot for details. Rebooting the compute node did not resolve it.

Version-Release number of selected component (if applicable):
openstack-foreman-installer-2.0.16-1.el6ost.noarch

How reproducible:
Unsure; this is the first time it has happened.

Steps to Reproduce:
1. Simple Neutron deployment: 1 controller, 1 Neutron controller, 1 compute.
2.
3.

Actual results:
Deployment stuck at 87%; the compute host is stuck at 60%.

Expected results:
Should complete at 100%.

Additional info:
Attached /var/log/messages from the compute and controller nodes, plus journalctl output from the compute node. Now trying 'puppet agent -t'; will update with the output.
Created attachment 923597 [details] Compute1
Created attachment 923598 [details] Compute1
Created attachment 923599 [details] After running puppet-agent -t
Created attachment 923613 [details] journalctl -xn (on compute)
Relevant error from messages:

Aug 3 11:24:50 maca25400868096 puppet-agent[4302]: (/Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]/ensure) change from stopped to running failed: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.

journalctl:

Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Wrapped exception:
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Connection timed out - connect(2)
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Neutron::Server::Notifications/Nova_admin_tenant_id_setter[nova_admin_tenant_id]/ensure) change from absent to present failed: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Nova::Compute::Libvirt/Service[messagebus]/ensure) ensure changed 'stopped' to 'running'
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Wrapped exception:
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]/ensure) change from stopped to running failed: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
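The 'Connection timed out - connect(2)' from Nova_admin_tenant_id_setter suggests the node could not reach the Nova API when the Puppet run tried to look up the admin tenant. A minimal reachability check could look like the sketch below; the controller address is a placeholder (192.0.2.10 is a documentation address), not a value from this deployment:

```shell
#!/bin/sh
# Hypothetical check that this node can open a TCP connection to the Nova API.
# NOVA_HOST is a placeholder documentation address; override it with the real
# controller IP. 8774 is the conventional nova-api port.
NOVA_HOST=${NOVA_HOST:-192.0.2.10}
NOVA_PORT=${NOVA_PORT:-8774}
if timeout 5 bash -c "exec 3<>/dev/tcp/$NOVA_HOST/$NOVA_PORT" 2>/dev/null; then
    echo "nova API at $NOVA_HOST:$NOVA_PORT is reachable"
else
    echo "nova API at $NOVA_HOST:$NOVA_PORT is unreachable (matches the connect(2) timeout)"
fi
```

If this reports unreachable with the real controller address, a firewall rule or wrong endpoint in the answers file would explain the timeout above.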
Parts of this look very similar to the error in https://bugzilla.redhat.com/show_bug.cgi?id=1122693, so I am going to ask for the same information as in that BZ. Can you get the output of 'systemctl status openstack-nova-compute.service' and 'journalctl -xn', as suggested in the attached messages? Also, could you post the YAML for the node that had the problem (the controller and networker would be useful as well), plus information on the bridges created on the compute and network nodes? 'ifconfig' and 'ovs-vsctl show' output would be very helpful. I have not yet been able to reproduce this, so the more information you can provide, the better the chance I can verify whether it is a bug versus some kind of configuration issue.
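The requested outputs could be gathered in one pass with a sketch like the following; the output directory and file names are illustrative, only the commands themselves come from the request above:

```shell
#!/bin/sh
# Sketch: collect the diagnostics requested in the comment above into one
# directory. Directory and file names are made up for illustration.
out=nova-compute-diag
mkdir -p "$out"
# Service state and recent journal entries for the failing unit:
systemctl status openstack-nova-compute.service > "$out/nova-compute.status" 2>&1
journalctl -xn > "$out/journalctl-xn.txt" 2>&1
# Bridge/interface layout (run on the compute and network nodes):
ifconfig -a > "$out/ifconfig.txt" 2>&1
ovs-vsctl show > "$out/ovs-vsctl.txt" 2>&1
echo "collected diagnostics in $out/"
```

Errors from any missing command are captured into the corresponding file rather than aborting the run, so partial output is still attached even on a badly broken node.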
Sorry Jason, that setup is gone by now. I'm running this again during test day right now; I'll update here and try your suggestions if I hit the same problem again.
*** This bug has been marked as a duplicate of bug 1122693 ***