Bug 1126194 - [Rubygem-Staypuft]: Deployment stuck at 87% (Neutron compute stuck@60%)
Keywords:
Status: CLOSED DUPLICATE of bug 1122693
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-foreman-installer
Version: Foreman (RHEL 6)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: unspecified
Target Milestone: ga
Sub Component: Installer
Assignee: Jason Guiditta
QA Contact: Ami Jeain
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-08-03 12:25 UTC by Tzach Shefi
Modified: 2014-08-12 19:25 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-12 19:25:45 UTC
Target Upstream Version:
Embargoed:


Attachments
var log messages of Controller and compute (87.03 KB, application/x-gzip), 2014-08-03 12:25 UTC, Tzach Shefi
Compute1 (90.71 KB, image/png), 2014-08-03 12:27 UTC, Tzach Shefi
Compute1 (71.96 KB, image/png), 2014-08-03 12:27 UTC, Tzach Shefi
After running puppet-agent -t (53.19 KB, text/plain), 2014-08-03 12:28 UTC, Tzach Shefi
journalctl -xn (on compute) (10.48 MB, text/plain), 2014-08-03 12:46 UTC, Tzach Shefi

Description Tzach Shefi 2014-08-03 12:25:42 UTC
Created attachment 923596 [details]
var log messages of Controller and compute

Description of problem: Deployment is stuck at 87%; the Neutron compute node hangs at 60%. See the attached screenshots for more details. Rebooting the compute node didn't resolve it.


Version-Release number of selected component (if applicable):
openstack-foreman-installer-2.0.16-1.el6ost.noarch


How reproducible:
Unsure; this is the first time it has happened.

Steps to Reproduce:
1. Simple Neutron deployment: 1 controller, 1 neutron controller, 1 compute.

Actual results:
Deployment stuck at 87%, compute host is stuck at 60%.

Expected results:
Deployment should complete to 100%.

Additional info:
Attached /var/log/messages from the compute and controller nodes, plus journalctl output from the compute node.
Now trying 'puppet agent -t'; will update with the output.
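
For reference, the manual rerun mentioned above looks like this. A minimal sketch; the function wrapper is purely illustrative, and --detailed-exitcodes is an optional standard flag that distinguishes a failed run from a no-op run:

```shell
# Re-run the puppet agent by hand on the stuck node to capture the
# failure in the foreground (note: the command is 'puppet agent',
# not 'puppet-agent').
puppet_rerun() {
    puppet agent --test --detailed-exitcodes
}
```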

Comment 1 Tzach Shefi 2014-08-03 12:27:22 UTC
Created attachment 923597 [details]
Compute1

Comment 2 Tzach Shefi 2014-08-03 12:27:47 UTC
Created attachment 923598 [details]
Compute1

Comment 3 Tzach Shefi 2014-08-03 12:28:50 UTC
Created attachment 923599 [details]
After running puppet-agent -t

Comment 4 Tzach Shefi 2014-08-03 12:46:34 UTC
Created attachment 923613 [details]
journalctl -xn  (on compute)

Comment 5 Mike Burns 2014-08-04 12:04:12 UTC
relevant error from messages:

Aug  3 11:24:50 maca25400868096 puppet-agent[4302]: (/Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]/ensure) change from stopped to running failed: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.

journalctl:

Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Wrapped exception:
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: Connection timed out - connect(2)
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Neutron::Server::Notifications/Nova_admin_tenant_id_setter[nova_admin_tenant_id]/ensure) change from absent to present failed: Could not set 'present' on ensure: Connection timed out - connect(2) at 110:/etc/puppet/environments/production/modules/neutron/manifests/server/notifications.pp
Aug 03 11:29:54 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Nova::Compute::Libvirt/Service[messagebus]/ensure) ensure changed 'stopped' to 'running'
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Wrapped exception:
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
Aug 03 11:31:24 maca25400868096.example.com puppet-agent[2337]: (/Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]/ensure) change from stopped to running failed: Could not start Service[nova-compute]: Execution of '/usr/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed. See 'systemctl status openstack-nova-compute.service' and 'journalctl -xn' for details.
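
The failure above can be triaged exactly as the error message suggests. A minimal sketch, with the unit name taken from the log lines above; nothing else is assumed:

```shell
# The systemd unit that puppet could not start, per the messages log.
UNIT=openstack-nova-compute.service

inspect_unit() {
    # Why did the unit fail to start?
    systemctl status "$UNIT" --no-pager
    # Last 50 journal entries for this unit only.
    journalctl -u "$UNIT" -n 50 --no-pager
}
```

Running inspect_unit on the compute node should surface the same root cause that the puppet agent wrapped; the repeated 'Connection timed out - connect(2)' lines suggest the node cannot reach the nova API endpoint.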

Comment 6 Jason Guiditta 2014-08-05 21:19:41 UTC
Parts of this look very similar to the error in https://bugzilla.redhat.com/show_bug.cgi?id=1122693

I am going to ask for the same information as that BZ. Can you get 'systemctl status openstack-nova-network.service' and 'journalctl -xn' output, as suggested in the attached messages? Also, could you post the YAML for the node that had a problem (controller and networker would be useful as well) and information on the bridges created on the compute and network nodes? 'ifconfig' + 'ovs-vsctl show' would be very helpful. I have not yet been able to reproduce this, so the more information you can provide, the better the chance I can verify it is a bug vs. some kind of configuration issue.
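
The requested information can be gathered in one pass. A minimal sketch; the output file name is a hypothetical choice, and both nova units are checked since the comment names the network service while the logs above show the compute service failing:

```shell
# Collect the diagnostics requested above into one file.
OUT=staypuft-diagnostics.txt

gather_diagnostics() {
    {
        systemctl status openstack-nova-network.service --no-pager
        systemctl status openstack-nova-compute.service --no-pager
        journalctl -xn --no-pager
        ifconfig
        ovs-vsctl show
    } > "$OUT" 2>&1
}
```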

Comment 8 Tzach Shefi 2014-08-06 08:26:28 UTC
Sorry Jason, that setup is gone by now.
I'm running this again during the test day just now; I'll try your tips and update if I hit the same problem again.

Comment 9 Mike Burns 2014-08-12 19:25:45 UTC

*** This bug has been marked as a duplicate of bug 1122693 ***

