+++ This bug was initially created as a clone of Bug #1269610 +++

Description of problem:
Overcloud deployment fails with the nova instances in ERROR state. The neutron server log shows 'Failed to bind port' messages.

Version-Release number of selected component (if applicable):
instack-0.0.8-dev4.el7.centos.noarch
instack-undercloud-2.1.3-dev222.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. openstack overcloud deploy --templates

Actual results:
[stack@instack ~]$ openstack overcloud deploy --templates
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
Stack failed with status: Resource CREATE failed: resources.Controller: ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance c63bb1e9-ae43-48e2-8824-b9a8f1ccfc5e. Last exception: [u'Traceback (most recent call last): \n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1, Code: 500"

Expected results:
Stack is created successfully.

Additional info:
[stack@instack ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+----------+
| ID                                   | Name                    | Status | Task State | Power State | Networks |
+--------------------------------------+-------------------------+--------+------------+-------------+----------+
| e5c061ce-ad2e-4b62-8f13-8180776b5727 | overcloud-controller-0  | ERROR  | -          | NOSTATE     |          |
| b50ad55c-55ce-4916-888f-ed5d601f91d1 | overcloud-novacompute-0 | ERROR  | -          | NOSTATE     |          |
+--------------------------------------+-------------------------+--------+------------+-------------+----------+

The neutron logs show 'Failed to bind port' messages, which point to neutron-openvswitch-agent not running. The openvswitch agent log shows:

ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-ab5b4e8e-bf5c-444b-a74c-338fd54e3363 - - - - -] invalid literal for int() with base 10: 'None'
Agent terminated!

Attaching the openvswitch agent log.
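As background on how a dead L2 agent surfaces as 'Failed to bind port': ML2 will only bind a port when a live agent is reported on the host the instance was scheduled to, so nova retries the boot and eventually gives up. A minimal illustrative sketch of that chain, assuming hypothetical names rather than the actual ML2 driver code:

    # Illustrative sketch only; hypothetical names, not the real ML2 driver code.
    def try_bind_port(host, agents):
        # Mechanism drivers only bind a port when a live L2 agent exists
        # on the host nova scheduled the instance to.
        live = [a for a in agents if a['host'] == host and a['alive']]
        if not live:
            # neutron-server logs 'Failed to bind port', nova retries the boot,
            # and after 3 scheduling attempts the instance goes to ERROR.
            return False
        return True

    # With the ovs agent dead on the host, binding always fails:
    agents = [{'host': 'instack', 'agent_type': 'Open vSwitch agent', 'alive': False}]
    print(try_bind_port('instack', agents))  # -> False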
--- Additional comment from Marius Cornea on 2015-10-07 13:49:37 EDT ---

Following the docs at http://docs.openstack.org/developer/tripleo-docs/installation/installing.html

Installed repos: http://paste.openstack.org/show/475641/

[stack@instack ~]$ rpm -qa | grep neutron
python-neutron-8.0.0-dev362.el7.centos.noarch
python-neutronclient-3.1.1-dev7.el7.centos.noarch
openstack-neutron-openvswitch-8.0.0-dev362.el7.centos.noarch
openstack-neutron-8.0.0-dev362.el7.centos.noarch
openstack-neutron-ml2-8.0.0-dev362.el7.centos.noarch
openstack-neutron-common-8.0.0-dev362.el7.centos.noarch

--- Additional comment from John Trowbridge on 2015-10-08 09:47:49 EDT ---

I see one issue here, but I am not sure if it is the root cause. Those neutron packages are Mitaka. This is a problem with using the upstream docs for RDO; we need to fork the upstream docs so that we can point to Liberty trunk repos.

Could you see if you can reproduce this issue using the Liberty repos, i.e. http://trunk.rdoproject.org/centos7-liberty/ instead of http://trunk.rdoproject.org/centos7/?

--- Additional comment from Graeme Gillies on 2015-10-12 22:22:30 EDT ---

I can reproduce this problem in Liberty with the following packages:

python-neutron-7.0.0.0-rc2.dev21.el7.centos.noarch
python-neutronclient-3.1.1-dev1.el7.centos.noarch
openstack-neutron-ml2-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-common-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-openvswitch-7.0.0.0-rc2.dev21.el7.centos.noarch

Others are hitting this in RDO Liberty as well: https://bugs.launchpad.net/neutron/+bug/1494281

--- Additional comment from Marius Cornea on 2015-10-16 08:20:52 EDT ---

I've just hit this again after a couple of overcloud redeployments. The openvswitch agent eventually fails with "invalid literal for int() with base 10: 'None' Agent terminated!" and no further instances can be booted.

openstack-neutron-ml2-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-openvswitch-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-common-7.0.0.0-rc2.dev21.el7.centos.noarch
openstack-neutron-7.0.0.0-rc2.dev21.el7.centos.noarch
python-neutron-7.0.0.0-rc2.dev21.el7.centos.noarch
python-neutronclient-3.1.1-dev1.el7.centos.noarch

--- Additional comment from Marius Cornea on 2015-10-16 09:37:17 EDT ---

To reproduce this issue, restart the openvswitch agent right after the undercloud installation and the error should show up:

systemctl restart neutron-openvswitch-agent.service

--- Additional comment from Ihar Hrachyshka on 2015-10-19 08:18:02 EDT ---

The segment id is provided by neutron-server, which should allocate it for a port and push it into the agent. I suspect a configuration issue. Please attach the config and log files for the ovs agent and neutron-server.

--- Additional comment from Marius Cornea on 2015-10-19 14:23 EDT ---

Attached. Thanks.

--- Additional comment from Ihar Hrachyshka on 2015-10-20 08:00:17 EDT ---

I don't see all the config files that are read by neutron-server, specifically plugin.ini (which is probably ml2_conf.ini). I believe it's the issue known upstream: https://bugs.launchpad.net/neutron/+bug/1494281

I see that your neutron-server uses flat networking (as per the logs). This is probably what is broken in Liberty.

--- Additional comment from Marius Cornea on 2015-10-20 08:09 EDT ---

I attached the ml2_conf.ini. By default there is a single flat network that gets created. Note that the configuration is done by the installer, so if any config changes are required we'd probably want to track those in a separate BZ.

--- Additional comment from Ihar Hrachyshka on 2015-11-05 07:55:27 EST ---

The fix was merged upstream into the Liberty branch. We expect a new release of Neutron this week; once it's there, we'll rebase the package.
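Putting the comments above together: a flat network carries no segmentation ID, and the 'None' in the agent error suggests a missing ID is being stringified somewhere before an int() conversion; this would also explain why restarting the agent triggers it, if the agent restores port state from a text-valued store on startup (an assumption here, not confirmed in this thread). A minimal sketch of that failure mode, illustrative only and not the actual neutron code path:

    # Illustrative sketch of the failure mode, not the actual neutron code.
    # A flat network has no VLAN/tunnel segmentation ID, so the value is None.
    segmentation_id = None

    # If the value is round-tripped through a string store (e.g. written out
    # and read back as text), None comes back as the literal string 'None':
    stored = str(segmentation_id)   # -> 'None'

    # A naive conversion back then raises exactly the error in the agent log:
    try:
        int(stored)
    except ValueError as exc:
        print(exc)  # invalid literal for int() with base 10: 'None'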
I hit this on the undercloud after restarting neutron-openvswitch-agent.

Version:
openstack-neutron-common-7.0.0-4.el7ost.noarch
openstack-neutron-openvswitch-7.0.0-4.el7ost.noarch
openstack-neutron-ml2-7.0.0-4.el7ost.noarch
openstack-neutron-7.0.0-4.el7ost.noarch
python-neutron-7.0.0-4.el7ost.noarch
python-neutronclient-3.1.0-1.el7ost.noarch
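For reference, the guard such a fix needs is roughly the following shape; this is a hedged sketch with a hypothetical helper name, not the change actually merged upstream for https://bugs.launchpad.net/neutron/+bug/1494281:

    # Hedged sketch of a defensive conversion; hypothetical helper, not the
    # code merged upstream.
    def parse_segmentation_id(raw):
        # Treat a missing ID, or one stringified as 'None' (flat networks),
        # as "no segmentation ID" instead of crashing the agent.
        if raw is None or raw == 'None':
            return None
        return int(raw)

    print(parse_segmentation_id('None'))  # -> None (flat network, agent keeps running)
    print(parse_segmentation_id('42'))    # -> 42   (VLAN-backed network)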
Verified - fixed in:

[stack@instack ~]$ rpm -qa | grep neutron
openstack-neutron-7.0.0-5.el7ost.noarch
openstack-neutron-ml2-7.0.0-5.el7ost.noarch
openstack-neutron-openvswitch-7.0.0-5.el7ost.noarch
openstack-neutron-common-7.0.0-5.el7ost.noarch
python-neutron-7.0.0-5.el7ost.noarch
python-neutronclient-3.1.0-1.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html