Description of problem:
Testing in the scale lab, trying to boot 256 instances across 8 compute nodes concurrently. There are 16 L3 networks set up with nets, subnets, and routers. 190 of the VMs booted; the other 66 all failed with 'Port e0a326e6-8fa4-4a5b-8b07-8d1b9c2461a5 could not be found' (a different UUID in each case). The failures were spread across all of the hypervisors.

Version-Release number of selected component (if applicable):
openstack-ceilometer-api.noarch        2013.2.2-1.el6ost  @RHOS-4.0
openstack-ceilometer-central.noarch    2013.2.2-1.el6ost  @RHOS-4.0
openstack-ceilometer-collector.noarch  2013.2.2-1.el6ost  @RHOS-4.0
openstack-ceilometer-common.noarch     2013.2.2-1.el6ost  @RHOS-4.0
openstack-cinder.noarch                2013.2.2-1.el6ost  @RHOS-4.0
openstack-dashboard.noarch             2013.2.2-1.el6ost  @RHOS-4.0
openstack-dashboard-theme.noarch       2013.2.2-1.el6ost  @RHOS-4.0
openstack-glance.noarch                2013.2.2-2.el6ost  @RHOS-4.0
openstack-heat-api.noarch              2013.2.2-1.el6ost  @RHOS-4.0
openstack-heat-api-cfn.noarch          2013.2.2-1.el6ost  @RHOS-4.0
openstack-heat-api-cloudwatch.noarch   2013.2.2-1.el6ost  @RHOS-4.0
openstack-heat-common.noarch           2013.2.2-1.el6ost  @RHOS-4.0
openstack-heat-engine.noarch           2013.2.2-1.el6ost  @RHOS-4.0
openstack-keystone.noarch              2013.2.2-1.el6ost  @RHOS-4.0
openstack-neutron.noarch               2013.2.2-1.el6ost  @RHOS-4.0
openstack-neutron-openvswitch.noarch   2013.2.2-1.el6ost  @RHOS-4.0
openstack-nova-api.noarch              2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-cert.noarch             2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-common.noarch           2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-conductor.noarch        2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-console.noarch          2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-novncproxy.noarch       2013.2.2-2.el6ost  @RHOS-4.0
openstack-nova-scheduler.noarch        2013.2.2-2.el6ost  @RHOS-4.0

How reproducible:
Occurs every run.

Steps to Reproduce:
1. Fire up a large environment.
2. Create the networks and collect their UUIDs in a list.
3. Loop through the list, creating 16 VMs on each network (a minimal boot-loop sketch follows the traceback below).

Actual results:
190 successful, 66 failed.

Expected results:
All 256 instances boot.

Additional info:
fault | {u'message': u'Port 884dec0f-d106-4839-a197-3ac770c90081 could not be found', u'code': 500, u'details': u'
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 258, in decorated_function
    return function(self, context, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1630, in run_instance
    do_run_instance()
  File "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 246, in inner
    return f(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1629, in do_run_instance
    legacy_bdm_in_spec)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 968, in _run_instance
    notify("error", msg=unicode(e))  # notify that build failed
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 952, in _run_instance
    instance, image_meta, legacy_bdm_in_spec)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1091, in _build_instance
    filter_properties, bdms, legacy_bdm_in_spec)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1135, in _reschedule_or_error
    self._log_original_error(exc_info, instance_uuid)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1130, in _reschedule_or_error
    bdms, requested_networks)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1681, in _shutdown_instance
    self._try_deallocate_network(context, instance, requested_networks)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1641, in _try_deallocate_network
    self._set_instance_error_state(context, instance['uuid'])
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1636, in _try_deallocate_network
    self._deallocate_network(context, instance, requested_networks)
  File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1479, in _deallocate_network
    context, instance, requested_networks=requested_networks)
  File "/usr/lib/python2.6/site-packages/nova/network/neutronv2/api.py", line 421, in deallocate_for_instance
    port)
  File "/usr/lib/python2.6/site-packages/nova/network/neutronv2/api.py", line 414, in deallocate_for_instance
    neutron.delete_port(port)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 108, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 318, in delete_port
    return self.delete(self.port_path % (port))
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1179, in delete
    headers=headers, params=params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1168, in retry_request
    headers=headers, params=params)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1111, in do_request
    self._handle_fault_response(status_code, replybody)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1081, in _handle_fault_response
    exception_handler_v20(status_code, des_error_body)
  File "/usr/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 78, in exception_handler_v20
    raise ex
', u'created': u'2014-03-26T17:06:00Z'}
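For reference, the boot loop from the steps above looks roughly like the sketch below. It is only a sketch: the credentials, auth URL, image and flavor values are placeholders, and it assumes the Havana-era python-novaclient (v1_1) and python-neutronclient APIs rather than anything specific to this lab.

# Reproduction sketch only -- credentials, auth URL, IMAGE_UUID and FLAVOR_ID
# are placeholders, not values taken from this environment.
from neutronclient.v2_0 import client as neutron_client
from novaclient.v1_1 import client as nova_client

AUTH_URL = 'http://controller:5000/v2.0'
neutron = neutron_client.Client(username='admin', password='secret',
                                tenant_name='admin', auth_url=AUTH_URL)
nova = nova_client.Client('admin', 'secret', 'admin', auth_url=AUTH_URL)

IMAGE_UUID = '...'   # placeholder
FLAVOR_ID = '...'    # placeholder

# Step 2: collect the UUIDs of the pre-created tenant networks.
net_ids = [n['id'] for n in neutron.list_networks()['networks']
           if not n.get('router:external')]

# Step 3: loop through the list, booting 16 VMs on each network.
for net_id in net_ids:
    for i in range(16):
        nova.servers.create(name='scale-%s-%02d' % (net_id[:8], i),
                            image=IMAGE_UUID, flavor=FLAVOR_ID,
                            nics=[{'net-id': net_id}])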
Hi Mark, did you find the root cause of the problem?
This was probably fixed in Icehouse by the introduction of Nova notifications: https://blueprints.launchpad.net/neutron/+spec/nova-event-callback

These notifications let an instance proceed with startup only after its port is ready on the Neutron side, which makes booting reliable. FYI, I think it can't be fixed in Havana. Closing the bug as fixed.
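For completeness, on releases that do carry the callback code the handshake is enabled through configuration on both services. The options below are a sketch of the usual setup; the controller host name and timeout value are illustrative, not taken from this environment.

# neutron.conf (sketch)
[DEFAULT]
notify_nova_on_port_status_changes = True
notify_nova_on_port_data_changes = True
nova_url = http://controller:8774/v2

# nova.conf (sketch)
[DEFAULT]
vif_plugging_is_fatal = True
vif_plugging_timeout = 300

With vif_plugging_is_fatal set, a boot that never receives the network-vif-plugged event fails visibly instead of racing ahead of Neutron.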
We have booted 128 VMs on two compute nodes and no error was seen.

python-neutron-2014.1-35.el7ost.noarch
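A quick way to confirm that a run like this left nothing in ERROR is a short loop such as the sketch below (credentials and auth URL are placeholders; it assumes the same novaclient v1_1 API as the reproduction sketch above).

from novaclient.v1_1 import client as nova_client

nova = nova_client.Client('admin', 'secret', 'admin',
                          auth_url='http://controller:5000/v2.0')

servers = nova.servers.list()
errored = [s for s in servers if s.status == 'ERROR']
print('%d servers, %d in ERROR' % (len(servers), len(errored)))
for s in errored:
    # For the failures in this bug, the fault message is the
    # "Port ... could not be found" error shown above.
    fault = getattr(s, 'fault', {}) or {}
    print('%s: %s' % (s.id, fault.get('message')))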
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0848.html