openstack-nova: can't launch instances: Multiple security groups found matching 'default'. Use an ID to be more specific.

Environment:
openstack-nova-console-2014.2.1-7.el7ost.noarch
openstack-nova-conductor-2014.2.1-7.el7ost.noarch
openstack-nova-novncproxy-2014.2.1-7.el7ost.noarch
openstack-nova-scheduler-2014.2.1-7.el7ost.noarch
openstack-nova-api-2014.2.1-7.el7ost.noarch
openstack-nova-cert-2014.2.1-7.el7ost.noarch
openstack-nova-common-2014.2.1-7.el7ost.noarch

Steps to reproduce:
1. Attempt to launch instances.

Result:
Error: Failed to launch instance "f-a4b191a1-a5d5-4af0-8b75-bed28d0790b9": Please try again later [Error: No valid host was found. ].

Checking further, get this in the log:
Dec 17 12:47:08 maca25400868097 nova-compute: Traceback (most recent call last):
Dec 17 12:47:08 maca25400868097 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/hubs/poll.py", line 115, in wait
Dec 17 12:47:08 maca25400868097 nova-compute: listener.cb(fileno)
Dec 17 12:47:08 maca25400868097 nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 212, in main
Dec 17 12:47:08 maca25400868097 nova-compute: result = function(*args, **kwargs)
Dec 17 12:47:08 maca25400868097 nova-compute: File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1643, in _allocate_network_async
Dec 17 12:47:08 maca25400868097 nova-compute: dhcp_options=dhcp_options)
Dec 17 12:47:08 maca25400868097 nova-compute: File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 357, in allocate_for_instance
Dec 17 12:47:08 maca25400868097 nova-compute: security_group)
Dec 17 12:47:08 maca25400868097 nova-compute: NoUniqueMatch: Multiple security groups found matching 'default'. Use an ID to be more specific.
Dec 17 12:47:08 maca25400868097 nova-compute: Removing descriptor: 23

[root@maca25400702875 ~(openstack_admin)]# nova secgroup-list
+--------------------------------------+---------+-------------+
| Id                                   | Name    | Description |
+--------------------------------------+---------+-------------+
| 0ae22148-ffa9-4a25-a3c6-97f4ccaf462f | default | default     |
| 25eaeff9-1005-4f1f-ae99-eaf9faec144a | default | default     |

[root@maca25400702875 ~(openstack_admin)]# nova secgroup-delete 25eaeff9-1005-4f1f-ae99-eaf9faec144a
+--------------------------------------+---------+-------------+
| Id                                   | Name    | Description |
+--------------------------------------+---------+-------------+
| 25eaeff9-1005-4f1f-ae99-eaf9faec144a | default | default     |
+--------------------------------------+---------+-------------+

Even then unable to launch instances.

Expected result:
Should be able to launch instances.
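For anyone triaging a similar report, here is a minimal sketch of how one might check a dump of security groups for tenants that ended up with more than one group named 'default'. It works on plain dictionaries and does not call any OpenStack client. The function name `find_duplicate_defaults`, the dict keys, the tenant IDs, and the third group ID are assumptions made for illustration; only the first two group IDs come from the listing above.

```python
from collections import defaultdict


def find_duplicate_defaults(groups):
    """Return {tenant_id: [group ids]} for tenants with more than one 'default' group."""
    by_tenant = defaultdict(list)
    for group in groups:
        if group["name"] == "default":
            by_tenant[group["tenant_id"]].append(group["id"])
    # Keep only the tenants where the name is ambiguous.
    return {tenant: ids for tenant, ids in by_tenant.items() if len(ids) > 1}


# Hypothetical input: tenant IDs and the last group ID are made up.
groups = [
    {"id": "0ae22148-ffa9-4a25-a3c6-97f4ccaf462f", "name": "default", "tenant_id": "tenant-a"},
    {"id": "25eaeff9-1005-4f1f-ae99-eaf9faec144a", "name": "default", "tenant_id": "tenant-a"},
    {"id": "hypothetical-group-id", "name": "default", "tenant_id": "tenant-b"},
]

duplicates = find_duplicate_defaults(groups)
for tenant, ids in duplicates.items():
    print("tenant %s has %d 'default' groups: %s" % (tenant, len(ids), ", ".join(ids)))
```

Grouping by tenant matters here because every tenant is expected to have its own 'default' group, so an admin-wide listing with several rows named 'default' is not a problem by itself.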
Created attachment 970251
nova-compute logs from computes
I think this bug is known upstream and a fix is in progress. It's a race condition in which multiple default security groups can be created for a tenant.
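To make the race concrete, here is a minimal sketch of a check-then-create sequence with two requests interleaved by hand. It is not Neutron code: the in-memory list and the helper names `default_exists` and `create_default` are assumptions made for illustration. It only shows why two requests that both check before either one creates can leave a tenant with two 'default' groups.

```python
import itertools

_ids = itertools.count(1)  # hypothetical ID generator for the sketch


def default_exists(groups, tenant_id):
    """Check step: does this tenant already have a group named 'default'?"""
    return any(g["tenant_id"] == tenant_id and g["name"] == "default" for g in groups)


def create_default(groups, tenant_id):
    """Create step: add a 'default' group with no uniqueness check of its own."""
    groups.append({"id": "sg-%d" % next(_ids), "tenant_id": tenant_id, "name": "default"})


groups = []

# Requests A and B arrive at nearly the same time for the same tenant.
a_sees_default = default_exists(groups, "tenant-a")  # A checks: nothing there yet
b_sees_default = default_exists(groups, "tenant-a")  # B checks before A has created

if not a_sees_default:
    create_default(groups, "tenant-a")
if not b_sees_default:
    create_default(groups, "tenant-a")  # B creates a second 'default'

defaults = [g for g in groups if g["name"] == "default"]
print(len(defaults))
```

Because the check and the create are separate steps with nothing enforcing uniqueness in between, the outcome depends on timing, which would also fit the race being hard to reproduce on demand.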
(In reply to Maru Newby from comment #2)
> I think this bug is known upstream and a fix is in progress. It's a race
> condition in which multiple default security groups can be created for a
> tenant.

I'm going to move this bug over to Neutron based on the above comment.
Didn't reproduce with HAneutron deployment + GRE network:

Environment:
openstack-heat-api-2014.2.1-2.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-dashboard-theme-2014.2.1-3.el7ost.noarch
openstack-ceilometer-api-2014.2.1-1.el7ost.noarch
openstack-neutron-2014.2.1-2.el7ost.noarch
openstack-ceilometer-central-2014.2.1-1.el7ost.noarch
openstack-nova-console-2014.2.1-8.el7ost.noarch
openstack-keystone-2014.2.1-1.el7ost.noarch
openstack-cinder-2014.2.1-1.el7ost.noarch
openstack-heat-common-2014.2.1-2.el7ost.noarch
openstack-heat-api-cloudwatch-2014.2.1-2.el7ost.noarch
openstack-heat-api-cfn-2014.2.1-2.el7ost.noarch
openstack-dashboard-2014.2.1-3.el7ost.noarch
openstack-ceilometer-collector-2014.2.1-1.el7ost.noarch
openstack-selinux-0.6.5-1.el7ost.noarch
openstack-ceilometer-common-2014.2.1-1.el7ost.noarch
openstack-ceilometer-alarm-2014.2.1-1.el7ost.noarch
openstack-nova-conductor-2014.2.1-8.el7ost.noarch
openstack-nova-novncproxy-2014.2.1-8.el7ost.noarch
openstack-nova-scheduler-2014.2.1-8.el7ost.noarch
openstack-nova-api-2014.2.1-8.el7ost.noarch
openstack-nova-cert-2014.2.1-8.el7ost.noarch
openstack-neutron-ml2-2014.2.1-2.el7ost.noarch
openstack-heat-engine-2014.2.1-2.el7ost.noarch
python-django-openstack-auth-1.1.7-3.el7ost.noarch
redhat-access-plugin-openstack-6.0.2-0.el7ost.noarch
openstack-glance-2014.2.1-1.el7ost.noarch
openstack-ceilometer-notification-2014.2.1-1.el7ost.noarch
openstack-neutron-openvswitch-2014.2.1-2.el7ost.noarch
openstack-nova-common-2014.2.1-8.el7ost.noarch

[root@maca25400702875 ~(openstack_admin)]# nova secgroup-list
+--------------------------------------+---------+-------------+
| Id                                   | Name    | Description |
+--------------------------------------+---------+-------------+
| e8941af7-e0f9-4385-a7de-22b228f2a594 | default | default     |
+--------------------------------------+---------+-------------+
[root@maca25400702875 ~(openstack_admin)]#

Was also able to launch instances.
Didn't reproduce:

Environment:
openstack-puppet-modules-2014.2.8-1.el7ost.noarch
ruby193-rubygem-foreman_openstack_simplify-0.0.6-8.el7ost.noarch
openstack-foreman-installer-3.0.8-1.el7ost.noarch
rhel-osp-installer-0.5.4-1.el7ost.noarch
ruby193-rubygem-staypuft-0.5.9-1.el7ost.noarch
rhel-osp-installer-client-0.5.4-1.el7ost.noarch
The upstream (u/s) bug has not been resolved yet, and it looks like this race is not easily reproducible. We will wait for the u/s fix and then backport it if relevant.
It won't be easy to backport the u/s bug fix, since it includes a new db migration.
Livnat, please comment on how we're going to handle this in RHOS6 if the fix requires backporting db migrations. We could try to solve it without db migrations, though that would mean a downstream-only patch that diverges hugely from the upstream solution. We should consider all of that before targeting it to A1.
Reading the code, I think Nova will fail like that only if the user explicitly requested the default security group in their 'nova boot' request. Otherwise a default group should be assigned properly with no issue. Alex, is that the case for this bug?
This is the command I used:

nova boot --flavor 1 --key_name [key name as seen in 'nova keypair-list'] --image [image id as seen in 'glance image-list'] --nic net-id=[net id as seen in neutron net-list] <instance name>

The issue was that there were 2 sec groups named "default".
Moving to A2, as this won't make it in time for A1.
I tried applying locking on the Neutron side to see how it affects sec group performance. A Rally run showed it hits pretty hard: https://review.openstack.org/#/c/156596/ I am not sure we want to follow this route, then.
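As an illustration of the locking idea in general, and not of the patch under review, here is a minimal sketch that wraps the check-then-create sequence in a single in-process lock. The names `ensure_default` and `_create_lock` and the in-memory list are assumptions made for illustration; a real deployment with several API workers or hosts would need a lock shared across processes, which this sketch does not provide.

```python
import threading

_create_lock = threading.Lock()  # hypothetical lock guarding default-group creation
groups = []                      # stand-in for the security group table


def ensure_default(tenant_id):
    """Create the tenant's 'default' group only if it does not exist yet."""
    # Holding the lock across both the check and the create closes the race
    # window, at the cost of serializing every caller.
    with _create_lock:
        exists = any(
            g["tenant_id"] == tenant_id and g["name"] == "default" for g in groups
        )
        if not exists:
            groups.append({"tenant_id": tenant_id, "name": "default"})


threads = [threading.Thread(target=ensure_default, args=("tenant-a",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(groups))
```

Note that the single lock here serializes requests for all tenants, not just the one being created. That is one plausible way such an approach can cost throughput, though the sketch says nothing about what the Rally run actually measured.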
The fix for this bug is available in upstream Kilo. Backporting it requires a database change, which is not trivial to backport. https://review.openstack.org/#/c/142101/ In any case, once this bug occurs, manual intervention is needed to fix the database. Since this issue has been in the code base for at least 2 cycles, we decided to hold back on backporting the db fix.
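For readers wondering why a database change helps with this kind of race, here is a minimal sketch of letting the database enforce "at most one default group per tenant". It uses an in-memory SQLite database from the Python standard library. The table name `default_groups`, its columns, and the function `ensure_default` are assumptions made for illustration and are not taken from the upstream patch or the Neutron schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the primary key on tenant_id allows one row per tenant.
conn.execute(
    "CREATE TABLE default_groups (tenant_id TEXT PRIMARY KEY, group_id TEXT NOT NULL)"
)


def ensure_default(conn, tenant_id, new_group_id):
    """Try to register new_group_id as the tenant's default; return the winner's ID."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO default_groups (tenant_id, group_id) VALUES (?, ?)",
                (tenant_id, new_group_id),
            )
    except sqlite3.IntegrityError:
        # Another request already registered a default group for this tenant.
        pass
    row = conn.execute(
        "SELECT group_id FROM default_groups WHERE tenant_id = ?", (tenant_id,)
    ).fetchone()
    return row[0]


first = ensure_default(conn, "tenant-a", "sg-1")
second = ensure_default(conn, "tenant-a", "sg-2")  # loses and reuses the existing row
print(first, second)
```

With a constraint like this, the losing request fails at insert time instead of silently creating a duplicate, so no application-level lock is needed. Adding such a constraint to an existing deployment is a schema change, and any duplicates already present would have to be cleaned up first.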