Bug 971082
Summary: | Instance boot fails when security groups enabled and no network created | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Daniel Berrangé <berrange> | |
Component: | openstack-nova | Assignee: | Brent Eagles <beagles> | |
Status: | CLOSED ERRATA | QA Contact: | Ami Jeain <ajeain> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | unspecified | CC: | afazekas, apevec, beagles, bperkins, chrisw, dallan, jkt, jliberma, jturner, mlopes, ndipanov, sclewis, xqueralt, yeylon | |
Target Milestone: | Upstream M3 | |||
Target Release: | 4.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | openstack-nova-2013.2-0.23.rc1.el6ost | Doc Type: | Bug Fix | |
Doc Text: |
Prior to this update, security group checks run against instances without networks would result in 0 matches and rejection. As a result, booting an instance would fail if there were no configured networks.
With this update, security group checks do not run if there are no configured networks, and it is now possible to create instances without networks.
|
Story Points: | --- | |
Clone Of: | ||||
: | 981028 (view as bug list) | Environment: | ||
Last Closed: | 2013-12-20 00:04:43 UTC | Type: | Bug | |
Bug Blocks: | 981028 |
Description
Daniel Berrangé
2013-06-05 15:53:24 UTC
Reproduced on a fresh install, after reboot:

DEBUG: quantumclient.client REQ: curl -i http://192.168.129.3:9696/v2.0/security-groups.json -X GET -H "User-Agent: python-quantumclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: c44d9611508b435ea4f570253eeb27bf"
DEBUG: quantumclient.client RESP:{'date': 'Wed, 05 Jun 2013 17:37:59 GMT', 'status': '200', 'content-length': '23', 'content-type': 'application/json; charset=UTF-8', 'content-location': u'http://192.168.129.3:9696/v2.0/security-groups.json'} {"security_groups": []}

In ovs_quantum db:

mysql> select * from securitygroups;
Empty set (0.00 sec)

When is the default group created then?

Alan answered the following questions:
- nova.conf security_group_api = quantum (not nova)
- nova secgroup-list is null list like quantum security-group-list
- the db is truly empty (no default security group)

Some relevant code snippets:

quantum/plugins/openvswitch/ovs_quantum_plugin.py::create_network()
...
self._ensure_default_security_group(context, tenant_id)

The default security group is created lazily.

And in response to #c1, Dan, the default security group is also built when a new security group is created:

quantum/db/securitygroups_db.py::create_security_group()
...
if not default_sg:
    self._ensure_default_security_group(context, tenant_id)

So I believe the issue is at launch time, in nova. Nova should not look for a security group if there is no network.

(In reply to Chris Wright from comment #4)
> So I believe the issue is at launch time, in nova. Nova should not look for
> a security group if there is no network.
Should this be moved to openstack-nova then?

(In reply to Daniel Berrange from comment #0)
> After uploading an image to glance, I am unable to start a VM. Nova compute
> logs an error
What were your steps exactly? Docs[1] actually assume the network is created first, and quantum net-create would trigger creation of the default security group.
[1] http://docs.openstack.org/trunk/openstack-network/admin/content/basic_workflow_with_nova.html

The exact set of commands was:

# packstack --allinone
# . keystonerc_admin
# glance image-create --name f17 --disk-format qcow2 --container-format bare --file /root/f17-x86_64-openstack-sda.qcow2 --is-public True
# nova boot --flavor m1.tiny --image f17 f17demo1

I did not attempt to create any network, nor specify any --nic when booting the guest.

The Quantum security groups are for quantum ports. A port can only be part of a Quantum network. When a nic is added there should be a security group attached to the VM. Can you please clarify why this is a problem?

(In reply to Gary Kotton from comment #8)
> The Quantum security groups are for quantum ports. A port can only be part
> of a Quantum network.
> When a nic is added there should be a security group attached to the VM.
> Can you please clarfiy why this is a problem.
The above commands & stack trace show the problem. Booting an instance with the above sequence of commands shouldn't result in a stack trace about a missing security group.

*** Bug 967283 has been marked as a duplicate of this bug. ***

Reading through the thread, based on analysis by cdub in Comment #4 this looks like an issue in openstack-nova. Moving to that component and assigning to a nova engineer to look at it more closely.

In short... If a quantum network is created and you launch an instance on that network, the default security group should be created lazily. But if you launch an instance without a network, the default security group is not created by quantum. In this case, nova should probably either not launch the VM at all (because what use is a VM without a network?) or just skip associating the VM with a security group completely.
Also, if it is the case that this only happens when launching an instance before a quantum network is created, I think this can be easily worked around by telling users "prior to launching an instance, please create a quantum network for the instance to use". This can be part of the users guide. If that workaround is accurate, this is not a RHOS 3.0 blocker and we can push it off. I'd like confirmation from cdub and danpb before we do that though.

I don't think the proposed workaround is sufficient. I also see this error when booting an instance in a second tenant. I can reproduce this error by...
1. installing packstack
2. create two tenants, each with a user and role
3. import an image into glance
4. as first tenant user, create a network and boot an instance (works no errors)
5. as second tenant user, create a network and boot an instance --> FAILs with "Security group default not found."

> 1. installing packstack
> 2. create two tenants, each with a user and role
> 3. import an image into glance
> 4. as first tenant user, create a network and boot an instance (works no
> errors)
> 5. as second tenant user, create a network and boot an instance --> FAILs
> with "Security group default not found."
Ok, so the lazy creation of default security group is only happening for the first quantum network or tenant, and not for the second. That makes this bug a bit higher in severity.
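The lazy creation discussed in this thread can be modeled with a small in-memory sketch. This is not quantum's code: the class `ToySecurityGroupStore`, its storage layout, and every method signature are hypothetical, and only the name `_ensure_default_security_group` is taken from the snippets quoted earlier. It models the per-tenant behavior the commenters expected (a default group appears the first time a tenant creates a network or a security group), not the second-tenant failure reported above.

```python
class SecurityGroupNotFound(Exception):
    """Raised when a named security group does not exist for a tenant."""


class ToySecurityGroupStore:
    """Hypothetical in-memory stand-in for the security group database."""

    def __init__(self):
        # Maps tenant_id -> set of security group names.
        self._groups = {}

    def _ensure_default_security_group(self, tenant_id):
        # Create the tenant's "default" group only if it is missing.
        self._groups.setdefault(tenant_id, set()).add("default")

    def create_network(self, tenant_id, name):
        # Creating a network lazily triggers creation of the default group.
        self._ensure_default_security_group(tenant_id)
        return {"tenant_id": tenant_id, "name": name}

    def create_security_group(self, tenant_id, name):
        # Creating any group also ensures the default group exists.
        self._ensure_default_security_group(tenant_id)
        self._groups[tenant_id].add(name)

    def list_security_groups(self, tenant_id):
        return sorted(self._groups.get(tenant_id, set()))

    def get_security_group(self, tenant_id, name):
        # A lookup before any network or group exists fails for "default".
        if name not in self._groups.get(tenant_id, set()):
            raise SecurityGroupNotFound("Security group %s not found." % name)
        return name


if __name__ == "__main__":
    store = ToySecurityGroupStore()
    print(store.list_security_groups("tenant-a"))  # nothing created yet
    store.create_network("tenant-a", "net1")
    print(store.list_security_groups("tenant-a"))
```

In this model a caller that looks up "default" for a tenant before that tenant has created anything gets `SecurityGroupNotFound`, which is the situation the original report ran into when booting without a network.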
Information note: The check that is causing the issue is called from _validate_and_provision_instance in compute/api.py. If no security group or default security group is specified, the code throws this exception simply because it checks for the security group before it does the network check. So the lazy initialization makes the order of checks a little more fragile.
Personal observations: I'm mulling this one over... a hack might be to "eat the exception" if the security group is default. Sure does feel like a hack.

You could also reverse the checks. Would that be bad?

I don't think security groups are meant to affect the ability to check for the existence of a network. As long as it does not violate the notion of security group checks in this context, that's preferable to swallowing the exception.

Comment 16 is not valid for this case; the api handles the "default" security group case and it throws further on.

I was thinking about my patch and got to wondering if it was actually a little silly. It still will throw an exception and the VM will not be created, but the error information in the nova show output will be "NoValidHost"... which I think is similar to what happens if there is no network available when using nova-network (not 100% sure but I vaguely recall something like that). While it is better than throwing a bogus exception, does this actually fix the bug in the minds of all?

verified:
1. created an image
2. neutron net-list
[root@cougar14 ~(keystone_admin)]#
3. booted that image (saw there is only a default secgroup and didn't create any network)
4. no error during the boot:

# nova boot --flavor m1.tiny --image f18 f18demo1
+--------------------------------------+--------------------------------------+
| Property                             | Value                                |
+--------------------------------------+--------------------------------------+
| OS-EXT-STS:task_state                | scheduling                           |
| image                                | f18                                  |
| OS-EXT-STS:vm_state                  | building                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000001                    |
| OS-SRV-USG:launched_at               | None                                 |
| flavor                               | m1.tiny                              |
| id                                   | 9e7e0929-e02e-429c-8125-445b4d0556a4 |
| security_groups                      | [{u'name': u'default'}]              |
| user_id                              | c6311b5310ce4306831c4673b79a452a     |
| OS-DCF:diskConfig                    | MANUAL                               |
| accessIPv4                           |                                      |
| accessIPv6                           |                                      |
| progress                             | 0                                    |
| OS-EXT-STS:power_state               | 0                                    |
| OS-EXT-AZ:availability_zone          | nova                                 |
| config_drive                         |                                      |
| status                               | BUILD                                |
| updated                              | 2013-10-11T05:13:52Z                 |
| hostId                               |                                      |
| OS-EXT-SRV-ATTR:host                 | None                                 |
| OS-SRV-USG:terminated_at             | None                                 |
| key_name                             | None                                 |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | None                                 |
| name                                 | f18demo1                             |
| adminPass                            | aiDMWTLn8Psa                         |
| tenant_id                            | a20e6029a126406ca896fbdf0ac7a353     |
| created                              | 2013-10-11T05:13:51Z                 |
| os-extended-volumes:volumes_attached | []                                   |
| metadata                             | {}                                   |
+--------------------------------------+--------------------------------------+

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHEA-2013-1859.html
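
The ordering problem discussed in this bug, and the behavior stated in the Doc Text (security group checks do not run if there are no configured networks), can be illustrated with a short sketch. This is an assumption-laden model, not the nova patch: the function `validate_boot_request`, its arguments, and the exception class are all hypothetical and do not correspond to the real `_validate_and_provision_instance` signature.

```python
class SecurityGroupNotFound(Exception):
    """Raised when a requested security group does not exist."""


def validate_boot_request(requested_groups, existing_groups, networks):
    """Return the security groups to attach to a new instance.

    Hypothetical model of the fixed ordering: the network check comes
    first, and the security group check is skipped entirely when the
    tenant has no configured networks.
    """
    if not networks:
        # No network means no port, so there is nothing to attach a
        # security group to; do not look the groups up at all.
        return []

    missing = [g for g in requested_groups if g not in existing_groups]
    if missing:
        raise SecurityGroupNotFound(
            "Security group %s not found." % missing[0])
    return list(requested_groups)


if __name__ == "__main__":
    # Fresh install: no networks and no lazily created default group.
    print(validate_boot_request(["default"], existing_groups=[], networks=[]))
```

With the checks in the opposite order, the fresh-install call above would raise `SecurityGroupNotFound` even though no network, and therefore no port, is involved.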