Bug 1083675

Summary: nova-network and nova-compute die when there is a network without VlanID specified
Product: Red Hat OpenStack
Reporter: Jaroslav Henner <jhenner>
Component: openstack-nova
Assignee: Brent Eagles <beagles>
Status: CLOSED NOTABUG
QA Contact: Ami Jeain <ajeain>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 3.0
CC: jhenner, ndipanov, vpopovic, yeylon
Target Milestone: ---
Keywords: ZStream
Target Release: 4.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-23 14:11:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description    Flags
logs           none

Description Jaroslav Henner 2014-04-02 17:21:55 UTC
Created attachment 881912
logs

Description of problem:
When switching from the FlatDHCP manager to the Vlan manager, an existing network may contain None in VlanID. Nova then fails to _restart_ _after_ booting the VM, which obscures the root cause of the problem a bit.


Version-Release number of selected component (if applicable):
openstack-nova-api.noarch             2013.2.2-2.el6ost      @puddle
openstack-nova-cert.noarch            2013.2.2-2.el6ost      @puddle
openstack-nova-common.noarch          2013.2.2-2.el6ost      @puddle
openstack-nova-compute.noarch         2013.2.2-2.el6ost      @puddle
openstack-nova-conductor.noarch       2013.2.2-2.el6ost      @puddle
openstack-nova-console.noarch         2013.2.2-2.el6ost      @puddle
openstack-nova-network.noarch         2013.2.2-2.el6ost      @puddle
openstack-nova-novncproxy.noarch      2013.2.2-2.el6ost      @puddle
openstack-nova-objectstore.noarch     2013.2.2-2.el6ost      @puddle
openstack-nova-scheduler.noarch       2013.2.2-2.el6ost      @puddle
python-nova.noarch                    2013.2.2-2.el6ost      @puddle
python-novaclient.noarch              1:2.15.0-3.el6ost      @puddle


How reproducible:
always

Steps to Reproduce:
# openstack-config --set /etc/nova/nova.conf DEFAULT network_manager nova.network.manager.FlatDHCPManager
# openstack-service reload nova
# openstack-service status nova
openstack-nova-api (pid  29952) is running...
openstack-nova-cert (pid  29967) is running...
openstack-nova-compute (pid  29987) is running...
openstack-nova-conductor (pid  30025) is running...
openstack-nova-consoleauth (pid  30040) is running...
openstack-nova-network (pid  30105) is running...
openstack-nova-novncproxy (pid  30120) is running...
openstack-nova-objectstore (pid  30141) is running...
openstack-nova-scheduler (pid  30158) is running...

# # To keep the reproducer simple, make sure there are no networks:
# nova-manage network list
id   	IPv4              	IPv6           	start address  	DNS1           	DNS2           	VlanID         	project        	uuid           
No networks found

# # Create a network that will have VlanID set to None:
# nova network-create novanetwork --fixed-range-v4 192.168.32.0/22 

# nova network-show novanetwork
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| bridge              | br100                                |
| bridge_interface    | lo                                   |
| broadcast           | 192.168.35.255                       |
| cidr                | 192.168.32.0/22                      |
| cidr_v6             | -                                    |
| created_at          | 2014-04-02T16:48:15.000000           |
| deleted             | 0                                    |
| deleted_at          | -                                    |
| dhcp_start          | 192.168.32.2                         |
| dns1                | 8.8.4.4                              |
| dns2                | -                                    |
| gateway             | 192.168.32.1                         |
| gateway_v6          | -                                    |
| host                | -                                    |
| id                  | 49c58014-d3e0-4d3b-88e6-15d1971bc1da |
| injected            | False                                |
| label               | novanetwork                          |
| multi_host          | False                                |
| netmask             | 255.255.252.0                        |
| netmask_v6          | -                                    |
| priority            | -                                    |
| project_id          | -                                    |
| rxtx_base           | -                                    |
| updated_at          | -                                    |
| vlan                | -                                    |
| vpn_private_address | -                                    |
| vpn_public_address  | -                                    |
| vpn_public_port     | -                                    |
+---------------------+--------------------------------------+


# # Switch to VlanManager:
# openstack-config --set /etc/nova/nova.conf DEFAULT network_manager nova.network.manager.VlanManager
# openstack-config --set /etc/nova/nova.conf DEFAULT vlan_interface eth0
# openstack-service reload nova
...
# openstack-service status nova
openstack-nova-api (pid  31551) is running...
openstack-nova-cert (pid  31566) is running...
openstack-nova-compute (pid  31591) is running...
openstack-nova-conductor (pid  31607) is running...
openstack-nova-consoleauth (pid  31628) is running...
openstack-nova-network (pid  31687) is running...
openstack-nova-novncproxy (pid  31702) is running...
openstack-nova-objectstore (pid  31725) is running...
openstack-nova-scheduler (pid  31742) is running...

# . keystonerc_admin 
# nova boot --image cirros-0.3.1-x86_64-uec --flavor m1.tiny foo_vm
# nova show foo_vm
...
error
....

# openstack-service status nova
openstack-nova-api (pid  31965) is running...
openstack-nova-cert (pid  31980) is running...
openstack-nova-compute (pid  32005) is running...
openstack-nova-conductor (pid  32038) is running...
openstack-nova-consoleauth (pid  32093) is running...
openstack-nova-network (pid  32108) is running...
openstack-nova-novncproxy (pid  32127) is running...
openstack-nova-objectstore (pid  32146) is running...
openstack-nova-scheduler (pid  32174) is running...

# openstack-service restart nova
Stopping openstack-nova-api: [  OK  ]
Starting openstack-nova-api: [  OK  ]
Stopping openstack-nova-cert: [  OK  ]
Starting openstack-nova-cert: [  OK  ]
Stopping openstack-nova-compute: [  OK  ]
Starting openstack-nova-compute: [  OK  ]
Stopping openstack-nova-conductor: [  OK  ]
Starting openstack-nova-conductor: [  OK  ]
Stopping openstack-nova-consoleauth: [  OK  ]
Starting openstack-nova-consoleauth: [  OK  ]
Stopping openstack-nova-network: [  OK  ]
Starting openstack-nova-network: [  OK  ]
Stopping openstack-nova-novncproxy: [  OK  ]
Starting openstack-nova-novncproxy: [  OK  ]
Stopping openstack-nova-objectstore: [  OK  ]
Starting openstack-nova-objectstore: [  OK  ]
Stopping openstack-nova-scheduler: [  OK  ]
Starting openstack-nova-scheduler: [  OK  ]

# openstack-service status nova
openstack-nova-api (pid  504) is running...
openstack-nova-cert (pid  519) is running...
openstack-nova-compute dead but pid file exists
openstack-nova-conductor (pid  577) is running...
openstack-nova-consoleauth (pid  592) is running...
openstack-nova-network (pid  648) is running...
openstack-nova-novncproxy (pid  663) is running...
openstack-nova-objectstore (pid  686) is running...
openstack-nova-scheduler (pid  701) is running...

# openstack-service status nova
openstack-nova-api (pid  504) is running...
openstack-nova-cert (pid  519) is running...
openstack-nova-compute dead but pid file exists
openstack-nova-conductor (pid  577) is running...
openstack-nova-consoleauth (pid  592) is running...
openstack-nova-network dead but pid file exists
openstack-nova-novncproxy (pid  663) is running...
openstack-nova-objectstore (pid  686) is running...
openstack-nova-scheduler (pid  701) is running...


Actual results:
1) nova fails to boot the VM, and the VM is stuck because there is no running nova to delete it.

2) nova-network fails to start if a network whose VlanID is not set is associated.

3) To remove a VM, one has to switch back to FlatDHCP, reload the services and then _delete_ the VM. Note that nova-compute may still fail to start, but the VM can be deleted.

Neither disassociating the network nor removing it seems to help. There seems to be no way to modify the VlanID of the network.


Expected results:
A VlanID of None in any network should not make nova die, and it should not prevent the VM from being deleted.

Nova should report errors in the log and in the `nova boot` command output.
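
To illustrate the expected behaviour, here is a minimal Python sketch of a check that reports a network without a VlanID instead of letting the service exit. It is not nova code: `InvalidNetworkForManager`, `find_networks_missing_vlan` and `check_networks_for_vlan_manager` are hypothetical names, and the network records are plain dicts that only mimic the `label`, `cidr` and `vlan` fields shown by `nova network-show` above.

```python
import logging

LOG = logging.getLogger(__name__)


class InvalidNetworkForManager(Exception):
    """Hypothetical error for a network record the active manager cannot use."""


def find_networks_missing_vlan(networks):
    """Return the labels of the network records whose 'vlan' field is None."""
    return [net["label"] for net in networks if net.get("vlan") is None]


def check_networks_for_vlan_manager(networks):
    """Log every offending network, then raise one descriptive error.

    A caller could catch the error and surface its message to the user
    rather than letting the process terminate.
    """
    missing = find_networks_missing_vlan(networks)
    for label in missing:
        LOG.error("Network %s has no VlanID; VlanManager cannot use it", label)
    if missing:
        raise InvalidNetworkForManager(
            "networks without VlanID: " + ", ".join(missing)
        )


# Illustrative record modelled on the network created in the reproducer.
networks = [
    {"label": "novanetwork", "cidr": "192.168.32.0/22", "vlan": None},
]
try:
    check_networks_for_vlan_manager(networks)
except InvalidNetworkForManager as exc:
    print("error:", exc)
```

The point of the sketch is only the shape of the behaviour: the bad record is detected up front, logged, and turned into an error the operator can see.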


Additional info:

Comment 1 Brent Eagles 2014-05-23 19:27:49 UTC
Switching network managers while active networks for the previous network manager still exist has probably never worked. IIRC this was not considered a bug upstream when I brought it up, and I think you can probably still delete the network with nova-manage. As a proper workflow exists (delete your networks first, then change the network manager) and nova-manage should be able to rescue you, this is either a low priority or a non-bug. As an aside, getting nova to ignore networks that are improperly defined for the current manager would probably end up being a rather extensive change. Can you try using nova-manage to delete the bogus network definition and we'll go from there?
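
The workflow described in this comment (delete the networks first, then change the network manager) can be sketched as an ordered command plan. This is only an illustration under assumptions: `plan_manager_switch` is a hypothetical helper that builds the command lines without running them, and the `--fixed_range` option of `nova-manage network delete` is assumed here and should be checked against the installed nova version. The `openstack-config` and `openstack-service` invocations are the ones used in the reproducer.

```python
def plan_manager_switch(existing_cidrs, new_manager,
                        conf_path="/etc/nova/nova.conf"):
    """Return the commands, in order, for a clean network manager switch.

    Nothing is executed; the caller decides how (and whether) to run them.
    """
    commands = []
    # 1. Delete every network created under the previous manager.
    #    The --fixed_range flag is an assumption; verify it locally.
    for cidr in existing_cidrs:
        commands.append(
            ["nova-manage", "network", "delete", "--fixed_range=" + cidr]
        )
    # 2. Only then point nova at the new network manager.
    commands.append(
        ["openstack-config", "--set", conf_path,
         "DEFAULT", "network_manager", new_manager]
    )
    # 3. Restart the services so the new manager takes effect.
    commands.append(["openstack-service", "restart", "nova"])
    return commands


for argv in plan_manager_switch(
    ["192.168.32.0/22"], "nova.network.manager.VlanManager"
):
    print(" ".join(argv))
```

Keeping the plan as data makes the ordering explicit: the configuration change never appears before the deletions.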

Comment 2 Russell Bryant 2014-06-23 14:11:48 UTC
(In reply to Brent Eagles from comment #1)

Based on these comments, I'm going to close this as not a bug.  Let us know if you need further assistance.  Thanks!