Bug 1010941
Summary: | neutron-ovs-cleanup deletes the port that nova-compute has plugged into a tap interface
---|---
Product: | Red Hat OpenStack
Reporter: | Jian Wen <wenjianhn>
Component: | openstack-neutron
Assignee: | Solly Ross <sross>
Status: | CLOSED ERRATA
QA Contact: | Ofer Blaut <oblaut>
Severity: | unspecified
Priority: | unspecified
Version: | 3.0
CC: | ajeain, apevec, chrisw, dallan, ddomingo, hateya, mlopes, ndipanov, oblaut, sclewis, sross, twilson, xqueralt, yeylon
Target Milestone: | beta
Target Release: | 4.0
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | openstack-neutron-2013.2-3.el6
Doc Type: | Bug Fix
Type: | Bug
Clones: | 1022578 (view as bug list)
Bug Blocks: | 1022578
Last Closed: | 2013-12-20 00:24:30 UTC

Doc Text:

> The asynchronous operation of service startup meant that neutron-ovs-cleanup could finish its run after nova-compute had already started. Consequently, devices necessary for proper function were deleted. With this fix, neutron-ovs-cleanup now blocks while it runs, ensuring there is no interference with nova-compute port creation.
Description (Jian Wen, 2013-09-23 10:38:44 UTC)
---

So, the cleanup agent is already set at priority 97, which should run before nova-compute's priority of 98. Perhaps we should have the process block until the cleanup is finished? I'll look into it.

---

@apevec: you were on the original bug's discussion. Is there a reason that we shouldn't make the init script block, since the process isn't actually a daemon?

---

Yes, the cleanup init script should not exit before the cleanup is done; otherwise you run into race conditions.

---

So, we should make it block? The wording in your response was a bit unclear. Currently, the process is run with the `daemon` tool, which makes it run in the background in a non-blocking way. I'm thinking we just need to remove the `daemon` part from the init script, so that

> daemon --user neutron $exec --log-file /var/log/$proj/ovs-cleanup.log --config-file /etc/$proj/$proj.conf --config-file $config &>/dev/null

becomes

> runuser -s /bin/bash $user -c "neutron $exec --log-file /var/log/$proj/ovs-cleanup.log --config-file /etc/$proj/$proj.conf --config-file $config &>/dev/null"

---

Sorry if I was unclear, I left the implementation details to you :) Looking back at the original BZ, it looks like Garry switched to using "daemon" (the first version of the script didn't have it) only for the --user option. So replacing it with "runuser" is correct; just in your comment 5 it should be

> runuser -s /bin/bash neutron -c "$exec ..."

Also note that a recent change in the Havana Neutron packages added support for dist.conf (http://pkgs.fedoraproject.org/cgit/openstack-neutron.git/commit/?h=el6-havana&id=afcc20396ac906b29dcd1ee2cb32138e4eab59ce), where the -c "$exec..." from above would break due to quoting from ${configs[@]/#/--config-file }:

    + runuser -s /bin/bash neutron -c '/usr/bin/neutron-ovs-cleanup --log-file /var/log/neutron/ovs-cleanup.log --config-file /usr/share/neutron/neutron-dist.conf' '--config-file /etc/neutron/neutron.conf' '--config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini &>/dev/null'
    runuser: unrecognized option '--config-file /etc/neutron/neutron.conf'

BTW, this should be fixed for RHOS 3.0 too, so I'm adding the 3.0z flag.

---

@oblaut: just FYI, the script priority here is correct; the issue is that the cleanup tool is being run as a daemon, which means the init script exits before the cleanup is finished, so nova can launch while the actual cleanup is still running.

---

@oblaut: I think the bug may actually be a race condition. It depends on whether the daemonized version of the cleanup utility finishes before the nova daemon starts, so reproducing it may be difficult, unfortunately.

---

I have followed the steps, but the VM has no traffic.

OVS cleanup logs:

    2013-10-31 09:11:17 INFO [quantum.common.config] Logging enabled!
    2013-10-31 09:11:17 INFO [quantum.agent.ovs_cleanup_util] Cleaning br-int
    2013-10-31 09:11:18 INFO [quantum.agent.ovs_cleanup_util] OVS cleanup completed successfully
    2013-10-31 12:40:42 INFO [quantum.common.config] Logging enabled!
    2013-10-31 12:40:43 INFO [quantum.agent.ovs_cleanup_util] Cleaning br-int
    2013-10-31 12:40:45 INFO [quantum.agent.ovs_cleanup_util] Delete qvo2ed438a8-a5
    2013-10-31 12:40:46 INFO [quantum.agent.ovs_cleanup_util] Delete qvo4f81529e-a9
    2013-10-31 12:40:47 INFO [quantum.agent.ovs_cleanup_util] Delete qvoc2cdc2cc-2c
    2013-10-31 12:40:47 INFO [quantum.agent.ovs_cleanup_util] OVS cleanup completed successfully
    2013-10-31 12:52:36 INFO [quantum.common.config] Logging enabled!
    2013-10-31 12:52:37 INFO [quantum.agent.ovs_cleanup_util] Cleaning br-int
    2013-10-31 12:52:37 INFO [quantum.agent.ovs_cleanup_util] OVS cleanup completed successfully

OVS agent logs:

    2013-10-31 11:03:14 INFO [quantum.openstack.common.rpc.impl_qpid] Connected to AMQP server on 10.35.160.17:5672
    2013-10-31 11:03:15 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port c2cdc2cc-2c42-43f5-9f3f-05702625746d added
    2013-10-31 11:03:15 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port c2cdc2cc-2c42-43f5-9f3f-05702625746d updated. Details: {u'admin_state_up': True, u'network_id': u'b7f821d3-b437-400e-b294-e43aa1330184', u'segmentation_id': 201, u'physical_network': u'inter-vlan', u'device': u'c2cdc2cc-2c42-43f5-9f3f-05702625746d', u'port_id': u'c2cdc2cc-2c42-43f5-9f3f-05702625746d', u'network_type': u'vlan'}
    2013-10-31 11:03:15 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 1 as local vlan for net-id=b7f821d3-b437-400e-b294-e43aa1330184
    2013-10-31 11:03:16 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port 2ed438a8-a585-40d8-8df2-f89ccd610858 added
    2013-10-31 11:03:16 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port 2ed438a8-a585-40d8-8df2-f89ccd610858 updated. Details: {u'admin_state_up': True, u'network_id': u'a5fb3fe2-a1bb-439b-95c2-a69b844cc185', u'segmentation_id': 202, u'physical_network': u'inter-vlan', u'device': u'2ed438a8-a585-40d8-8df2-f89ccd610858', u'port_id': u'2ed438a8-a585-40d8-8df2-f89ccd610858', u'network_type': u'vlan'}
    2013-10-31 11:03:16 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Assigning 2 as local vlan for net-id=a5fb3fe2-a1bb-439b-95c2-a69b844cc185
    2013-10-31 11:03:16 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port 4f81529e-a9a9-467a-b6a3-c18b8d86a7f8 added
    2013-10-31 11:03:17 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Port 4f81529e-a9a9-467a-b6a3-c18b8d86a7f8 updated. Details: {u'admin_state_up': True, u'network_id': u'b7f821d3-b437-400e-b294-e43aa1330184', u'segmentation_id': 201, u'physical_network': u'inter-vlan', u'device': u'4f81529e-a9a9-467a-b6a3-c18b8d86a7f8', u'port_id': u'4f81529e-a9a9-467a-b6a3-c18b8d86a7f8', u'network_type': u'vlan'}
    2013-10-31 11:40:18 INFO [quantum.agent.securitygroups_rpc] Security group member updated [u'9e9f2c59-b3bf-4d1e-be0b-135f9f944e4e']
    2013-10-31 11:40:18 INFO [quantum.agent.securitygroups_rpc] Refresh firewall rules
    2013-10-31 11:40:18 INFO [quantum.agent.securitygroups_rpc] Provider rule updated
    2013-10-31 11:40:18 INFO [quantum.agent.securitygroups_rpc] Refresh firewall rules
    2013-10-31 11:52:00 INFO [quantum.agent.securitygroups_rpc] Security group member updated [u'9e9f2c59-b3bf-4d1e-be0b-135f9f944e4e']
    2013-10-31 11:52:00 INFO [quantum.agent.securitygroups_rpc] Refresh firewall rules
    2013-10-31 11:52:00 INFO [quantum.agent.securitygroups_rpc] Security group member updated [u'9e9f2c59-b3bf-4d1e-be0b-135f9f944e4e']
    2013-10-31 11:52:00 INFO [quantum.agent.securitygroups_rpc] Refresh firewall rules
    2013-10-31 12:31:35 INFO [quantum.agent.securitygroups_rpc] Security group rule updated [u'9e9f2c59-b3bf-4d1e-be0b-135f9f944e4e']
    2013-10-31 12:31:35 INFO [quantum.agent.securitygroups_rpc] Refresh firewall rules
    2013-10-31 12:41:07 INFO [quantum.common.config] Logging enabled!
    2013-10-31 12:41:08 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Mapping physical network inter-vlan to bridge br-eth3
    2013-10-31 12:41:09 INFO [quantum.openstack.common.rpc.impl_qpid] Connected to AMQP server on 10.35.160.17:5672
    2013-10-31 12:41:10 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Agent initialized successfully, now running...
    2013-10-31 12:41:10 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Agent out of sync with plugin!
    2013-10-31 12:41:10 INFO [quantum.openstack.common.rpc.impl_qpid] Connected to AMQP server on 10.35.160.17:5672
    2013-10-31 12:54:30 INFO [quantum.common.config] Logging enabled!
    2013-10-31 12:54:30 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Mapping physical network inter-vlan to bridge br-eth3
    2013-10-31 12:54:32 INFO [quantum.openstack.common.rpc.impl_qpid] Connected to AMQP server on 10.35.160.17:5672
    2013-10-31 12:54:32 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Agent initialized successfully, now running...
    2013-10-31 12:54:32 INFO [quantum.plugins.openvswitch.agent.ovs_quantum_agent] Agent out of sync with plugin!
    2013-10-31 12:54:32 INFO [quantum.openstack.common.rpc.impl_qpid] Connected to AMQP server on 10.35.160.17:5672

---

Wrong bug; I meant to update https://bugzilla.redhat.com/show_bug.cgi?id=1022578

---

Tested with openstack-neutron-2013.2-5.el6ost.noarch.

The reproduction steps are:

1. stop ovs-agent
2. stop nova-compute
3. run cleanup
4. start ovs-agent
5. start nova-compute

Only after restarting nova-compute are the interfaces back.

Before the cleanup:

    [root@puma34 ~]# ifconfig | grep a67
    qbra67f01b4-4d Link encap:Ethernet HWaddr BA:79:83:BB:F0:52
    qvba67f01b4-4d Link encap:Ethernet HWaddr BA:79:83:BB:F0:52
    qvoa67f01b4-4d Link encap:Ethernet HWaddr A2:C6:ED:E0:E3:EC
    tapa67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E

After the cleanup:

    [root@puma34 ~]# ifconfig | grep a67
    qbra67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E
    tapa67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E
    [root@puma34 ~]# service neutron-openvswitch-agent start
    Starting neutron-openvswitch-agent: [ OK ]
    [root@puma34 ~]# ifconfig | grep a67
    qbra67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E
    tapa67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E
    [root@puma34 ~]# service openstack-nova-compute start
    Starting openstack-nova-compute: [ OK ]
    [root@puma34 ~]# ifconfig | grep a67
    qbra67f01b4-4d Link encap:Ethernet HWaddr 2A:94:86:48:2A:28
    qvba67f01b4-4d Link encap:Ethernet HWaddr 2A:94:86:48:2A:28
    qvoa67f01b4-4d Link encap:Ethernet HWaddr E6:24:4F:AE:B9:0A
    tapa67f01b4-4d Link encap:Ethernet HWaddr FE:16:3E:41:81:8E

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html
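For reference, the runuser quoting problem discussed in the comments can be demonstrated without neutron or root access. This is a minimal sketch, assuming `bash -c` as a stand-in for `runuser -s /bin/bash neutron -c` (both expect a single command string after `-c`) and `echo` as a stand-in for the real neutron-ovs-cleanup binary so the script runs anywhere; the paths are illustrative.

```shell
#!/bin/bash
# Sketch of the quoting pitfall behind the "unrecognized option" error.

exec_cmd=/usr/bin/neutron-ovs-cleanup
configs=(/usr/share/neutron/neutron-dist.conf /etc/neutron/neutron.conf)

# Broken shape: expanding the array OUTSIDE the -c string makes each
# "--config-file <path>" a separate argument. runuser then parses those
# arguments as its own options and fails with "unrecognized option"
# (with bash -c they would merely become positional parameters):
#
#   runuser -s /bin/bash neutron -c "$exec_cmd" "${configs[@]/#/--config-file }"

# Working shape: splice the options into ONE string first, then hand that
# single string to -c. ${configs[*]/#/--config-file } prefixes every
# element and joins them with spaces inside one word.
cmd="$exec_cmd ${configs[*]/#/--config-file }"
bash -c "echo $cmd"
```

The key design point is that `-c` consumes exactly one argument; everything the cleanup tool needs, including the per-plugin `--config-file` options, has to be flattened into that one string before `runuser` ever sees it.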