Bug 1457358 - neutronovsagent container on compute node have forever restarting state after deployment of overcloud
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Linux
Target Milestone: ga
Target Release: 12.0 (Pike)
Assignee: Brent Eagles
QA Contact: Toni Freger
Docs Contact: Andrew Burden
Duplicates: 1470682
Depends On: 1433535
Reported: 2017-05-31 14:48 UTC by Artem Hrechanychenko
Modified: 2018-02-05 19:07 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-7.0.0-0.20170901051303.0rc1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2017-12-13 21:29:42 UTC
Target Upstream Version:


System ID Private Priority Status Summary Last Updated
Launchpad 1699261 0 None None None 2017-06-20 16:38:54 UTC
OpenStack gerrit 499137 0 None MERGED container ovs-agent, ensure br-ex exists 2020-09-17 01:22:53 UTC
Red Hat Product Errata RHEA-2017:3462 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 12.0 Enhancement Advisory 2018-02-16 01:43:25 UTC

Description Artem Hrechanychenko 2017-05-31 14:48:28 UTC
Description of problem:
The neutronovsagent container on the compute node stays in a restarting state indefinitely after the overcloud is deployed.

[heat-admin@overcloud-compute-0 ~]$ sudo docker ps
CONTAINER ID        IMAGE                                                                               COMMAND             CREATED             STATUS                         PORTS               NAMES
0e6121bbe0c9   "kolla_start"       25 minutes ago      Restarting (0) 2 minutes ago                       neutronovsagent
92d0d3dcb950                "kolla_start"       25 minutes ago      Up 25 minutes                                      novacompute
01a185801b7e                "kolla_start"       33 minutes ago      Up 33 minutes                                      nova_libvirt

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Follow http://etherpad.corp.redhat.com/testing-osp12-containers and use RHEL 7.4 to create the VM infrastructure via infrared: --image-url http://download-node-02.eng.bos.redhat.com/brewroot/packages/rhel-guest-image/7.4/135/images/rhel-guest-image-7.4-135.x86_64.qcow2

2. Before deploying the overcloud, apply the workarounds for:
  1) https://bugzilla.redhat.com/show_bug.cgi?id=1448482
  2) https://bugzilla.redhat.com/show_bug.cgi?id=1450370
  3) https://bugzilla.redhat.com/show_bug.cgi?id=1452082
  4) https://bugzilla.redhat.com/show_bug.cgi?id=1455348

3. Deploy the overcloud:
source /home/stack/stackrc && openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates --libvirt-type kvm  -e /home/stack/nodes_data.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker-osp12.yaml --log-file overcloud_deployment_0.log

Actual results:
0e6121bbe0c9   "kolla_start"       25 minutes ago      Restarting (0) 2 minutes ago                       neutronovsagent

Expected results:
The neutronovsagent container is in the "Up" state.

Additional info:
From the docker logs of the container:

INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/neutron
INFO:__main__:Setting permission for /var/log/neutron/neutron-openvswitch-agent.log
Running command: '/usr/bin/neutron-openvswitch-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini --config-file /etc/neutron/plugins/ml2/ml2_conf.ini'
Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
Could not load neutron.openstack.common.notifier.rpc_notifier

Comment 1 Red Hat Bugzilla Rules Engine 2017-05-31 14:48:35 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 2 Martin André 2017-05-31 15:22:15 UTC
Checking the ovs agent logs in /var/log/containers/neutron/neutron-openvswitch-agent.log gives more info:

2017-05-31 10:22:44.302 24231 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-460c7f8f-1b03-4546-a873-ce8843df941d - - - - -] Mapping physical network datacentre to bridge br-ex
2017-05-31 10:22:44.302 24231 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-460c7f8f-1b03-4546-a873-ce8843df941d - - - - -] Bridge br-ex for physical network datacentre does not exist. Agent terminated!
2017-05-31 10:22:44.303 24231 ERROR ryu.lib.hub [req-460c7f8f-1b03-4546-a873-ce8843df941d - - - - -] hub: uncaught exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_ryuapp.py", line 40, in agent_main_wrapper
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2167, in main
    agent = OVSNeutronAgent(bridge_classes, cfg.CONF)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 183, in __init__
  File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1096, in setup_physical_bridges
SystemExit: 1
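
The failure above comes down to a bridge mapping that names a bridge which is not present on the host. The following is a minimal Python sketch of that kind of check, not the agent's actual code. The helper names `parse_bridge_mappings` and `missing_bridges` are hypothetical, and the list of existing bridges is passed in by the caller (on a real node it could be taken from `ovs-vsctl list-br`).

```python
def parse_bridge_mappings(value):
    """Parse a 'physnet:bridge,physnet2:bridge2' string into a dict."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing comma
        physnet, sep, bridge = entry.partition(":")
        if not sep or not physnet.strip() or not bridge.strip():
            raise ValueError("invalid mapping entry: %r" % entry)
        mappings[physnet.strip()] = bridge.strip()
    return mappings


def missing_bridges(mappings, existing_bridges):
    """Return sorted (physnet, bridge) pairs whose bridge is not on the host."""
    existing = set(existing_bridges)
    return sorted(
        (physnet, bridge)
        for physnet, bridge in mappings.items()
        if bridge not in existing
    )
```

With the illustrative mapping string "datacentre:br-ex,tenant:br-isolated" and a host that only has br-int and br-isolated, `missing_bridges` reports the datacentre/br-ex pair, which matches the error in the log.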

Comment 3 Alexander Chuzhoy 2017-05-31 17:52:15 UTC

Comment 4 Dan Prince 2017-06-01 20:56:11 UTC
The bridge is normally created by os-net-config, although I think the puppet-vswitch module can also create it if os-net-config hasn't done so first.

You can look in /var/lib/heat-config/heat-config-script/ and find the os-net-config heat script that would have been used to configure the bridge during provisioning. What does this script say?

Comment 5 Dan Prince 2017-06-01 21:10:23 UTC
A couple more things masking the issue here are that /etc/os-net-config/config.json seems to get overwritten by the old element. See here: 


This is not directly related to this bug, but I think it could confuse the question of how things are wired up.

Comment 6 Attila Fazekas 2017-06-13 12:50:14 UTC
I had a vxlan tenant-network, non-DVR setup. I am not supposed to have br-ex on the compute node, so it should not be in the bridge mappings.

When I just remove the datacentre:br-ex mapping:
etc/neutron/plugins/ml2/openvswitch_agent.ini:bridge_mappings =tenant:br-isolated

it proceeds to the next bug: https://bugzilla.redhat.com/show_bug.cgi?id=1459592
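
As an illustration of the edit described in this comment, here is a small Python sketch that removes one physical network from a bridge_mappings value. The function name `drop_physnet` is hypothetical, and the sketch only transforms the string; writing the result back to openvswitch_agent.ini is left out.

```python
def drop_physnet(bridge_mappings, physnet):
    """Return the bridge_mappings string without the entry for physnet."""
    kept = []
    for entry in bridge_mappings.split(","):
        entry = entry.strip()
        # Keep every non-empty entry whose physnet name differs.
        if entry and entry.split(":", 1)[0].strip() != physnet:
            kept.append(entry)
    return ",".join(kept)
```

Applied to the example value "datacentre:br-ex,tenant:br-isolated" with physnet "datacentre", this leaves "tenant:br-isolated", the value shown above.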

Comment 7 Omri Hochman 2017-06-19 15:18:39 UTC
We should re-test with the latest version. Can you check whether it still reproduces?

Comment 8 Alexander Chuzhoy 2017-06-19 20:23:06 UTC
The issue is still there:

openstack-neutron-openvswitch-agent-docker   2017-06-15.2

Comment 10 Brent Eagles 2017-06-20 16:35:40 UTC
I suspect that this is actually caused by br-ex being part of the ovs agent's configuration while the bridge isn't configured on the compute node. I noticed this in my environment a few days ago, but haven't had a chance to get a fix up.

Comment 12 Brent Eagles 2017-06-21 18:50:42 UTC
A quick workaround if the overcloud is already deployed: log in to the compute node(s) and manually create the bridge, e.g.
   ssh heat-admin@<compute-ip>
   sudo ovs-vsctl add-br br-ex

The agent will come up on the next restart of the container.
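
For scripting this workaround across several compute nodes, a hedged Python sketch follows. It relies on the `--may-exist` option of `ovs-vsctl add-br`, which turns the command into a no-op when the bridge already exists. The function names `add_bridge_command` and `ensure_bridge` are hypothetical, and the sketch assumes it runs as root on the compute node itself.

```python
import subprocess


def add_bridge_command(bridge):
    """Build an ovs-vsctl command that creates the bridge only if missing."""
    # --may-exist makes add-br succeed quietly when the bridge already exists.
    return ["ovs-vsctl", "--may-exist", "add-br", bridge]


def ensure_bridge(bridge, run=subprocess.check_call):
    """Run the command; `run` is injectable so the logic can be tested offline."""
    command = add_bridge_command(bridge)
    run(command)
    return command
```

Calling `ensure_bridge("br-ex")` on a node would run the same `add-br` step as above, but can be repeated safely.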

Comment 13 Brent Eagles 2017-06-21 19:28:34 UTC
The instructions in docker/README-containers.md suggest including the "environments/docker-network.yaml" environment file in the deployment command line. This environment file appears to set the compute's network configuration to be the same as the controller's.

Comment 14 Martin André 2017-07-10 14:23:39 UTC
Brent, the content of file docker/README-containers.md is terribly outdated. I wouldn't trust it if I were you.

More seriously, I'll update the file to redirect to https://docs.openstack.org/tripleo-docs/latest/install/containers_deployment/index.html which should provide much more accurate information.

Comment 15 Brent Eagles 2017-07-13 16:55:05 UTC
Note that the core issue is that br-ex wasn't being created by default on compute nodes. If you use a non-default network configuration (network isolation, multiple nics, etc. etc.) the network environment files being used need to take care of creating the br-ex bridge on the compute nodes.

Comment 16 Alexander Chuzhoy 2017-07-27 16:19:26 UTC
We need to understand whether this bug is still relevant, since neutron and ovs moved back to bare metal (BM).

[root@overcloud-compute-0 ~]# docker ps
CONTAINER ID        IMAGE                                                                   COMMAND             CREATED             STATUS              PORTS               NAMES
99e4009a0ed4   "kolla_start"       46 minutes ago      Up 46 minutes                           nova_compute
c4eed184f57a         "kolla_start"       50 minutes ago      Up 50 minutes                           iscsid
e63cadbd5884   "kolla_start"       50 minutes ago      Up 50 minutes                           nova_libvirt
[root@overcloud-compute-0 ~]# systemctl|grep openv
  neutron-openvswitch-agent.service                                             loaded active running   OpenStack Neutron Open vSwitch Agent
  openvswitch.service                                                           loaded active exited    Open vSwitch

Comment 17 Omri Hochman 2017-08-04 14:55:11 UTC
The openvswitch service runs on BM in OSP12, therefore it's not a bug.

Comment 18 Assaf Muller 2017-08-04 16:14:00 UTC
Re-opening. Containerized Neutron will still be available as TP for OSP 12 and is intended for full support in 13, so the bug is still relevant.

Comment 19 Assaf Muller 2017-09-07 22:13:55 UTC
*** Bug 1470682 has been marked as a duplicate of this bug. ***

Comment 31 errata-xmlrpc 2017-12-13 21:29:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

