Bug 1544150 - Restarting Neutron containers which make use of network namespaces doesn't work
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Brent Eagles
QA Contact: Roee Agiman
Blocks: 1527130
Reported: 2018-02-10 18:50 UTC by Daniel Alvarez Sanchez
Modified: 2020-05-21 21:34 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.0.2-0.20180327213843.f25e2d8.el7ost
Last Closed: 2018-06-27 13:44:34 UTC

Links
System | ID | Priority | Status | Summary | Last Updated
Launchpad | 1748658 | None | None | None | 2018-02-10 18:50:01 UTC
OpenStack gerrit | 542858 | None | MERGED | Mount netns as shared to persist namespaces | 2020-09-17 01:22:00 UTC
Red Hat Product Errata | RHEA-2018:2086 | None | None | None | 2018-06-27 13:45:18 UTC

Description Daniel Alvarez Sanchez 2018-02-10 18:50:01 UTC
When the DHCP, L3, Metadata, or OVN-Metadata containers are restarted, they
can no longer enter the network namespaces that existed before the restart:

[heat-admin@overcloud-novacompute-0 neutron]$ sudo docker restart 8559f5a7fa45
8559f5a7fa45

[heat-admin@overcloud-novacompute-0 neutron]$ tail -f /var/log/containers/neutron/networking-ovn-metadata-agent.log
2018-02-09 08:34:41.059 5 CRITICAL neutron [-] Unhandled error: ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Invalid argument
2018-02-09 08:34:41.059 5 ERROR neutron Traceback (most recent call last):
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/bin/networking-ovn-metadata-agent", line 10, in <module>
2018-02-09 08:34:41.059 5 ERROR neutron sys.exit(main())
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/cmd/eventlet/agents/metadata.py", line 17, in main
2018-02-09 08:34:41.059 5 ERROR neutron metadata_agent.main()
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata_agent.py", line 38, in main
2018-02-09 08:34:41.059 5 ERROR neutron agt.start()
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 147, in start
2018-02-09 08:34:41.059 5 ERROR neutron self.sync()
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 56, in wrapped
2018-02-09 08:34:41.059 5 ERROR neutron return f(*args, **kwargs)
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 169, in sync
2018-02-09 08:34:41.059 5 ERROR neutron metadata_namespaces = self.ensure_all_networks_provisioned()
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 350, in ensure_all_networks_provisioned
2018-02-09 08:34:41.059 5 ERROR neutron netns = self.provision_datapath(datapath)
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 294, in provision_datapath
2018-02-09 08:34:41.059 5 ERROR neutron veth_name[0], veth_name[1], namespace)
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 182, in add_veth
2018-02-09 08:34:41.059 5 ERROR neutron self._as_root([], 'link', tuple(args))
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 94, in _as_root
2018-02-09 08:34:41.059 5 ERROR neutron namespace=namespace)
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 102, in _execute
2018-02-09 08:34:41.059 5 ERROR neutron log_fail_as_error=self.log_fail_as_error)
2018-02-09 08:34:41.059 5 ERROR neutron File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 151, in execute
2018-02-09 08:34:41.059 5 ERROR neutron raise ProcessExecutionError(msg, returncode=returncode)
2018-02-09 08:34:41.059 5 ERROR neutron ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Invalid argument
2018-02-09 08:34:41.059 5 ERROR neutron
2018-02-09 08:34:41.059 5 ERROR neutron
2018-02-09 08:34:41.177 21 INFO oslo_service.service [-] Parent process has died unexpectedly, exiting
2018-02-09 08:34:41.178 21 INFO eventlet.wsgi.server [-] (21) wsgi exited, is_accepting=True

An easy way to reproduce the bug:

[heat-admin@overcloud-novacompute-0 ~]$ sudo docker exec -u root -it 5c5f254a9321bd74b5911f46acb9513574c2cd9a3c59805a85cffd960bcc864d /bin/bash

[root@overcloud-novacompute-0 /]# ip netns a my_netns
[root@overcloud-novacompute-0 /]# exit

[heat-admin@overcloud-novacompute-0 ~]$ sudo ip netns
[heat-admin@overcloud-novacompute-0 ~]$ sudo docker restart 5c5f254a9321bd74b5911f46acb9513574c2cd9a3c59805a85cffd960bcc864d
5c5f254a9321bd74b5911f46acb9513574c2cd9a3c59805a85cffd960bcc864d

[heat-admin@overcloud-novacompute-0 ~]$ sudo docker exec -u root -it 5c5f254a9321bd74b5911f46acb9513574c2cd9a3c59805a85cffd960bcc864d /bin/bash
[root@overcloud-novacompute-0 /]# ip netns
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
my_netns

[root@overcloud-novacompute-0 /]# ip netns e my_netns ip a
RTNETLINK answers: Invalid argument
setting the network namespace "my_netns" failed: Invalid argument

One workaround is to delete everything under /run/netns/ from kolla_start,
but this would force a full sync of the agents, which is not desirable:

[root@overcloud-novacompute-0 /]# rm /run/netns/my_netns
rm: remove regular empty file '/run/netns/my_netns'? y
[root@overcloud-novacompute-0 /]# ip netns
[root@overcloud-novacompute-0 /]# ip netns a my_netns
[root@overcloud-novacompute-0 /]#
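
The rm prompt above shows that the leftover entry is a regular empty file, not a live namespace: ip netns add bind-mounts the namespace onto /run/netns/NAME, and after the restart that mount is gone while the file remains. A minimal sketch for listing such stale entries without deleting anything is shown here. It assumes a live entry appears as a mount point in /proc/self/mountinfo, and list_stale_netns is a hypothetical helper name, not part of kolla or neutron.

```shell
#!/bin/sh
# list_stale_netns [DIR] [MOUNTINFO]
# Hypothetical helper: prints the names of entries under DIR (default
# /run/netns, given without a trailing slash) that are NOT listed as a
# mount point in MOUNTINFO (default /proc/self/mountinfo).
list_stale_netns() {
    netns_dir=${1:-/run/netns}
    mountinfo=${2:-/proc/self/mountinfo}
    for entry in "$netns_dir"/*; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$entry" ] || continue
        # Field 5 of each mountinfo line is the mount point.
        if ! awk -v p="$entry" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"; then
            basename "$entry"
        fi
    done
}
```

Run inside the restarted container, list_stale_netns with no arguments would be expected to print my_netns in the situation reproduced above. It only reports entries, so it does not trigger the agent resync that deleting them would.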

Comment 4 Daniel Alvarez Sanchez 2018-02-23 12:41:09 UTC
This BZ is addressed by https://review.openstack.org/#/c/542858/
We still have another problem to look into: https://bugzilla.redhat.com/show_bug.cgi?id=1527130 (for example, could processes started in the container be moved under PID 1 so that they persist after the container is stopped?).
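
Going by its title, "Mount netns as shared to persist namespaces", the review linked in this comment changes how /run/netns is mounted into the agent containers. One way to check the result from inside a container is to look at the propagation flag recorded in /proc/self/mountinfo. The sketch below assumes /run/netns is the mount point in question. is_shared_mount is a hypothetical helper, not something shipped by the fix.

```shell
#!/bin/sh
# is_shared_mount PATH [MOUNTINFO]
# Hypothetical helper: exits 0 if the last mountinfo entry whose mount
# point is PATH carries a "shared:N" optional field, and 1 otherwise
# (including when PATH is not a mount point at all).
is_shared_mount() {
    target=$1
    mountinfo=${2:-/proc/self/mountinfo}
    awk -v p="$target" '
        $5 == p {
            shared = 0
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:[0-9]+$/) shared = 1
        }
        END { exit !shared }
    ' "$mountinfo"
}
```

For example, is_shared_mount /run/netns && echo shared prints "shared" only when the mount is marked shared in the current mount namespace.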

Comment 9 errata-xmlrpc 2018-06-27 13:44:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

