While debugging some OVN issues, it's sometimes handy to create a namespace and a port inside it (this is quicker than creating VMs and we don't need OpenStack). While doing this on one of the compute nodes, we saw an error with the OVN metadata agent where it was unable to set the ovnmeta namespace. We have been digging this morning and this is apparently not an OVN-related issue. It's easily reproducible with the following steps:

1. Create a namespace within a container:

    # docker run --net=host --user=root --privileged=True -v /run/netns:/run/netns:shared -it f45d92a693e8 /bin/bash
    ()[root@overcloud-controller-0 /]# ip netns add netns1

2. Create a namespace in the host:

    [root@overcloud-controller-0 heat-admin]# ip netns add netns2
    [root@overcloud-controller-0 heat-admin]# ip netns
    RTNETLINK answers: Invalid argument
    netns2
    RTNETLINK answers: Invalid argument
    netns1

3. The namespaces become unusable. From the container:

    ()[root@overcloud-controller-0 /]# ip netns
    RTNETLINK answers: Invalid argument
    netns2
    RTNETLINK answers: Invalid argument
    netns1
    ()[root@overcloud-controller-0 /]# ip netns e netns1 ip a
    RTNETLINK answers: Invalid argument
    setting the network namespace "netns1" failed: Invalid argument

The only way to recover from this is to reboot the node. I don't know how unlikely it is that someone creates a network namespace on a controller, but if that happens, basically all the existing namespaces (qrouter, qdhcp) become unusable and the dataplane would be disrupted.

We have tried this on both RHEL 7.5 and CentOS 7.4. The CentOS kernel is:

    Linux overcloud-controller-0 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

As this was a TripleO virt setup, we wanted to confirm and try the same on the baremetal node.
Spawning a container there with the /run dir mounted as shared didn't work, so we changed the following file:

    /etc/systemd/system/docker.service
    -MountFlags=slave
    +MountFlags=shared

After restarting docker, we could create namespaces both inside and outside the containers.
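
The MountFlags change above suggests that mount propagation of /run/netns plays a role, although the root cause is not confirmed in this report. As a rough sketch for comparing hosts, the propagation state of a mount point can be read from the optional fields of /proc/self/mountinfo. The helper name mount_propagation is hypothetical (it is not part of docker, iproute2, or OVN), and it simplifies the result: a mount that is both shared and a slave is reported as "shared".

```shell
# Hypothetical helper, for illustration only.
# Usage: mount_propagation <mount-point> [mountinfo-file]
# Prints shared, slave, unbindable, or private for the given mount point.
# Returns non-zero if the mount point is not listed.
mount_propagation() {
    awk -v target="$1" '
        # Field 5 of a mountinfo line is the mount point. Optional fields
        # start at field 7 and run until the "-" separator.
        $5 == target {
            state = "private"
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/) state = "shared"
                else if ($i ~ /^master:/ && state != "shared") state = "slave"
                else if ($i == "unbindable") state = "unbindable"
            }
            # Later entries overlay earlier ones, so the last match wins.
            result = state
        }
        END { if (result == "") exit 1; print result }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running `mount_propagation /run/netns` on the host and again inside the container would show whether the two views of the directory propagate mounts to each other. Note that /run/netns is only listed as a mount point after the first `ip netns add` on the host.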
Terry to investigate further to assess both impact and RCA / fix options. It's not clear to me if this should be considered a blocker at this point.
*** This bug has been marked as a duplicate of bug 1771556 ***
*** This bug has been marked as a duplicate of bug 1758547 ***