Bug 1903091

Summary: [OSP16.1] Duplicate mount entries related to openvswitch "tmpfs /run/openvswitch tmpfs rw,seclabel,nosuid,nodev,mode=755 0 0"
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Assignee: Brent Eagles <beagles>
QA Contact: Candido Campos <ccamposr>
Reporter: ggrimaux
CC: apevec, atragler, bcafarel, bdobreli, beagles, broose, cfields, chrisw, coldford, dbecker, fhallal, fiezzi, hakhande, igallagh, jfargen, jlibosva, mburns, morazi, pmorey, rhos-maint, riramos, rkhan, slinaber, spower, supadhya
Target Milestone: z4
Target Release: 16.1 (Train on RHEL 8.2)
Keywords: Triaged
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20210104205661.el8ost
Last Closed: 2021-03-17 15:36:09 UTC
Type: Bug

Description ggrimaux 2020-12-01 10:15:23 UTC
Description of problem:
A client using Open vSwitch (and NOT OVN) sees a very high number of duplicate openvswitch mounts, which makes it slow to SSH to the compute nodes and drives 'runc' processes to 100% CPU.

After stracing one of those processes we noticed it kept timing out while reading /proc/$pid/mountinfo.

We catted the file and found 32767 tmpfs mount entries like the following:
117756 68588 0:24 /openvswitch /run/openvswitch rw,nosuid,nodev shared:26 - tmpfs tmpfs rw,seclabel,mode=755
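
As a rough way to quantify this (a sketch only: the grep pattern matches the mount line format above, and the loop over runc PIDs is purely illustrative):

grep -c 'tmpfs /run/openvswitch' /proc/mounts
for pid in $(pgrep -x runc); do wc -l /proc/$pid/mountinfo; done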

We stopped the ovsdb-server systemd service on the node and unmounted /run/openvswitch (the umount might not have been necessary); after that all the duplicate mounts were gone and the processes were all back to normal.
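
In other words, the manual cleanup was roughly the following (restarting the service afterwards is assumed here; the customer-confirmed sequence, including a restart of the neutron_ovs_agent container, is in comment 42 below):

systemctl stop ovsdb-server
umount /run/openvswitch
systemctl start ovsdb-server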

This was first experienced on OSP 16.1.1.
The client updated their environment to OSP 16.1.2 and the same behavior can still be seen.

We need your help to find the source of those entries.

Version-Release number of selected component (if applicable):
OSP16.1.2
openvswitch2.13-2.13.0-60.el8fdp.x86_64

How reproducible:
100% on this environment (compute nodes)

Steps to Reproduce:
1. Unknown
2.
3.

Actual results:
Duplicate openvswitch mount entries in /proc/mounts

Expected results:
Only one entry.

Additional info:
We have sosreport with the issue.

Comment 11 Chris Fields 2021-01-18 14:54:40 UTC
The tested workaround is a reboot of the overcloud nodes that have excessive mounts, as detected like this:

cat /proc/mounts | grep 'tmpfs /run/openvswitch' | wc -l

Comment 13 Chris Fields 2021-01-19 19:02:14 UTC
A KCS solution has been published and is linked here. We would like to find a resolution that is less invasive than overcloud node reboots.

Comment 42 coldford@redhat.com 2021-02-11 16:41:34 UTC
Hello,

The customer reports that the following sequence seems to clean up the mounts:

# systemctl stop ovsdb-server
# umount /run/openvswitch
# systemctl start ovsdb-server
# podman stop neutron_ovs_agent
# sleep 5
# podman start neutron_ovs_agent

- Cory

Comment 48 errata-xmlrpc 2021-03-17 15:36:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.4 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0817

Comment 49 Robin Jarry 2023-07-19 12:20:09 UTC
*** Bug 2218577 has been marked as a duplicate of this bug. ***