Bug 2026386

Summary: [4.9]Kubelet/crio leaks netns and veth ports in the host
Product: OpenShift Container Platform Reporter: Christoffer Back <cback>
Component: Node Assignee: Peter Hunt <pehunt>
Node sub component: CRI-O QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, atn, cback, cgoncalves, danw, dblack, ealcaniz, eglottma, eminguez, gwest, minmli, peasters, pehunt, schoudha, smalleni, sscheink, trozet, vpickard, wei-david.li, wking
Version: 4.9   
Target Milestone: ---   
Target Release: 4.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: perfscale-ovn
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2003193
: 2026388 2078400 Environment:
Last Closed: 2022-01-04 18:41:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2003193, 2028126, 2028127    
Bug Blocks: 2012836, 2026388, 2078400    

Comment 1 Peter Hunt 2021-11-29 20:47:38 UTC
Fixed in attached version.

Comment 2 Peter Hunt 2021-12-03 14:29:48 UTC
*** Bug 2028126 has been marked as a duplicate of this bug. ***

Comment 3 Li Wei 2021-12-16 01:14:21 UTC
Hi, is the target release version fixed?

Comment 6 Peter Hunt 2021-12-16 14:37:36 UTC
PR merged 3 days ago, so I expect it'll make 4.9.12, though that would need some verification when it's actually released

Comment 12 MinLi 2021-12-28 10:20:29 UTC
verified!
Created pods and checked netns on a node while pods were running and after pods were deleted.

$ oc get clusterversion 
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.12    True        False         41m     Cluster version is 4.9.12
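
The before/after check described in this comment could be sketched roughly as follows. This is only an illustration, not the commands actually run during verification: the helper names `netns_snapshot` and `netns_new_entries` are hypothetical, and the default directory `/var/run/netns` is taken from the NetNS paths shown in the crio log later in this bug.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print a sorted list of the entries in a netns
# directory (assumed default: /var/run/netns).
netns_snapshot() {
  local dir="${1:-/var/run/netns}"
  ls -1 "$dir" 2>/dev/null | sort
}

# Hypothetical helper: given two snapshot files (before, after), print the
# entries that appear only in the "after" snapshot. Both files must be
# sorted, which netns_snapshot guarantees.
netns_new_entries() {
  comm -13 "$1" "$2"
}
```

On a node, one might take a snapshot before creating the test pods and another after deleting them; any entry printed by `netns_new_entries` would then be a candidate leaked namespace, though namespaces of unrelated pods started in the meantime would show up as well.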

Comment 14 errata-xmlrpc 2022-01-04 18:41:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.12 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:5214

Comment 15 MinLi 2022-01-07 08:02:19 UTC
Verification steps:

1) create a pod
$ oc get pod -o wide 
NAME                                                      READY   STATUS    RESTARTS         AGE     IP             NODE                                                NOMINATED NODE   READINESS GATES
hello-pod-1                                               1/1     Running   0                4m34s   10.129.2.184   minmli410010601-22jvg-worker-northcentralus-nmzxz   <none>           <none>

2) log in to the node and get the pod's NetNS info from the crio log
$ oc debug node/minmli410010601-22jvg-worker-northcentralus-nmzxz
Starting pod/minmli410010601-22jvg-worker-northcentralus-nmzxz-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.128.4
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host 
sh-4.4# journalctl -u crio --since="6 minutes ago" | grep hello-pod-1 | grep NetNS
Jan 07 07:13:45 minmli410010601-22jvg-worker-northcentralus-nmzxz crio[1459]: time="2022-01-07 07:13:45.990971476Z" level=info msg="Got pod network &{Name:hello-pod-1 Namespace:default ID:67fd70c7b1f167a9c729dae525b119f8cc4d0c92c58d24afcbf4ec9bc73b5b99 UID:52119968-1afe-4f1e-a8a6-b4e48cf80d99 NetNS:/var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
Jan 07 07:13:48 minmli410010601-22jvg-worker-northcentralus-nmzxz crio[1459]: time="2022-01-07 07:13:48.068682131Z" level=info msg="Got pod network &{Name:hello-pod-1 Namespace:default ID:67fd70c7b1f167a9c729dae525b119f8cc4d0c92c58d24afcbf4ec9bc73b5b99 UID:52119968-1afe-4f1e-a8a6-b4e48cf80d99 NetNS:/var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df Networks:[] RuntimeConfig:map[multus-cni-network:{IP: MAC: PortMappings:[] Bandwidth:<nil> IpRanges:[]}] Aliases:map[]}"
sh-4.4# ls -l /var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df
-r--r--r--. 1 root root 0 Jan  7 07:13 /var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df

3) delete the pod and check that the NetNS file was cleaned up.
$ oc delete pod hello-pod-1 
pod "hello-pod-1" deleted

sh-4.4# ls -l /var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df
ls: cannot access '/var/run/netns/d547e411-7395-4d63-b582-b9b20b0cb3df': No such file or directory
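
Steps 2 and 3 above could be scripted along these lines. This is a minimal sketch under stated assumptions: the helper names `netns_paths_for_pod` and `netns_cleaned` are hypothetical, and the parsing assumes the crio log keeps the `Name:<pod> ... NetNS:<path>` layout seen in the "Got pod network" lines above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read crio log lines on stdin and print the unique
# NetNS paths recorded for the given pod name. The trailing space in the
# pattern keeps "hello-pod-1" from also matching "hello-pod-10".
netns_paths_for_pod() {
  local pod="$1"
  grep -F "Name:${pod} " | grep -o 'NetNS:[^ ]*' | cut -d: -f2- | sort -u
}

# Hypothetical helper: return 0 only if none of the given paths still
# exist; report each surviving path on stderr.
netns_cleaned() {
  local p rc=0
  for p in "$@"; do
    if [ -e "$p" ]; then
      echo "leaked: $p" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

On the node (after `chroot /host`), the paths could be captured while the pod is running with `journalctl -u crio --since="6 minutes ago" | netns_paths_for_pod hello-pod-1`, and then passed to `netns_cleaned` once the pod has been deleted.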