Bug 1701326 - Unexpected command output nsenter: cannot open /proc/34316/ns/net
Summary: Unexpected command output nsenter: cannot open /proc/34316/ns/net
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.2.0
Assignee: Ryan Phillips
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-18 16:01 UTC by Steve Milner
Modified: 2019-10-29 16:41 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-29 22:15:19 UTC
Target Upstream Version:
Embargoed:


Description Steve Milner 2019-04-18 16:01:38 UTC
From openshift_cluster-openshift-controller-manager-operator.

Seen 16 occurrences over the last 24 hours. The failing pods vary from one occurrence to the next.

Example:

ns/openshift-monitoring pod/prometheus-adapter-5f7c5567b-7nhx4 Failed create pod sandbox: rpc error: code = Unknown desc = failed to get network status for pod sandbox k8s_prometheus-adapter-5f7c5567b-7nhx4_openshift-monitoring_2eef8612-614f-11e9-a571-12652b58265c_1(f82a62b13bce9a2fa46fc91893f595a4553853fb71d29f8af62f923b72dcb8fc): Unexpected command output nsenter: cannot open /proc/18561/ns/net: No such file or directory\n with error: exit status 1

Related:
- https://bugzilla.redhat.com/show_bug.cgi?id=1434950#c15
- https://github.com/kubernetes/kubernetes/pull/72105
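
For triage, it can help to pull the PID and namespace type out of a message like the example above and check whether that /proc entry still exists on the node. The following is a minimal sketch, not part of CRI-O or the kubelet: the helper names `find_ns_paths` and `ns_path_exists` and the `proc_root` parameter are hypothetical, and the check assumes it runs on the affected node with enough privileges to inspect other processes (a permission error is reported the same way as a missing entry).

```python
import os
import re

# Matches namespace paths such as /proc/18561/ns/net inside an error message.
NS_PATH_RE = re.compile(r"/proc/(?P<pid>\d+)/ns/(?P<ns>[a-z_]+)")


def find_ns_paths(message):
    """Return (pid, namespace) pairs for every /proc/<pid>/ns/<type> path found."""
    return [(int(m.group("pid")), m.group("ns")) for m in NS_PATH_RE.finditer(message)]


def ns_path_exists(pid, ns, proc_root="/proc"):
    """Report whether the namespace entry for pid is present under proc_root.

    Uses lexists (lstat) so the entry itself is checked, not its link target.
    """
    return os.path.lexists(os.path.join(proc_root, str(pid), "ns", ns))


# Message fragment copied from the example in the description.
message = (
    "Unexpected command output nsenter: cannot open /proc/18561/ns/net: "
    "No such file or directory"
)

for pid, ns in find_ns_paths(message):
    state = "present" if ns_path_exists(pid, ns) else "gone"
    print(f"pid {pid} ns {ns}: {state}")
```

If the entry is reported as gone, the process that owned the namespace has already exited, which is consistent with the error text but does not by itself show what removed it.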

Comment 2 W. Trevor King 2019-04-18 16:06:54 UTC
Giuseppe points out that when this happens we leak the network namespace that CRI-O failed to clean up because of kubelet's early container-process reaping.

Comment 4 W. Trevor King 2019-04-23 09:03:44 UTC
Why the Containers assignment?  Isn't the issue the early kubelet process reaping without giving CRI-O time to tear down?  See kubernetes#72105, linked from the description.

Comment 6 W. Trevor King 2019-05-10 03:09:36 UTC
This can also present as [1] (so I can find this issue from that direction too ;):

  Warning  Failed            11m (x447 over 128m)   kubelet, ip-10-0-139-192.ec2.internal  Error: container create failed: container_linux.go:329: creating new parent process caused "container_linux.go:1762: running lstat on namespace path \"/proc/3905/ns/ipc\" caused \"lstat /proc/3905/ns/ipc: no such file or directory\""

[1]: https://github.com/cri-o/cri-o/issues/1927#issuecomment-474678516

Comment 7 Seth Jennings 2019-06-18 14:06:17 UTC
I can't find "running lstat on namespace path" or "Unexpected command output nsenter" in search.svc.ci.openshift.org for the past 14d.  Maybe this fixed itself?

Comment 8 Ryan Phillips 2019-07-23 21:07:52 UTC
Is this still happening on a 4.2 cluster? CRI-O was updated to 1.14 about two weeks ago (beginning of July).

Comment 9 Seth Jennings 2019-07-29 22:15:19 UTC
This should be fixed in current versions of cri-o, 1.13 (OCP 4.1) and 1.14 (OCP 4.2):
https://github.com/cri-o/cri-o/pull/2143

