Bug 1507257

Summary: Messages flooded with messages like StopPodSandbox $SHA from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod <pod-name> network: CNI failed to retrieve network namespace path: Error: No such container: $SHA
Product: OpenShift Container Platform
Reporter: Alexis Solanas <asolanas>
Component: Networking
Assignee: Rajat Chopra <rchopra>
Status: CLOSED ERRATA
QA Contact: Meng Bo <bmeng>
Severity: high
Docs Contact:
Priority: high
Version: 3.6.1
CC: ansverma, aos-bugs, bbennett, eparis, javier.ramirez, jokerman, mmccomas, rchopra, stwalter
Target Milestone: ---
Target Release: 3.7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The network namespace path is nil when a pod uses hostnetwork.
Consequence: Tearing down such a pod produces error messages.
Fix: Upstream fix: do not let CNI manage containers with hostnetwork=true.
Result: No error messages are logged.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-28 22:20:01 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Alexis Solanas 2017-10-28 23:56:29 UTC
Description of problem:

 The messages files are filling up with entries like:

StopPodSandbox $SHA from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod <pod-name> network: CNI failed to retrieve network namespace path: Error: No such container: $SHA

 The pods no longer exist on the nodes, and there is no reference to them in etcd.

 Running rm -rf /var/lib/docker does not solve the problem.


Version-Release number of selected component (if applicable):

atomic-openshift-3.6.173.0.21-1.git.0.f95b0e7

How reproducible:

 Unknown

Steps to Reproduce:
1.
2.
3.

Actual results:

 Too many error messages are logged for pods that no longer exist

Expected results:

 "NetworkPlugin cni failed to teardown pod ... " messages are not logged 


Additional info:


 Upstream issue: https://github.com/kubernetes/kubernetes/issues/44307

Comment 1 Rajat Chopra 2017-10-30 23:44:05 UTC
https://github.com/openshift/origin/pull/17097
This should fix the issue, or we need a chain of backports from upstream.

Comment 3 Javier Ramirez 2017-10-31 14:14:30 UTC
(In reply to Rajat Chopra from comment #1)
> https://github.com/openshift/origin/pull/17097
> This should fix the issue, or we need a chain of backports from upstream.

Thanks Rajat, any idea if there is a workaround we could apply now?

Comment 4 Rajat Chopra 2017-11-02 18:12:50 UTC
Javier, unfortunately there is no workaround.

Comment 9 Meng Bo 2017-11-03 10:16:43 UTC
Tested on OCP build v3.7.0-0.190.0
Issue has been fixed.

No error messages appear in the atomic-openshift-node log when deleting a docker container that was created for a hostnetwork-enabled pod.
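
For readers who want the idea behind the fix in concrete form, here is a minimal Python sketch of the guard described in the Doc Text (do not let CNI manage containers with hostnetwork=true). It is an illustration only, not the actual change, which lives in the Go code referenced by the upstream issue. All names in it (PodSandbox, FakeCNI, stop_pod_sandbox) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodSandbox:
    # Hypothetical stand-in for a pod sandbox record; not a real kubelet type.
    sandbox_id: str
    host_network: bool


@dataclass
class FakeCNI:
    # Hypothetical plugin stub that records which sandboxes it tore down.
    torn_down: List[str] = field(default_factory=list)

    def teardown(self, sandbox_id: str) -> None:
        self.torn_down.append(sandbox_id)


def stop_pod_sandbox(sandbox: PodSandbox, cni: FakeCNI) -> bool:
    """Return True if CNI teardown was invoked for this sandbox."""
    if sandbox.host_network:
        # A host-network pod has no network namespace of its own,
        # so CNI is not asked to tear anything down.
        return False
    cni.teardown(sandbox.sandbox_id)
    return True
```

With this guard, stopping a sandbox for a hostnetwork pod never reaches the plugin, so the "failed to retrieve network namespace path" lookup is not attempted for it.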


With the same steps, the error messages below appear in the node log on builds prior to 3.7.0-0.190.0.

Nov 03 03:51:14 qe-bmeng.interna atomic-openshift-node[23194]: W1103 03:51:14.541341   23194 cni.go:258] CNI failed to retrieve network namespace path: Error: No such container: c5e769d5f72ef91c44e50f9eb6a73a74ce92372e5a705a3edf941a38bec60454
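
To gauge how many distinct containers are behind the flood, the log lines can be tallied by container ID. The following is a rough Python sketch, assuming every affected line ends with "No such container: " followed by a 64-character lowercase hex ID, as in the sample line above. The function name count_missing_containers is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: container IDs are 64 lowercase hex characters,
# matching the sample log line in this report.
NO_SUCH_CONTAINER = re.compile(r"No such container: ([0-9a-f]{64})")


def count_missing_containers(lines: Iterable[str]) -> Counter:
    """Tally 'No such container' messages per container ID."""
    counts: Counter = Counter()
    for line in lines:
        match = NO_SUCH_CONTAINER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The function accepts any iterable of strings, so it can be fed an open messages file or lines captured from the node log. A small number of IDs repeated many times would suggest that the same few sandboxes are being retried over and over.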

Comment 12 errata-xmlrpc 2017-11-28 22:20:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188