Bug 2071800 - Multus CNI should exit cleanly on CNI DEL when the API server is unavailable
Summary: Multus CNI should exit cleanly on CNI DEL when the API server is unavailable
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.10.z
Assignee: Tomofumi Hayashi
QA Contact: Weibin Liang
Depends On: 2071799
Blocks: 2101436
Reported: 2022-04-04 20:15 UTC by Douglas Smith
Modified: 2022-12-14 05:59 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2071799
Clones: 2101436
Last Closed: 2022-12-14 05:59:06 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift multus-cni pull 132 | 0 | None | open | Bug 2071800: Remove error handling for getPod to force to proceed cmdDel [backport 4.10] | 2022-06-27 14:30:51 UTC
Red Hat Product Errata | RHBA-2022:8882 | 0 | None | None | None | 2022-12-14 05:59:08 UTC

Description Douglas Smith 2022-04-04 20:15:02 UTC
+++ This bug was initially created as a clone of Bug #2071799 +++

Description of problem: On the CNI DEL path, Multus CNI should exit cleanly even when the API server is unavailable; otherwise, pods can wind up in a crash loop.

How reproducible: Difficult; requires the API server to be unreachable.

Comment 1 Douglas Smith 2022-11-21 20:07:15 UTC
Need a cherry pick approval and QE run on openshift/multus-cni/pull/132, thanks!

Comment 3 zhaozhanqi 2022-11-24 05:37:40 UTC
Moving this to POST status since it has not been merged yet.

Comment 7 Weibin Liang 2022-11-28 16:42:44 UTC
Tested and verified in 4.10.0-0.nightly-2022-11-24-131934

sh-4.4# journalctl -xe -u crio | grep 'but continue to delete'
Nov 28 16:41:05 weliang-1128b-t8ghl-compute-0 crio[1545]: 2022-11-28T16:41:05Z [error] Multus: getPod failed: Multus: [test/hello-pod/ea6c9fd1-a8be-4351-ae39-c77c2e7b7b48]: error getting pod: Get "https://[api-int.weliang-1128b.qe.devcluster.openshift.com]:6443/api/v1/namespaces/test/pods/hello-pod?timeout=1m0s": dial tcp i/o timeout, but continue to delete

Comment 12 errata-xmlrpc 2022-12-14 05:59:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.45 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

