Bug 1880365 - kube-proxy health check probe failing and pods are seen in CrashLoopBackOff state
Summary: kube-proxy health check probe failing and pods are seen in CrashLoopBackOff state
Keywords:
Status: CLOSED DUPLICATE of bug 1880680
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Andrew Stoycos
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-18 11:07 UTC by Rutvik
Modified: 2024-06-13 23:06 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-15 13:15:49 UTC
Target Upstream Version:
Embargoed:



Description Rutvik 2020-09-18 11:07:58 UTC
Description of problem:

One of our IBM CloudPack customers seems to be hitting the same issue that was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1820778:

-<>-
Description: kube-proxy was working completely fine, but after the
upgrade from CE 2.6.2 to CE 3.2 it keeps restarting. It sounds like
kube-proxy keeps dying because its liveness/readiness probes are failing.
-<>-
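
To help narrow down whether the endpoint or the probe settings are at fault, here is a minimal stand-alone sketch (assuming the default kube-proxy healthz bind address 0.0.0.0:10256; run it on the affected node) that polls the same endpoint the liveness/readiness probes hit:

~~~~
// Hypothetical triage helper, not part of kube-proxy: repeatedly queries the
// kube-proxy healthz endpoint so we can see whether it responds at all.
// Assumes the default healthz bind address 0.0.0.0:10256.
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	for i := 1; i <= 5; i++ {
		resp, err := client.Get("http://127.0.0.1:10256/healthz")
		if err != nil {
			fmt.Printf("probe %d: request failed: %v\n", i, err)
		} else {
			fmt.Printf("probe %d: HTTP %d\n", i, resp.StatusCode)
			resp.Body.Close()
		}
		time.Sleep(1 * time.Second)
	}
}
~~~~

Consistent timeouts or connection refusals from this would suggest the healthz server itself is wedged, rather than the probe thresholds being too tight.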


Version-Release number of selected component (if applicable):

OCP 4.3.29 (Calico)


Actual results:
openshift-kube-proxy pods enter CrashLoopBackOff.

Expected results:
openshift-kube-proxy pods run normally and do not fail their health checks.

Additional info:

Below are the warnings collected from the pods, which is why I think we might be hitting the same BZ again.

~~~~
2020-09-01T10:31:32.37026376Z W0901 10:31:32.369966       1 proxier.go:584] Failed to read file /lib/modules/4.18.0-147.20.1.el8_1.x86_64/modules.builtin with error open /lib/modules/4.18.0-147.20.1.el8_1.x86_64/modules.builtin: no such file or directory. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules

2020-09-01T10:31:32.584849247Z W0901 10:31:32.584805       1 proxier.go:597] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
2020-09-01T10:31:32.587739958Z W0901 10:31:32.587730       1 proxier.go:597] Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
~~~~
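
These warnings are what kube-proxy prints when it cannot see the running kernel's modules.builtin list from inside the container. A minimal stand-alone sketch of the same check (hypothetical helper, not kube-proxy code) to confirm whether /lib/modules is actually visible in the pod:

~~~~
// Hypothetical check, not kube-proxy code: verifies that the running kernel's
// modules.builtin file is readable, which is what proxier.go warns about when
// /lib/modules is not mounted into the container.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// Kernel release, e.g. "4.18.0-147.20.1.el8_1.x86_64".
	rel, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Println("cannot read kernel release:", err)
		return
	}
	path := filepath.Join("/lib/modules", strings.TrimSpace(string(rel)), "modules.builtin")
	if _, err := os.Stat(path); err != nil {
		fmt.Printf("%s not readable: %v (expected when /lib/modules is not mounted into the pod)\n", path, err)
		return
	}
	fmt.Println("found", path)
}
~~~~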


I've also come across this error in most of the kube-proxy pods. For this, I think we should also check BZ https://bugzilla.redhat.com/show_bug.cgi?id=1843646.
~~~~
2020-09-17T09:09:48.91469971Z E0917 09:09:48.914578       1 proxier.go:1449] Failed to execute iptables-restore: exit status 4 (iptables-restore v1.8.4 (nf_tables):
2020-09-17T09:09:48.91469971Z line 27016: CHAIN_USER_DEL failed (Device or resource busy): chain KUBE-SEP-4VGFJQBLLXV6WZYJ
~~~~
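
Exit status 4 from iptables-restore signals a resource problem (here "Device or resource busy" while deleting the KUBE-SEP chain), which usually means something else was manipulating the same chains concurrently. A rough sketch of how a caller could detect that exit code and retry (hypothetical wrapper, not the actual kube-proxy fix; /tmp/kube-proxy-rules.txt is a made-up dump path):

~~~~
// Hypothetical retry wrapper, not the actual kube-proxy fix: re-runs
// iptables-restore when it exits with status 4, the exit code iptables uses
// for resource problems such as "Device or resource busy".
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"time"
)

func restoreWithRetry(rules []byte, attempts int) error {
	for i := 0; i < attempts; i++ {
		cmd := exec.Command("iptables-restore", "--noflush")
		cmd.Stdin = bytes.NewReader(rules)
		out, err := cmd.CombinedOutput()
		if err == nil {
			return nil
		}
		// Exit status 4 is a resource problem; anything else is not retried.
		if ee, ok := err.(*exec.ExitError); ok && ee.ExitCode() == 4 {
			fmt.Fprintf(os.Stderr, "attempt %d: transient failure: %s\n", i+1, out)
			time.Sleep(time.Second)
			continue
		}
		return fmt.Errorf("iptables-restore failed: %v: %s", err, out)
	}
	return fmt.Errorf("iptables-restore still failing after %d attempts", attempts)
}

func main() {
	rules, err := os.ReadFile("/tmp/kube-proxy-rules.txt") // hypothetical rules dump
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	if err := restoreWithRetry(rules, 3); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
~~~~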

Comment 2 Mark Rooks 2020-09-23 07:01:36 UTC
Duplicate of 1880680?

Comment 3 Pooriya Aghaalitari 2020-09-25 16:17:29 UTC
What is the timeline for a fix for this bug, please? Thank you.

Comment 6 Andrew Stoycos 2020-09-28 21:15:58 UTC
Still investigating; no timeline for a fix at the moment. See https://bugzilla.redhat.com/show_bug.cgi?id=1880680 as well.

Comment 7 Juan Luis de Sousa-Valadas 2020-10-02 10:55:34 UTC
Can you please provide kube-proxy logs at log level 5 or 6?

Comment 8 Andrew Stoycos 2020-10-15 13:15:49 UTC
This was a duplicate of the issue tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1880680; see the iptables fix there.

*** This bug has been marked as a duplicate of bug 1880680 ***

