Bug 1880365 - kube-proxy health check probe failing and pods are seen in CrashLoopBackOff state
Summary: kube-proxy health check probe failing and pods are seen in CrashLoopBackOff state
Keywords:
Status: CLOSED DUPLICATE of bug 1880680
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Andrew Stoycos
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-09-18 11:07 UTC by Rutvik
Modified: 2024-06-13 23:06 UTC
CC List: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-15 13:15:49 UTC
Target Upstream Version:
Embargoed:



Description Rutvik 2020-09-18 11:07:58 UTC
Description of problem:

One of our IBM CloudPack customers seems to be hitting the same issue that was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1820778:

-<>-
Description: kube-proxy was working completely fine, but after the
upgrade from CE 2.6.2 to CE 3.2 it keeps restarting. It sounds like
kube-proxy keeps dying because its liveness/readiness probes are failing.
-<>-
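
To help narrow down whether the endpoint or the probe settings are at fault, here is a minimal stand-alone sketch (assuming the default kube-proxy healthz bind address 0.0.0.0:10256; run it on the affected node) that polls the same endpoint the liveness/readiness probes hit:

~~~~
// Hypothetical triage helper, not part of kube-proxy: repeatedly queries the
// kube-proxy healthz endpoint so we can see whether it responds at all.
// Assumes the default healthz bind address 0.0.0.0:10256.
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	for i := 1; i <= 5; i++ {
		resp, err := client.Get("http://127.0.0.1:10256/healthz")
		if err != nil {
			fmt.Printf("probe %d: request failed: %v\n", i, err)
		} else {
			fmt.Printf("probe %d: HTTP %d\n", i, resp.StatusCode)
			resp.Body.Close()
		}
		time.Sleep(1 * time.Second)
	}
}
~~~~

Consistent timeouts or connection refusals from this would suggest the healthz server itself is wedged, rather than the probe thresholds being too tight.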


Version-Release number of selected component (if applicable):

OCP 4.3.29 (Calico)


Actual results:
openshift-kube-proxy pods enter CrashLoopBackOff.

Expected results:
openshift-kube-proxy pods run normally and do not fail their health checks.

Additional info:

Below are the warnings collected from the pods, which is why I think we might be hitting the same BZ again.

~~~~
2020-09-01T10:31:32.37026376Z W0901 10:31:32.369966       1 proxier.go:584] Failed to read file /lib/modules/4.18.0-147.20.1.el8_1.x86_64/modules.builtin with error open /lib/modules/4.18.0-147.20.1.el8_1.x86_64/modules.builtin: no such file or directory. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules

2020-09-01T10:31:32.584849247Z W0901 10:31:32.584805       1 proxier.go:597] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
2020-09-01T10:31:32.587739958Z W0901 10:31:32.587730       1 proxier.go:597] Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
~~~~
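
These warnings are what kube-proxy prints when it cannot see the running kernel's modules.builtin list from inside the container. A minimal stand-alone sketch of the same check (hypothetical helper, not kube-proxy code) to confirm whether /lib/modules is actually visible in the pod:

~~~~
// Hypothetical check, not kube-proxy code: verifies that the running kernel's
// modules.builtin file is readable, which is what proxier.go warns about when
// /lib/modules is not mounted into the container.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// Kernel release, e.g. "4.18.0-147.20.1.el8_1.x86_64".
	rel, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Println("cannot read kernel release:", err)
		return
	}
	path := filepath.Join("/lib/modules", strings.TrimSpace(string(rel)), "modules.builtin")
	if _, err := os.Stat(path); err != nil {
		fmt.Printf("%s not readable: %v (expected when /lib/modules is not mounted into the pod)\n", path, err)
		return
	}
	fmt.Println("found", path)
}
~~~~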


I've also come across this error in most of the kube-proxy pods. For this, I think we should also check BZ https://bugzilla.redhat.com/show_bug.cgi?id=1843646.
~~~~
2020-09-17T09:09:48.91469971Z E0917 09:09:48.914578       1 proxier.go:1449] Failed to execute iptables-restore: exit status 4 (iptables-restore v1.8.4 (nf_tables):
2020-09-17T09:09:48.91469971Z line 27016: CHAIN_USER_DEL failed (Device or resource busy): chain KUBE-SEP-4VGFJQBLLXV6WZYJ
~~~~
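
Exit status 4 from iptables-restore signals a resource problem (here "Device or resource busy" while deleting the KUBE-SEP chain), which usually means something else was manipulating the same chains concurrently. A rough sketch of how a caller could detect that exit code and retry (hypothetical wrapper, not the actual kube-proxy fix; /tmp/kube-proxy-rules.txt is a made-up dump path):

~~~~
// Hypothetical retry wrapper, not the actual kube-proxy fix: re-runs
// iptables-restore when it exits with status 4, the exit code iptables uses
// for resource problems such as "Device or resource busy".
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"time"
)

func restoreWithRetry(rules []byte, attempts int) error {
	for i := 0; i < attempts; i++ {
		cmd := exec.Command("iptables-restore", "--noflush")
		cmd.Stdin = bytes.NewReader(rules)
		out, err := cmd.CombinedOutput()
		if err == nil {
			return nil
		}
		// Exit status 4 is a resource problem; anything else is not retried.
		if ee, ok := err.(*exec.ExitError); ok && ee.ExitCode() == 4 {
			fmt.Fprintf(os.Stderr, "attempt %d: transient failure: %s\n", i+1, out)
			time.Sleep(time.Second)
			continue
		}
		return fmt.Errorf("iptables-restore failed: %v: %s", err, out)
	}
	return fmt.Errorf("iptables-restore still failing after %d attempts", attempts)
}

func main() {
	rules, err := os.ReadFile("/tmp/kube-proxy-rules.txt") // hypothetical rules dump
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	if err := restoreWithRetry(rules, 3); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
~~~~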

Comment 2 Mark Rooks 2020-09-23 07:01:36 UTC
Duplicate of 1880680?

Comment 3 Pooriya Aghaalitari 2020-09-25 16:17:29 UTC
What is the timeline for a fix for this bug, please? Thank you.

Comment 6 Andrew Stoycos 2020-09-28 21:15:58 UTC
Still investigating; no timeline for a fix at the moment. See https://bugzilla.redhat.com/show_bug.cgi?id=1880680 as well.

Comment 7 Juan Luis de Sousa-Valadas 2020-10-02 10:55:34 UTC
Can you please provide kube-proxy logs at log level 5 or 6?

Comment 8 Andrew Stoycos 2020-10-15 13:15:49 UTC
This was a duplicate of the issue tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1880680; see the iptables fix there.

*** This bug has been marked as a duplicate of bug 1880680 ***

