1877367 – KubeAPIErrorsHigh firing due to "too large resource version"

Summary: KubeAPIErrorsHigh firing due to "too large resource version"

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-apiserver
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	4.6.0
Assignee:	Lukasz Szaszkiewicz
QA Contact:	Ke Wang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1877346

Reported:	2020-09-09 13:15 UTC by Lukasz Szaszkiewicz
Modified:	2024-06-13 23:02 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1877346
Environment:
Last Closed:	2020-10-27 16:38:53 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:4196	0	None	None	None	2020-10-27 16:39:12 UTC

Comment 1 Lukasz Szaszkiewicz 2020-09-09 13:20:23 UTC

Once the rebase (1.19) PR lands https://github.com/openshift/kubernetes/pull/325 the fixes (https://github.com/openshift/origin/pull/25489 and https://github.com/openshift/origin/pull/25490) will be present in 4.6.

Comment 4 Ke Wang 2020-09-22 11:10:02 UTC

(In reply to Lukasz Szaszkiewicz from comment #1)
> Once the rebase (1.19) PR lands
> https://github.com/openshift/kubernetes/pull/325 the fixes
> (https://github.com/openshift/origin/pull/25489 and
> https://github.com/openshift/origin/pull/25490) will be present in 4.6.

Hi Lukasz, PRs 25489 and 25490 have not been merged; could you take a look? Until they merge, verification cannot proceed.

Comment 5 Ke Wang 2020-09-23 03:20:44 UTC

CC: lszaszki. PRs 25489 and 25490 are 4.5 fixes; we need the corresponding fixes for 4.6 here.

Comment 6 Ke Wang 2020-09-23 15:07:06 UTC

OCP 4.6 has already been rebased to kube 1.19. To verify, I took a master node that was connected to the cluster and disconnected it from the network for 5 minutes. After the network recovered, the kubelet reconnected to the apiserver as before. I then checked the kubelet's logs, and the "Too large resource version" timeouts no longer occur (both greps below return nothing).
# cat ~/test.sh 
ifconfig ens5 down
sleep 300
ifconfig ens5 up

# ./test.sh &

# pwd
/var/log/pods
# grep -nr 'Timeout: Too large resource version' openshift-*
# journalctl -b -u kubelet | grep 'Timeout: Too large resource version'
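
The log check above could also be wrapped in a small helper that reports a count instead of relying on empty grep output. This is only a sketch: the function name count_too_large_rv is a hypothetical name chosen for illustration, not something used in the verification above, and only the search string and the journalctl invocation come from this comment.

```shell
#!/usr/bin/env bash
# Sketch: count log lines that contain the error message from this bug.
# count_too_large_rv is an illustrative name, not part of any OpenShift tooling.

PATTERN='Timeout: Too large resource version'

# Reads log text on stdin and prints the number of lines containing PATTERN.
count_too_large_rv() {
    # grep -c prints the count but exits with status 1 when the count is zero;
    # "|| true" keeps a clean log from being treated as a failure.
    grep -c -F -- "$PATTERN" || true
}

# Possible use on a node, reusing the journalctl command from above:
#   journalctl -b -u kubelet | count_too_large_rv
```

A result of 0 after the network has been restored would correspond to the empty grep output shown above.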

Comment 8 errata-xmlrpc 2020-10-27 16:38:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
