Bug 1877367 - KubeAPIErrorsHigh firing due to "too large resource version"
Summary: KubeAPIErrorsHigh firing due to "too large resource version"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Lukasz Szaszkiewicz
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On:
Blocks: 1877346
 
Reported: 2020-09-09 13:15 UTC by Lukasz Szaszkiewicz
Modified: 2020-10-28 07:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1877346
Environment:
Last Closed: 2020-10-27 16:38:53 UTC
Target Upstream Version:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2020:4196  0        None      None    None     2020-10-27 16:39:12 UTC

Comment 1 Lukasz Szaszkiewicz 2020-09-09 13:20:23 UTC
Once the rebase (1.19) PR lands https://github.com/openshift/kubernetes/pull/325 the fixes (https://github.com/openshift/origin/pull/25489 and https://github.com/openshift/origin/pull/25490) will be present in 4.6.

Comment 4 Ke Wang 2020-09-22 11:10:02 UTC
(In reply to Lukasz Szaszkiewicz from comment #1)
> Once the rebase (1.19) PR lands
> https://github.com/openshift/kubernetes/pull/325 the fixes
> (https://github.com/openshift/origin/pull/25489 and
> https://github.com/openshift/origin/pull/25490) will be present in 4.6.

Hi Lukasz, PRs 25489 and 25490 have not been merged yet. Could you take a look? Verification cannot proceed until they are merged.

Comment 5 Ke Wang 2020-09-23 03:20:44 UTC
CC: lszaszki@redhat.com. PRs 25489 and 25490 are 4.5 fixes; we need the corresponding fixes for 4.6 here.

Comment 6 Ke Wang 2020-09-23 15:07:06 UTC
OCP 4.6 has already been rebased to kube 1.19. To verify, start with a master node that is connected to the cluster and disconnect it from the network for 5 minutes. After the network recovers, the kubelet reconnects to the API server as before. The kubelet's logs then show that the "Timeout: Too large resource version" errors no longer occur (the searches below return no matches).
# cat ~/test.sh 
ifconfig ens5 down
sleep 300
ifconfig ens5 up

# ./test.sh &

# pwd
/var/log/pods
# grep -nr 'Timeout: Too large resource version' openshift-*
# journalctl -b -u kubelet | grep 'Timeout: Too large resource version'
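
The log check above can also be written as a count, which makes a clean result explicit (0 rather than empty output). This is a minimal sketch, not part of the original verification: the function name count_rv_timeouts is hypothetical, and it only assumes that the log text arrives on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): count the log lines that
# contain the error message searched for above. Reads log text from stdin.
count_rv_timeouts() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so "|| true" keeps a clean log from looking
    # like a failure to the caller.
    grep -c 'Timeout: Too large resource version' || true
}

# Example usage on a node (assumes journalctl and the kubelet unit exist):
#   journalctl -b -u kubelet | count_rv_timeouts
```

A result of 0 after the network has recovered corresponds to the empty grep output shown above.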

Comment 8 errata-xmlrpc 2020-10-27 16:38:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

