Bug 1842673 - openshift-kube-proxy pods are crashing when upgrading from 4.3.8 to 4.3.10
Summary: openshift-kube-proxy pods are crashing when upgrading from 4.3.8 to 4.3.10
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Alexander Constantinescu
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks: 1843940 1843943 1843944
Reported: 2020-06-01 19:48 UTC by Jaspreet Kaur
Modified: 2020-10-27 16:03 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1843940 (view as bug list)
Environment:
Last Closed: 2020-10-27 16:03:25 UTC
Target Upstream Version:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:4196 0 None None None 2020-10-27 16:03:31 UTC

Description Jaspreet Kaur 2020-06-01 19:48:13 UTC
Description of problem: The cluster was installed with 4.3.8 and uses a third-party network plugin. The upgrade to 4.3.10 is blocked by the bug below:

https://bugzilla.redhat.com/show_bug.cgi?id=1820778 

Version-Release number of selected component (if applicable):


How reproducible:

Install 4.3.8 with a custom network plugin, then try to upgrade to 4.3.10. The kube-proxy pods end up in CrashLoopBackOff state. The error is shown below:

  finishedAt: "2020-05-29T07:35:12Z"
        message: "ut mounting /lib/modules\nW0529 07:33:42.293988       1 proxier.go:597]
          Failed to load kernel module ip_vs with modprobe. You can ignore this message
          when kube-proxy is running inside container without mounting /lib/modules\nW0529
          07:33:42.294780       1 proxier.go:597] Failed to load kernel module ip_vs_rr
          with modprobe. You can ignore this message when kube-proxy is running inside
          container without mounting /lib/modules\nW0529 07:33:42.295505       1 proxier.go:597]
          Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this
          message when kube-proxy is running inside container without mounting /lib/modules\nW0529
          07:33:42.296201       1 proxier.go:597] Failed to load kernel module ip_vs_sh
          with modprobe. You can ignore this message when kube-proxy is running inside
          container without mounting /lib/modules\nW0529 07:33:42.296933       1 proxier.go:597]
          Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore
          this message when kube-proxy is running inside container without mounting
          /lib/modules\nI0529 07:33:42.297182       1 server.go:494] Neither kubeconfig
          file nor master URL was specified. Falling back to in-cluster config.\nI0529
          07:33:42.313013       1 node.go:135] Successfully retrieved node IP: 172.22.104.26\nI0529
          07:33:42.313039       1 server_others.go:146] Using iptables Proxier.\nI0529
          07:33:42.313332       1 server.go:529] Version: v0.0.0-master+$Format:%h$\nI0529
          07:33:42.314087       1 conntrack.go:52] Setting nf_conntrack_max to 131072\nI0529
          07:33:42.314445       1 config.go:313] Starting service config controller\nI0529
          07:33:42.314459       1 shared_informer.go:197] Waiting for caches to sync
          for service config\nI0529 07:33:42.314986       1 config.go:131] Starting
          endpoints config controller\nI0529 07:33:42.315000       1 shared_informer.go:197]
          Waiting for caches to sync for endpoints config\nI0529 07:33:42.414678       1
          shared_informer.go:204] Caches are synced for service config \nI0529 07:33:42.415117
        reason: Error
        startedAt: "2020-05-29T07:33:42Z"
    name: kube-proxy
    ready: false
    restartCount: 334
    started: false
    state:
      waiting:
        message: back-off 5m0s restarting failed container=kube-proxy pod=openshift-kube-proxy-2h7tw_openshift-kube-proxy(be11c6f8-1b91-4917-8605-f85be7012b9f)
        reason: CrashLoopBackOff
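
The `message` field above is a run of klog-formatted lines joined with `\n` and wrapped by the YAML output. As a rough aid for reading it, the sketch below splits such lines by severity letter (`I`, `W`, `E`, `F`). The sample lines are copied from the excerpt above with the wrapping undone; the helper name `parse_klog` is made up for this illustration. In the visible excerpt only `W` and `I` lines appear, and the `W` lines are the modprobe warnings that the log itself says can be ignored. Because the message is cut off at both ends, the absence of an `E` or `F` line here is not conclusive.

```python
import re
from collections import Counter

# klog header layout: severity letter, mmdd, hh:mm:ss.uuuuuu, thread id,
# file:line], then the message text.
KLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_klog(text):
    """Return one dict per line that matches the klog header layout."""
    entries = []
    for raw in text.splitlines():
        match = KLOG_LINE.match(raw.strip())
        if match:
            entries.append(match.groupdict())
    return entries


# Three lines taken from the pod status in this report, unwrapped.
SAMPLE = (
    "W0529 07:33:42.293988       1 proxier.go:597] Failed to load kernel "
    "module ip_vs with modprobe. You can ignore this message when kube-proxy "
    "is running inside container without mounting /lib/modules\n"
    "I0529 07:33:42.313039       1 server_others.go:146] Using iptables Proxier.\n"
    "I0529 07:33:42.414678       1 shared_informer.go:204] Caches are synced "
    "for service config\n"
)

entries = parse_klog(SAMPLE)
by_severity = Counter(entry["sev"] for entry in entries)

for entry in entries:
    print(entry["sev"], entry["file"], entry["msg"][:60])
print(dict(by_severity))
```

To see the untruncated log, the full container output would be needed rather than the terminated-state `message`, which holds only the tail of it.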




Steps to Reproduce:
1.
2.
3.

Actual results: Failed to upgrade to 4.3.10


Expected results: The upgrade should have succeeded with the third-party network plugin.


Additional info:

Blocker bug: https://bugzilla.redhat.com/show_bug.cgi?id=1820778

Comment 1 Ben Bennett 2020-06-04 13:07:00 UTC
We need to document the procedure for upgrading in this situation:
- Something like disabling the Cluster Network Operator (CNO) so that you can disable the readiness probe on kube-proxy, which then lets the upgrade proceed.

Unless there is another way to force the upgrade.
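
To make the "disable the readiness probe" idea concrete, here is a minimal sketch of what that change amounts to on a DaemonSet manifest. It is not a documented or tested upgrade procedure. The manifest is a hypothetical, stripped-down stand-in: the DaemonSet name is inferred from the pod name in the report, the container name `kube-proxy` comes from the pod status, and the probe contents are placeholders. The function only edits an in-memory dictionary; applying the result to a cluster, and keeping the CNO from reverting it, is outside what this shows.

```python
import copy


def strip_readiness_probe(daemonset, container_name):
    """Return a copy of the manifest with readinessProbe removed from one container."""
    patched = copy.deepcopy(daemonset)
    containers = patched["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") == container_name:
            # pop with a default so a container without a probe is left unchanged
            container.pop("readinessProbe", None)
    return patched


# Hypothetical minimal manifest; the probe path and port are placeholders,
# not values taken from the real openshift-kube-proxy DaemonSet.
EXAMPLE_DAEMONSET = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {
        "name": "openshift-kube-proxy",
        "namespace": "openshift-kube-proxy",
    },
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "kube-proxy",
                        "readinessProbe": {
                            "httpGet": {"path": "/healthz", "port": 10256}
                        },
                    }
                ]
            }
        }
    },
}

patched = strip_readiness_probe(EXAMPLE_DAEMONSET, "kube-proxy")
print(patched["spec"]["template"]["spec"]["containers"][0])
```

The copy is deliberate: the original manifest stays available for restoring the probe once the upgrade has gone through.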

Comment 2 Ben Bennett 2020-06-04 13:08:35 UTC
Note that this is an issue only in 4.3.8, so I am duplicating this bug and closing it for releases other than 4.3.

Comment 3 Ben Bennett 2020-06-04 13:21:35 UTC
Moving to ON QA since this is not a bug in 4.6.

Comment 4 zhaozhanqi 2020-06-09 07:47:09 UTC
Moving this to 'verified'.

Comment 7 errata-xmlrpc 2020-10-27 16:03:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

