Bug 1842673

Summary: openshift-kube-proxy pods are crashing when upgrading from 4.3.8 to 4.3.10
Product: OpenShift Container Platform
Component: Networking
Sub Component: openshift-sdn
Reporter: Jaspreet Kaur <jkaur>
Assignee: Alexander Constantinescu <aconstan>
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA
Severity: high
Priority: high
CC: bbennett
Version: 4.3.z
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Clones: 1843940
Bug Blocks: 1843940, 1843943, 1843944
Last Closed: 2020-10-27 16:03:25 UTC
Type: Bug

Description Jaspreet Kaur 2020-06-01 19:48:13 UTC
Description of problem: The cluster was installed with 4.3.8 and uses a 3rd party network plugin. When upgrading to 4.3.10, the upgrade is blocked by the bug below:

https://bugzilla.redhat.com/show_bug.cgi?id=1820778 


Version-Release number of selected component (if applicable):


How reproducible:

Install 4.3.8 with a custom network plugin, then try to upgrade to 4.3.10. The kube-proxy pods end up in CrashLoopBackOff. The error shows as below:

  finishedAt: "2020-05-29T07:35:12Z"
        message: "ut mounting /lib/modules\nW0529 07:33:42.293988       1 proxier.go:597]
          Failed to load kernel module ip_vs with modprobe. You can ignore this message
          when kube-proxy is running inside container without mounting /lib/modules\nW0529
          07:33:42.294780       1 proxier.go:597] Failed to load kernel module ip_vs_rr
          with modprobe. You can ignore this message when kube-proxy is running inside
          container without mounting /lib/modules\nW0529 07:33:42.295505       1 proxier.go:597]
          Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this
          message when kube-proxy is running inside container without mounting /lib/modules\nW0529
          07:33:42.296201       1 proxier.go:597] Failed to load kernel module ip_vs_sh
          with modprobe. You can ignore this message when kube-proxy is running inside
          container without mounting /lib/modules\nW0529 07:33:42.296933       1 proxier.go:597]
          Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore
          this message when kube-proxy is running inside container without mounting
          /lib/modules\nI0529 07:33:42.297182       1 server.go:494] Neither kubeconfig
          file nor master URL was specified. Falling back to in-cluster config.\nI0529
          07:33:42.313013       1 node.go:135] Successfully retrieved node IP: 172.22.104.26\nI0529
          07:33:42.313039       1 server_others.go:146] Using iptables Proxier.\nI0529
          07:33:42.313332       1 server.go:529] Version: v0.0.0-master+$Format:%h$\nI0529
          07:33:42.314087       1 conntrack.go:52] Setting nf_conntrack_max to 131072\nI0529
          07:33:42.314445       1 config.go:313] Starting service config controller\nI0529
          07:33:42.314459       1 shared_informer.go:197] Waiting for caches to sync
          for service config\nI0529 07:33:42.314986       1 config.go:131] Starting
          endpoints config controller\nI0529 07:33:42.315000       1 shared_informer.go:197]
          Waiting for caches to sync for endpoints config\nI0529 07:33:42.414678       1
          shared_informer.go:204] Caches are synced for service config \nI0529 07:33:42.415117
reason: Error
        startedAt: "2020-05-29T07:33:42Z"
    name: kube-proxy
    ready: false
    restartCount: 334
    started: false
    state:
      waiting:
        message: back-off 5m0s restarting failed container=kube-proxy pod=openshift-kube-proxy-2h7tw_openshift-kube-proxy(be11c6f8-1b91-4917-8605-f85be7012b9f)
        reason: CrashLoopBackOff



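For context, the Cluster Network Operator only deploys the openshift-kube-proxy DaemonSet when a third-party network plugin is configured. A minimal sketch of such a Network operator config follows; the networkType value is illustrative, not taken from this report:

```yaml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  networkType: Calico        # illustrative third-party plugin name
  deployKubeProxy: true      # tells CNO to manage the kube-proxy pods that crashloop here
```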

Steps to Reproduce:
1. Install 4.3.8 with a custom network plugin.
2. Upgrade to 4.3.10.
3. Observe the openshift-kube-proxy pods in CrashLoopBackOff.

Actual results: The upgrade to 4.3.10 fails, with the openshift-kube-proxy pods in CrashLoopBackOff.


Expected results: The upgrade should succeed when a 3rd party network plugin is in use.


Additional info:

Blocker bug : https://bugzilla.redhat.com/show_bug.cgi?id=1820778

Comment 1 Ben Bennett 2020-06-04 13:07:00 UTC
We need to document the procedure for upgrading through this:
- Something like setting CNO to unmanaged so that the readiness probe on kube-proxy can be disabled, which then allows the upgrade to proceed.

Unless there is another way to force the upgrade.
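A hedged sketch of the kind of override described above: marking the network operator's Deployment as unmanaged in the ClusterVersion resource, so the kube-proxy DaemonSet can then be edited by hand. The resource names below are the usual ones for CNO, but this is an illustration of the idea, not a verified procedure for this cluster:

```yaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
spec:
  overrides:
  - kind: Deployment
    group: apps
    name: network-operator
    namespace: openshift-network-operator
    unmanaged: true          # stop CVO from reconciling CNO
```

With CNO unmanaged, the readiness probe could then be removed from the openshift-kube-proxy DaemonSet so the pods stay up long enough for the upgrade to complete.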

Comment 2 Ben Bennett 2020-06-04 13:08:35 UTC
Note that this is not an issue in anything but 4.3.8, so we are cloning it and closing this for releases other than 4.3.

Comment 3 Ben Bennett 2020-06-04 13:21:35 UTC
Moving to ON QA since this is not a bug in 4.6.

Comment 4 zhaozhanqi 2020-06-09 07:47:09 UTC
Move this to 'verified'

Comment 7 errata-xmlrpc 2020-10-27 16:03:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196