Bug 1960199

Summary: Container pod cannot access the Kubernetes service after upgrade from 4.5 to 4.6
Product: OpenShift Container Platform
Reporter: zhaozhanqi <zzhao>
Component: Networking
Assignee: Jacob Tanenbaum <jtanenba>
Networking sub component: openshift-sdn
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: high
Priority: unspecified
CC: aconstan
Version: 4.6.z
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-07 15:28:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description zhaozhanqi 2021-05-13 10:00:01 UTC
Description of problem:

Upgrade path 4.1.41-x86_64->4.2.36-x86_64->4.3.40-x86_64->4.4.33-x86_64->4.5.37-x86_64->4.6.26-x86_64->4.7.8-x86_64	

IPI on AWS (FIPS off)

After the upgrade to 4.6.26, a node was found marked as SchedulingDisabled.

See http://virt-openshift-05.lab.eng.nay.redhat.com/ci-logs/upgrade_CI/13530/log

The must-gather logs show that the router pod cannot reach the Kubernetes service:

2021-04-22T08:30:21.457587166Z E0422 08:30:21.457512       1 reflector.go:127] github.com/openshift/router/pkg/router/controller/factory/factory.go:125: Failed to watch *v1.Route: failed to list *v1.Route: Get "https://172.30.0.1:443/apis/route.openshift.io/v1/routes?limit=500&resourceVersion=0": dial tcp 172.30.0.1:443: connect: no route to host
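
To tell "no route to host" apart from a refused or timed-out connection when re-checking from a pod on the affected node, a plain TCP probe against the service address in the error above may be enough. This is a minimal sketch, assuming Python 3 is available in the pod; `probe` is a hypothetical helper and not part of any OpenShift tooling.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Attempt one TCP connect and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer at all within the timeout.
        return False, "timed out"
    except OSError as exc:
        # EHOSTUNREACH corresponds to "no route to host",
        # ECONNREFUSED to "connection refused".
        name = errno.errorcode.get(exc.errno, "unknown") if exc.errno else "unknown"
        return False, f"{name}: {exc.strerror or exc}"
```

Calling `probe("172.30.0.1", 443)` targets the address from the router error; an `EHOSTUNREACH` result would match the symptom reported here.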

The sdn pod log from the same must-gather shows OVS connection failures while pods were being reattached:

must-gather.local.6539144585504830095/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-c140c43210a8165d82229d01028e3aee86bc49b459a81040b20caaa678941141/namespaces/openshift-sdn/pods/sdn-gv5tv/sdn/sdn/logs/current.log

2021-04-22T07:45:51.30282013Z I0422 07:45:51.302785  171045 node.go:443] Reattaching pod 'policy-upgrade/test-rc-zs7kd' to SDN
2021-04-22T07:45:51.429963383Z I0422 07:45:51.429905  171045 pod.go:508] CNI_ADD policy-upgrade/test-rc-zs7kd got IP 10.129.2.11, ofport 13
2021-04-22T07:45:51.430085892Z I0422 07:45:51.430045  171045 node.go:443] Reattaching pod 'openshift-dns/dns-default-p8qp7' to SDN
2021-04-22T07:45:51.552597663Z I0422 07:45:51.549973  171045 pod.go:508] CNI_ADD openshift-dns/dns-default-p8qp7 got IP 10.129.2.2, ofport 14
2021-04-22T07:45:51.552597663Z I0422 07:45:51.550141  171045 node.go:443] Reattaching pod 'ui-upgrade/hello-openshift-6cd8699fff-nhcs6' to SDN
2021-04-22T07:45:51.597603478Z I0422 07:45:51.596843  171045 ovs.go:158] Error executing ovs-vsctl: 2021-04-22T07:45:51Z|00001|jsonrpc|WARN|unix:/var/run/openvswitch/db.sock: receive error: Connection reset by peer
2021-04-22T07:45:51.597603478Z 2021-04-22T07:45:51Z|00002|reconnect|WARN|unix:/var/run/openvswitch/db.sock: connection dropped (Connection reset by peer)
2021-04-22T07:45:51.597603478Z ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection reset by peer)
2021-04-22T07:45:51.603626516Z I0422 07:45:51.603588  171045 ovs.go:158] Error executing ovs-ofctl: 2021-04-22T07:45:51Z|00001|vconn_stream|ERR|connection dropped mid-packet
2021-04-22T07:45:51.603626516Z ovs-ofctl: OpenFlow receive failed (Protocol error)
2021-04-22T07:45:52.102851287Z I0422 07:45:52.102765  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:52.108073791Z I0422 07:45:52.108038  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:52.734042402Z I0422 07:45:52.733932  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:52.7374Z I0422 07:45:52.737365  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:53.524003963Z I0422 07:45:53.523922  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:53.524053536Z I0422 07:45:53.524000  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:54.507192366Z I0422 07:45:54.507078  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:54.507192366Z I0422 07:45:54.507149  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:55.734411239Z I0422 07:45:55.734337  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:55.734511295Z I0422 07:45:55.734471  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:57.266757764Z I0422 07:45:57.266639  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:57.266922594Z I0422 07:45:57.266887  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:45:59.180645146Z I0422 07:45:59.180588  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:45:59.180695385Z I0422 07:45:59.180650  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:46:01.573128828Z I0422 07:46:01.573026  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:46:01.585574558Z I0422 07:46:01.584146  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:46:04.558616038Z I0422 07:46:04.558543  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:46:04.562978106Z I0422 07:46:04.562937  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:46:04.576717234Z I0422 07:46:04.572495  171045 ovs.go:158] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
2021-04-22T07:46:04.576717234Z E0422 07:46:04.572521  171045 networkpolicy.go:280] Error syncing OVS flows: timed out waiting for the condition
2021-04-22T07:46:05.073391911Z I0422 07:46:05.073314  171045 ovs.go:158] Error executing ovs-vsctl: ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (Connection refused)
2021-04-22T07:46:05.845607705Z W0422 07:46:05.845097  171045 pod.go:275] CNI_ADD ui-upgrade/hello-openshift-6cd8699fff-nhcs6 failed: failed to get OVS port for vethafc7808e: timed out waiting for the condition
2021-04-22T07:46:05.845607705Z W0422 07:46:05.845159  171045 node.go:449] Could not reattach pod 'ui-upgrade/hello-openshift-6cd8699fff-nhcs6' to SDN: failed to get OVS port for vethafc7808e: timed out waiting for the condition
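
The excerpt above is mostly repeated OVS command failures followed by a failed pod reattach. When comparing this log with a reproduction run, a short script can reduce it to counts per OVS tool and a list of pods that were not reattached. This is a sketch only: the patterns are derived from the lines quoted in this report and are assumed, not guaranteed, to match other openshift-sdn versions; `summarize` is a hypothetical helper.

```python
import re
from collections import Counter

# Patterns taken from the log lines quoted in this report, for example:
#   "... ovs.go:158] Error executing ovs-vsctl: ..."
#   "... Could not reattach pod 'ns/name' to SDN: ..."
OVS_ERROR = re.compile(r"ovs\.go:\d+\] Error executing (ovs-\w+):")
REATTACH_FAIL = re.compile(r"Could not reattach pod '([^']+)' to SDN")


def summarize(lines):
    """Count OVS command failures per tool and collect pods that failed to reattach."""
    errors = Counter()
    failed_pods = []
    for line in lines:
        match = OVS_ERROR.search(line)
        if match:
            errors[match.group(1)] += 1
        match = REATTACH_FAIL.search(line)
        if match:
            failed_pods.append(match.group(1))
    return errors, failed_pods
```

Passing an open file handle for the `current.log` file named above to `summarize` yields the two summaries without loading the whole log into memory.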



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Upgrade from 4.5 to 4.6

Actual results:


Expected results:


Additional info:

Comment 2 zhaozhanqi 2021-05-13 10:04:23 UTC
Running the build again with the same upgrade path to see whether the issue can be reproduced.