Bug 1828752
Summary: | [upgrade] Fail to upgrade from 4.3 to 4.4 with OVN network | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | zhaozhanqi <zzhao>
Component: | Networking | Assignee: | Dan Winship <danw>
Networking sub component: | ovn-kubernetes | QA Contact: | zhaozhanqi <zzhao>
Status: | CLOSED WONTFIX | Docs Contact: |
Severity: | urgent | |
Priority: | high | CC: | aconstan, anusaxen, scuppett, xtian
Version: | 4.4 | Keywords: | Regression, Upgrades
Target Milestone: | --- | |
Target Release: | 4.4.z | |
Hardware: | All | |
OS: | All | |
Whiteboard: | SDN-CI-IMPACT,SDN-BP | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1824522 | Environment: |
Last Closed: | 2020-05-15 12:42:44 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1824522 | |
Bug Blocks: | | |
Attachments: |
Need to backport https://github.com/ovn-org/ovn-kubernetes/pull/1309

The upgrade still fails when going from 4.3.0-0.nightly-2020-05-15-004013 to 4.4.0-0.nightly-2020-05-15-002555:

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h8m
cloud-credential                           4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h22m
cluster-autoscaler                         4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h15m
console                                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      53m
csi-snapshot-controller                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      55m
dns                                        4.3.0-0.nightly-2020-05-15-004013   True        False         False      6h19m
etcd                                       4.4.0-0.nightly-2020-05-15-002555   True        False         False      65m
image-registry                             4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h14m
ingress                                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      90m
insights                                   4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h15m
kube-apiserver                             4.4.0-0.nightly-2020-05-15-002555   True        False         False      64m
kube-controller-manager                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      62m
kube-scheduler                             4.4.0-0.nightly-2020-05-15-002555   True        False         False      63m
kube-storage-version-migrator              4.4.0-0.nightly-2020-05-15-002555   True        False         False      56m
machine-api                                4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h19m
machine-config                             4.3.0-0.nightly-2020-05-15-004013   True        False         False      6h19m
marketplace                                4.4.0-0.nightly-2020-05-15-002555   True        False         False      54m
monitoring                                 4.4.0-0.nightly-2020-05-15-002555   False       True          True       49m
network                                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h20m
node-tuning                                4.4.0-0.nightly-2020-05-15-002555   True        False         False      55m
openshift-apiserver                        4.4.0-0.nightly-2020-05-15-002555   True        False         False      60m
openshift-controller-manager               4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h20m
openshift-samples                          4.4.0-0.nightly-2020-05-15-002555   False       True          True       51m
operator-lifecycle-manager                 4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h16m
operator-lifecycle-manager-catalog         4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h16m
operator-lifecycle-manager-packageserver   4.4.0-0.nightly-2020-05-15-002555   False       True          False      50m
service-ca                                 4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h20m
service-catalog-apiserver                  4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h16m
service-catalog-controller-manager         4.4.0-0.nightly-2020-05-15-002555   True        False         False      6h16m
storage                                    4.4.0-0.nightly-2020-05-15-002555   True        False         False      55m

Only one of the three pods in openshift-apiserver is running; the other two are in CrashLoopBackOff:

$ oc get pod -n openshift-apiserver -o wide
NAME                         READY   STATUS             RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
apiserver-6678c68cd4-9d5rl   0/1     CrashLoopBackOff   15         64m   10.128.0.15   ip-10-0-175-168.us-east-2.compute.internal   <none>           <none>
apiserver-6678c68cd4-cmbgk   1/1     Running            0          64m   10.129.0.43   ip-10-0-131-24.us-east-2.compute.internal    <none>           <none>
apiserver-6678c68cd4-krqhd   0/1     CrashLoopBackOff   15         65m   10.130.0.14   ip-10-0-150-174.us-east-2.compute.internal   <none>           <none>

The crashing pods still cannot reach the kubernetes service:

$ oc logs apiserver-6678c68cd4-9d5rl -n openshift-apiserver
Copying system trust bundle
F0515 09:05:45.042819 1 cmd.go:72] unable to load configmap based request-header-client-ca-file: Get https://172.30.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 172.30.0.1:443: i/o timeout

Created attachment 1688845 [details]
ovn master logs
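When triaging a stalled upgrade like the one above, it can help to filter the `oc get co` table down to the operators that are either still on the old version or not healthy. The following is a minimal sketch, not part of the original report: the function name `find_unhealthy_operators` is hypothetical, and it assumes the whitespace-separated NAME / VERSION / AVAILABLE / PROGRESSING / DEGRADED layout shown in this bug. The sample rows are copied from the output above.

```python
def find_unhealthy_operators(table_text, target_version):
    """Return (name, reasons) for operators that are not fully upgraded and healthy.

    Assumes the first five whitespace-separated columns are
    NAME VERSION AVAILABLE PROGRESSING DEGRADED (an assumption based on
    the output pasted in this bug); any later columns are ignored.
    """
    problems = []
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the header row
        name, version, available, progressing, degraded = line.split()[:5]
        reasons = []
        if version != target_version:
            reasons.append("version=" + version)
        if available != "True":
            reasons.append("unavailable")
        if progressing != "False":
            reasons.append("progressing")
        if degraded != "False":
            reasons.append("degraded")
        if reasons:
            problems.append((name, reasons))
    return problems


TARGET = "4.4.0-0.nightly-2020-05-15-002555"

# A few rows taken from the `oc get co` output in this report.
SAMPLE = """
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
dns 4.3.0-0.nightly-2020-05-15-004013 True False False 6h19m
monitoring 4.4.0-0.nightly-2020-05-15-002555 False True True 49m
network 4.4.0-0.nightly-2020-05-15-002555 True False False 6h20m
operator-lifecycle-manager-packageserver 4.4.0-0.nightly-2020-05-15-002555 False True False 50m
"""

if __name__ == "__main__":
    for name, reasons in find_unhealthy_operators(SAMPLE, TARGET):
        print(name + ": " + ", ".join(reasons))
```

Applied to the full table in this report, the same check would flag dns and machine-config (still on 4.3) along with monitoring, openshift-samples, and operator-lifecycle-manager-packageserver.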
The ovnkube-master logs show that the bug from the original report is fixed; i.e., this error no longer appears:

Failed to add logical port to router, stderr: "ovn-nbctl: rtos-zzhao43ovnup2-ws6vv-worker-westus-vqkpc: port already exists with mac 0A:58:0A:83:00:01\n", error: OVN command '/usr/bin/ovn-nbctl --timeout=15 --may-exist lrp-add ovn_cluster_router rtos-zzhao43ovnup2-ws6vv-worker-westus-vqkpc 0a:58:0a:83:00:01 10.131.0.1/23' failed: exit status 1

This should fix upgrades from older 4.4.z releases to newer 4.4.z releases. We don't actually support 4.3 to 4.4 ovn-kubernetes upgrades; fixing this would require backporting more fixes to 4.3, which we are not doing at this time.
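One way to confirm that the original error is gone from an ovnkube-master log is to scan it for the "port already exists with mac" message. This is a rough sketch under stated assumptions: the helper `find_port_conflicts` and the regular expression are illustrative only, and the pattern is derived solely from the single error line quoted in this bug, so other log formats may not match.

```python
import re

# Pattern modelled on the one error line quoted in this bug (an assumption,
# not an official ovn-kubernetes log format).
PORT_EXISTS_RE = re.compile(
    r"ovn-nbctl: (?P<port>\S+): port already exists with mac "
    r"(?P<mac>[0-9A-Fa-f:]{17})"
)


def find_port_conflicts(log_text):
    """Return (port, mac) pairs for each 'port already exists' error found."""
    return [(m.group("port"), m.group("mac"))
            for m in PORT_EXISTS_RE.finditer(log_text)]


# The error text from the original report; the raw string keeps the literal \n.
SAMPLE_LOG = (
    r'Failed to add logical port to router, stderr: "ovn-nbctl: '
    r'rtos-zzhao43ovnup2-ws6vv-worker-westus-vqkpc: port already exists '
    r'with mac 0A:58:0A:83:00:01\n", error: OVN command failed'
)

if __name__ == "__main__":
    conflicts = find_port_conflicts(SAMPLE_LOG)
    if conflicts:
        for port, mac in conflicts:
            print("conflict on " + port + " (existing mac " + mac + ")")
    else:
        print("no 'port already exists' errors found")
```

An empty result on a post-upgrade log is consistent with the fix being in place, but it does not by itself show that the upgrade succeeded.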
Created attachment 1682411 [details] ovn master logs from 4.3 to 4.4