Description of problem:
Upgrade from 4.0.0-0.nightly-2019-04-02-081046 to 4.0.0-0.nightly-2019-04-02-133735 failed. It hung at this point for more than 1 hour:

[jzhang@dhcp-140-18 ocp43]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-02-133735    True        True          80m     Working towards 4.0.0-0.nightly-2019-04-02-133735: 18% complete
[jzhang@dhcp-140-18 ocp43]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-02-133735    True        True          81m     Unable to apply 4.0.0-0.nightly-2019-04-02-133735: the cluster operator kube-apiserver is failing

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:

Steps to Reproduce:
1. Install OCP 4.0 with the payload 4.0.0-0.nightly-2019-04-02-081046.
2. Install the etcd operator.
3. Change the "upstream" value to "https://openshift-release.svc.ci.openshift.org/graph" and run "oc adm upgrade --to 4.0.0-0.nightly-2019-04-02-133735" (see the CLI sketch below).

Actual results:
The upgrade failed.

[jzhang@dhcp-140-18 ocp43]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-02-133735    True        True          80m     Working towards 4.0.0-0.nightly-2019-04-02-133735: 18% complete
[jzhang@dhcp-140-18 ocp43]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-02-133735    True        True          81m     Unable to apply 4.0.0-0.nightly-2019-04-02-133735: the cluster operator kube-apiserver is failing
[jzhang@dhcp-140-18 ocp43]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-04-02-133735    True        True          89m     Working towards 4.0.0-0.nightly-2019-04-02-133735: 18% complete
[jzhang@dhcp-140-18 ocp43]$ oc get pods -n openshift-kube-apiserver
NAME                                                            READY   STATUS      RESTARTS   AGE
installer-9-ip-10-0-154-74.ap-south-1.compute.internal          0/1     Error       0          45m
kube-apiserver-ip-10-0-130-14.ap-south-1.compute.internal       2/2     Running     2          62m
kube-apiserver-ip-10-0-138-187.ap-south-1.compute.internal      2/2     Running     2          60m
kube-apiserver-ip-10-0-154-74.ap-south-1.compute.internal       2/2     Running     2          63m
revision-pruner-8-ip-10-0-130-14.ap-south-1.compute.internal    0/1     Completed   0          50m
revision-pruner-8-ip-10-0-138-187.ap-south-1.compute.internal   0/1     Completed   0          46m
revision-pruner-8-ip-10-0-154-74.ap-south-1.compute.internal    0/1     Completed   0          45m
revision-pruner-9-ip-10-0-154-74.ap-south-1.compute.internal    0/1     Completed   0          44m

Expected results:
The upgrade succeeds.
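For reference, the reproduction above (in particular step 3) can be driven from the CLI roughly like this. This is only a sketch, assuming cluster-admin credentials; the version strings are the ones from this report, and the CVO then drives the upgrade itself:

oc patch clusterversion version --type merge -p '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}'
oc adm upgrade --to 4.0.0-0.nightly-2019-04-02-133735
oc get clusterversion version -o yaml    # watch status.conditions for Progressing / Failing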
Additional info:
Here are the whole "installer-9-ip-10-0-154-74.ap-south-1.compute.internal" logs:

[jzhang@dhcp-140-18 ocp43]$ oc logs installer-9-ip-10-0-154-74.ap-south-1.compute.internal -n openshift-kube-apiserver
I0403 09:25:50.813193 1 cmd.go:76] &{<nil> true {false} installer true map[resource-dir:0xc4206c0140 configmaps:0xc42001ff40 optional-configmaps:0xc4206c00a0 cert-configmaps:0xc4206c0a00 optional-cert-secrets:0xc4206c0aa0 v:0xc420534b40 revision:0xc42001fb80 namespace:0xc42001fc20 pod:0xc42001fe00 pod-manifest-dir:0xc4206c05a0 secrets:0xc42001fea0 cert-dir:0xc4206c0be0 cert-secrets:0xc4206c0960] [0xc420534b40 0xc42001fb80 0xc42001fc20 0xc42001fe00 0xc4206c0140 0xc4206c05a0 0xc42001ff40 0xc4206c00a0 0xc42001fea0 0xc4206c0be0 0xc4206c0a00 0xc4206c0960 0xc4206c0aa0] [] map[configmaps:0xc42001ff40 pod-manifest-dir:0xc4206c05a0 cert-secrets:0xc4206c0960 optional-cert-secrets:0xc4206c0aa0 optional-cert-configmaps:0xc4206c0b40 cert-dir:0xc4206c0be0 v:0xc420534b40 revision:0xc42001fb80 optional-configmaps:0xc4206c00a0 timeout-duration:0xc4206c0640 log-dir:0xc4205345a0 stderrthreshold:0xc4205346e0 vmodule:0xc420534be0 namespace:0xc42001fc20 pod:0xc42001fe00 secrets:0xc42001fea0 optional-secrets:0xc4206c0000 resource-dir:0xc4206c0140 kubeconfig:0xc42001f5e0 alsologtostderr:0xc4205341e0 log-backtrace-at:0xc420534280 log-flush-frequency:0xc420126640 logtostderr:0xc420534640 help:0xc4206648c0 cert-configmaps:0xc4206c0a00] [0xc42001f5e0 0xc42001fb80 0xc42001fc20 0xc42001fe00 0xc42001fea0 0xc42001ff40 0xc4206c0000 0xc4206c00a0 0xc4206c0140 0xc4206c05a0 0xc4206c0640 0xc4206c0960 0xc4206c0a00 0xc4206c0aa0 0xc4206c0b40 0xc4206c0be0 0xc4205341e0 0xc420534280 0xc4205345a0 0xc420126640 0xc420534640 0xc4205346e0 0xc420534b40 0xc420534be0 0xc4206648c0] [0xc4205341e0 0xc4206c0a00 0xc4206c0be0 0xc4206c0960 0xc42001ff40 0xc4206648c0 0xc42001f5e0 0xc420534280 0xc4205345a0 0xc420126640 0xc420534640 0xc42001fc20 0xc4206c0b40 0xc4206c0aa0 0xc4206c00a0 0xc4206c0000 0xc42001fe00 0xc4206c05a0 0xc4206c0140 0xc42001fb80 0xc42001fea0 0xc4205346e0 0xc4206c0640 0xc420534b40 0xc420534be0] map[118:0xc420534b40 104:0xc4206648c0] [] -1 0 0xc420481a40 true <nil> []}
I0403 09:25:50.813704 1 cmd.go:77] (*installerpod.InstallOptions)(0xc4203ebb80)({
 KubeConfig: (string) "",
 KubeClient: (kubernetes.Interface) <nil>,
 Revision: (string) (len=1) "9",
 Namespace: (string) (len=24) "openshift-kube-apiserver",
 PodConfigMapNamePrefix: (string) (len=18) "kube-apiserver-pod",
 SecretNamePrefixes: ([]string) (len=3 cap=4) {
  (string) (len=11) "etcd-client",
  (string) (len=42) "kube-apiserver-cert-syncer-client-cert-key",
  (string) (len=14) "kubelet-client"
 },
 OptionalSecretNamePrefixes: ([]string) <nil>,
 ConfigMapNamePrefixes: ([]string) (len=5 cap=8) {
  (string) (len=18) "kube-apiserver-pod",
  (string) (len=6) "config",
  (string) (len=37) "kube-apiserver-cert-syncer-kubeconfig",
  (string) (len=15) "etcd-serving-ca",
  (string) (len=18) "kubelet-serving-ca"
 },
 OptionalConfigMapNamePrefixes: ([]string) (len=4 cap=4) {
  (string) (len=14) "oauth-metadata",
  (string) (len=30) "initial-sa-token-signing-certs",
  (string) (len=24) "kube-apiserver-server-ca",
  (string) (len=46) "kube-controller-manager-sa-token-signing-certs"
 },
 CertSecretNames: ([]string) (len=2 cap=2) {
  (string) (len=17) "aggregator-client",
  (string) (len=12) "serving-cert"
 },
 OptionalCertSecretNamePrefixes: ([]string) (len=11 cap=16) {
  (string) (len=17) "user-serving-cert",
  (string) (len=21) "user-serving-cert-000",
  (string) (len=21) "user-serving-cert-001",
  (string) (len=21) "user-serving-cert-002",
  (string) (len=21) "user-serving-cert-003",
  (string) (len=21) "user-serving-cert-004",
  (string) (len=21) "user-serving-cert-005",
  (string) (len=21) "user-serving-cert-006",
  (string) (len=21) "user-serving-cert-007",
  (string) (len=21) "user-serving-cert-008",
  (string) (len=21) "user-serving-cert-009"
 },
 CertConfigMapNamePrefixes: ([]string) (len=2 cap=2) {
  (string) (len=20) "aggregator-client-ca",
  (string) (len=9) "client-ca"
 },
 OptionalCertConfigMapNamePrefixes: ([]string) <nil>,
 CertDir: (string) (len=57) "/etc/kubernetes/static-pod-resources/kube-apiserver-certs",
 ResourceDir: (string) (len=36) "/etc/kubernetes/static-pod-resources",
 PodManifestDir: (string) (len=25) "/etc/kubernetes/manifests",
 Timeout: (time.Duration) 2m0s,
 PodMutationFns: ([]installerpod.PodMutationFunc) <nil>
})
F0403 09:26:00.475326 1 cmd.go:89] failed to get self-reference: Get https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods/installer-9-ip-10-0-154-74.ap-south-1.compute.internal: dial tcp 172.30.0.1:443: connect: no route to host
> F0403 09:26:00.475326 1 cmd.go:89] failed to get self-reference: Get https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods/installer-9-ip-10-0-154-74.ap-south-1.compute.internal: dial tcp 172.30.0.1:443: connect: no route to host

This looks like a networking error.
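One way to confirm that interpretation on the affected node is to check whether the service VIP has been programmed into iptables at all. A diagnostic sketch, using the node name from the installer pod above:

oc debug node/ip-10-0-154-74.ap-south-1.compute.internal
# then, inside the debug shell:
chroot /host
iptables-save | grep -c 172.30.0.1    # 0 means the kubernetes service VIP has not been programmed yet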
The installer pod logged its error at 09:26:00. The sdn pod on that node first observed the existence of the default/kubernetes service only 2 seconds before that:

I0403 09:25:58.208561 4167 service.go:319] Adding new service port "default/kubernetes:https" at 172.30.0.1:443/TCP

Given that a "no route to host" error would take a few seconds of network failure to trigger, it's likely that the installer pod had started its attempt to connect to 172.30.0.1 before that point (and it's not clear from the logs exactly when sdn/kube-proxy actually created the iptables rules for the service anyway; the iptables proxier would wait until at least 1 second after the last of the burst of initial updates).

So it looks like the installer-pod got "no route to host" because it tried to connect to the service before openshift-sdn had fully set up service IPs. (Something that is not covered by any node readiness condition, I believe.)

One weird thing is that the sdn was actually started 20 seconds before this point, and in the logs we see:

W0403 09:25:42.835196 4167 node.go:399] Could not reattach pod 'openshift-machine-api/cluster-autoscaler-operator-848f5ff955-p7v7j' to SDN: error on port veth0f212092: "could not open network device veth0f212092 (No such device)"
I0403 09:25:57.803708 4167 node.go:350] openshift-sdn network plugin registering startup

Note the mysterious 15-second delay between the two messages. I'm not sure where that's coming from, but if it hadn't been there, then sdn would have had more than enough time to get the service IPs set up before the installer pod ran...
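For anyone retracing this timeline, the relevant events can be pulled out of the sdn pod's log on that node with something like the following (the pod name is a placeholder):

oc -n openshift-sdn logs <sdn-pod-on-the-node> | grep -E 'registering startup|network plugin ready|Adding new service port "default/kubernetes'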
> W0403 09:25:42.835196 4167 node.go:399] Could not reattach pod 'openshift-machine-api/cluster-autoscaler-operator-848f5ff955-p7v7j' to SDN: error on port veth0f212092: "could not open network device veth0f212092 (No such device)"

(just to be clear, that error is irrelevant to this bug; the node was rebooted, so we wouldn't expect to be able to reattach the pods that were running before it was rebooted. openshift-sdn just isn't smart enough currently to distinguish that from the case where it should have been able to reattach the pod)
Ugh, OK, it's because we're killing the pods after failing to reattach them, and CRI-O then ends up doing a CNI DELETE for each one. At that point multus needs to query the apiserver to figure out whether the pod has any additional networks that also need to be deleted, but that request times out because 172.30.0.1 doesn't point to anything yet.
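That dependency is easy to see on a node in this state: the GET that multus would issue during the CNI DELETE simply fails against the service VIP. A sketch run from the node itself, reusing the autoscaler pod named above (the 5-second timeout is arbitrary):

curl -sk --connect-timeout 5 https://172.30.0.1:443/api/v1/namespaces/openshift-machine-api/pods/cluster-autoscaler-operator-848f5ff955-p7v7j || echo "apiserver service VIP unreachable"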
Ugh, interesting. Multus should absolutely be checkpointing this - especially since the apiserver state could have changed after the pod was created. Doug, is there any way to configure multus to checkpoint added networks to disk, or will that require a code change? As a workaround, can we ensure the kubelet is NodeNotReady until services have been synced? It will require a bit of yucky plumbing in the proxy, but should otherwise be doable.
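On the "NodeNotReady until services have been synced" idea: as noted above, today the Ready condition goes True independently of whether the proxy has programmed any service rules, which you can see by polling it while the sdn pod is still starting up. A sketch, using the affected node from this report:

oc get node ip-10-0-154-74.ap-south-1.compute.internal -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'; echo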
CC doug and feng. Doug, Feng, multus should not rely on the apiserver being up to do a DELETE. If there's no way to get rid of that requirement, it should be connecting to the "real" apiserver as opposed to the service IP. You can get that information from the operator, since we plumb it through the environment.
(In reply to Casey Callendrello from comment #8)
> As a workaround, can we ensure the kubelet is NodeNotReady until services
> have been synced? It will require a bit of yucky plumbing in the proxy, but
> should otherwise be doable.

Well, the bigger problem is that, as things were written before, we weren't even starting the proxy until after we had finished cleaning up the old pods. origin #37414 fixes that. But after talking some more, dcbw and I decided we should probably fix up multus to use the real apiserver too, though that PR isn't written yet.
PR has merged: https://github.com/openshift/cluster-network-operator/pull/139
This should be fixed by either of https://github.com/openshift/cluster-network-operator/pull/139 or https://github.com/openshift/origin/pull/22471. (And twice as fixed with both of them!) The effect of either fix is that there's no longer a long pause in SDN startup, so service IPs should be ready now before any other pods try to access them.
Tested on 4.0.0-0.nightly-2019-04-10-182914.

The sdn pod on the node now finishes its startup in about 10 seconds:

# oc logs sdn-7c5bm | grep node.go
I0411 06:37:33.798980 41886 node.go:148] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with configured hostname "ip-10-0-149-3.ap-northeast-1.compute.internal" (IP ""), iptables sync period "30s"
I0411 06:37:33.803646 41886 node.go:267] Starting openshift-sdn network plugin
I0411 06:37:43.323230 41886 node.go:326] Starting openshift-sdn pod manager
I0411 06:37:43.324888 41886 node.go:334] OVS bridge has been recreated. Will reattach 5 existing pods...
I0411 06:37:43.332196 41886 node.go:389] Interface veth91f19b45 for pod 'openshift-image-registry/node-ca-kxt2n' no longer exists
I0411 06:37:43.336999 41886 node.go:389] Interface vethf4e807fa for pod 'openshift-monitoring/alertmanager-main-0' no longer exists
I0411 06:37:43.341730 41886 node.go:389] Interface veth3a59281d for pod 'openshift-dns/dns-default-qqs45' no longer exists
I0411 06:37:43.346474 41886 node.go:389] Interface vethbf232f67 for pod 'openshift-monitoring/prometheus-adapter-56c84b6556-tqkdm' no longer exists
I0411 06:37:43.351180 41886 node.go:389] Interface veth6ba7c3de for pod 'openshift-monitoring/grafana-56879d5757-swp68' no longer exists
I0411 06:37:43.351206 41886 node.go:350] openshift-sdn network plugin registering startup
I0411 06:37:43.351346 41886 node.go:368] openshift-sdn network plugin ready

And multus now uses the internal LB alias to reach the API server, so it no longer requires kube-proxy to have started first:

[root@ip-10-0-130-51 ~]# cat /etc/kubernetes/cni/net.d/multus.d/multus.kubeconfig
# Kubeconfig file for Multus CNI plugin.
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: https://api.qe-bmeng-ap-411.qe.devcluster.openshift.com:6443
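For completeness, the URL in that kubeconfig should match the API server URL the cluster publishes in its infrastructure config. A quick check, assuming the status.apiServerURL field of the config.openshift.io/v1 Infrastructure object is populated on this cluster:

oc get infrastructure cluster -o jsonpath='{.status.apiServerURL}'; echo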
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758