Description of problem:

For zVM environments:

1. OCP 4.10 nightly builds starting with 4.10.0-0.nightly-s390x-2021-12-18-034912 and through 4.10.0-0.nightly-s390x-2022-01-02-012917 fail to upgrade from OCP 4.9.11 and 4.9.12 when using network type OVNKubernetes (OVN).

2. These same OCP 4.10 nightly builds starting with 4.10.0-0.nightly-s390x-2021-12-18-034912 and through 4.10.0-0.nightly-s390x-2022-01-02-012917 successfully upgrade from OCP 4.9.11 and 4.9.12 when using network type openshiftSDN (OVS).

3. For these OCP 4.9.11 and 4.9.12 to OCP 4.10 upgrade failures, using the OCP nightly build 4.10.0-0.nightly-s390x-2021-12-24-235654 as an example, the "oc get clusterversion" command consistently reports:

"Unable to apply 4.10.0-0.nightly-s390x-2021-12-24-235654: wait has exceeded 40 minutes for these operators: ingress"

4. For these upgrade failures, using the OCP nightly build 4.10.0-0.nightly-s390x-2021-12-24-235654 as an example, the "oc get co" command consistently reports information similar to the following:

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.10.0-0.nightly-s390x-2021-12-24-235654 True False True 35m APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()...
baremetal 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 100m
cloud-controller-manager 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 105m
cloud-credential 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 104m
cluster-autoscaler 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 100m
config-operator 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 102m
console 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 49m
csi-snapshot-controller 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 101m
dns 4.10.0-0.nightly-s390x-2021-12-24-235654 True True True 100m DNS default is degraded
etcd 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 100m
image-registry 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 93m
ingress 4.10.0-0.nightly-s390x-2021-12-24-235654 True False True 92m The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: PodsScheduled=False (PodsNotScheduled: Some pods are not scheduled: Pod "router-default-cd6bdf7dd-h9nrx" cannot be scheduled: 0/5 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 2 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) were unschedulable. Make sure you have sufficient worker nodes.)
insights 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 94m
kube-apiserver 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 97m
kube-controller-manager 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 99m
kube-scheduler 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 100m
kube-storage-version-migrator 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 37m
machine-api 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 101m
machine-approver 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 101m
machine-config 4.9.11 True True True 101m Unable to apply 4.10.0-0.nightly-s390x-2021-12-24-235654: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for rendered-master-091362904afdd033e03a72cdede84f52 expected f8249fc84f1a7dfd655c88fae80811ee9c76c34c has ddd96b04ede2eba72afea1355468a9985aacafe6: 0 (ready 0) out of 3 nodes are updating to latest configuration rendered-master-108b993eeddf5639db907c553a9834dc, retrying
marketplace 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 100m
monitoring 4.10.0-0.nightly-s390x-2021-12-24-235654 False True True 22m Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
network 4.10.0-0.nightly-s390x-2021-12-24-235654 True True True 102m DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-2fhwv is in CrashLoopBackOff State...
node-tuning 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 93m
openshift-apiserver 4.10.0-0.nightly-s390x-2021-12-24-235654 True False True 94m APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
openshift-controller-manager 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 97m
openshift-samples 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 61m
operator-lifecycle-manager 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 101m
operator-lifecycle-manager-catalog 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 101m
operator-lifecycle-manager-packageserver 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 58m
service-ca 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 102m
storage 4.10.0-0.nightly-s390x-2021-12-24-235654 True False False 102m

5. For the network cluster operator, the ovnkube pods are in a CrashLoopBackOff state. For example:

DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-2fhwv is in CrashLoopBackOff State...

Version-Release number of selected component (if applicable):

This issue has been consistently found with the following tested builds:
1. 4.10.0-0.nightly-s390x-2021-12-18-034912
2. 4.10.0-0.nightly-s390x-2021-12-20-215258
3. 4.10.0-0.nightly-s390x-2021-12-21-231942
4. 4.10.0-0.nightly-s390x-2021-12-22-053640
5. 4.10.0-0.nightly-s390x-2021-12-23-063012
6. 4.10.0-0.nightly-s390x-2021-12-24-010839
7. 4.10.0-0.nightly-s390x-2021-12-24-154536
8. 4.10.0-0.nightly-s390x-2021-12-24-235654
9. 4.10.0-0.nightly-s390x-2022-01-02-012917

How reproducible:

Consistently reproducible.

Steps to Reproduce:
1. In a zVM environment, using network type OVNKubernetes, attempt to upgrade from OCP 4.9.11 or 4.9.12 to any OCP 4.10 nightly build between 4.10.0-0.nightly-s390x-2021-12-18-034912 and 4.10.0-0.nightly-s390x-2022-01-02-012917.

Actual results:

Upgrade fails with the above network cluster operator ovnkube pod CrashLoopBackOff issues.

Expected results:

Upgrades should succeed, as they consistently do for network type openshiftSDN (OVS).

Additional info:

Thank you.
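For reference, the state described above was observed with commands along these lines (a sketch using the same commands that appear later in this report; assumes cluster-admin access):

oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'   # confirm the cluster uses OVNKubernetes
oc get clusterversion
oc get co
oc get pods -n openshift-ovn-kubernetes -o wide                           # look for ovnkube-node pods in CrashLoopBackOff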
can you get the logs from the crashing pod? like this: oc logs -n openshift-ovn-kubernetes ovnkube-node-2fhwv -c ovnkube-node
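A hedged sketch for pulling this same log from every ovnkube-node pod at once (assumptions: the DaemonSet pods carry the app=ovnkube-node label, and --previous is used because the container is crash-looping):

for p in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o name); do   # label selector is an assumption
  echo "===== $p ====="
  oc logs -n openshift-ovn-kubernetes "$p" -c ovnkube-node --previous                 # log of the last failed run
done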
Prashanth, Thank you for your assistance and Happy New Year :) Please see below the requested information from a recreate using OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2022-01-02-012917, using the command: "oc logs -n openshift-ovn-kubernetes ovnkube-node-hrlgv -c ovnkube-node" Thank you, Kyle + [[ -f /env/worker-0.pok-99.ocptest.pok.stglabs.ibm.com ]] ++ date '+%m%d %H:%M:%S.%N' + echo 'I0103 17:18:47.674099253 - waiting for db_ip addresses' I0103 17:18:47.674099253 - waiting for db_ip addresses + cp -f /usr/libexec/cni/ovn-k8s-cni-overlay /cni-bin-dir/ + ovn_config_namespace=openshift-ovn-kubernetes ++ date '+%m%d %H:%M:%S.%N' I0103 17:18:48.022119631 - disable conntrack on geneve port + echo 'I0103 17:18:48.022119631 - disable conntrack on geneve port' + iptables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK + iptables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK + ip6tables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK + ip6tables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK + retries=0 + true ++ timeout 30 kubectl get ep -n openshift-ovn-kubernetes ovnkube-db -o 'jsonpath={.subsets[0].addresses[0].ip}' + db_ip=10.20.116.211 + [[ -n 10.20.116.211 ]] + break ++ date '+%m%d %H:%M:%S.%N' I0103 17:18:48.267568989 - starting ovnkube-node db_ip 10.20.116.211 + echo 'I0103 17:18:48.267568989 - starting ovnkube-node db_ip 10.20.116.211' + '[' shared == shared ']' + gateway_mode_flags='--gateway-mode shared --gateway-interface br-ex' + export_network_flows_flags= + [[ -n '' ]] + [[ -n '' ]] + [[ -n '' ]] + [[ -n '' ]] + [[ -n '' ]] + [[ -n '' ]] + gw_interface_flag= + '[' -d /sys/class/net/br-ex1 ']' + node_mgmt_port_netdev_flags= + [[ -n '' ]] + exec /usr/bin/ovnkube --init-node worker-0.pok-99.ocptest.pok.stglabs.ibm.com --nb-address ssl:10.20.116.211:9641,ssl:10.20.116.212:9641,ssl:10.20.116.213:9641 --sb-address ssl:10.20.116.211:9642,ssl:10.20.116.212:9642,ssl:10.20.116.213:9642 --nb-client-privkey /ovn-cert/tls.key --nb-client-cert /ovn-cert/tls.crt --nb-client-cacert /ovn-ca/ca-bundle.crt --nb-cert-common-name ovn --sb-client-privkey /ovn-cert/tls.key --sb-client-cert /ovn-cert/tls.crt --sb-client-cacert /ovn-ca/ca-bundle.crt --sb-cert-common-name ovn --config-file=/run/ovnkube-config/ovnkube.conf --loglevel 4 --inactivity-probe=180000 --gateway-mode shared --gateway-interface br-ex --metrics-bind-address 127.0.0.1:29103 --ovn-metrics-bind-address 127.0.0.1:29105 --metrics-enable-pprof I0103 17:18:48.393572 186388 ovs.go:93] Maximum command line arguments set to: 191102 I0103 17:18:48.396652 186388 config.go:1674] Parsed config file /run/ovnkube-config/ovnkube.conf I0103 17:18:48.396676 186388 config.go:1675] Parsed config: {Default:{MTU:1400 RoutableMTU:0 ConntrackZone:64000 EncapType:geneve EncapIP: EncapPort:6081 InactivityProbe:100000 OpenFlowProbe:180 MonitorAll:true LFlowCacheEnable:true LFlowCacheLimit:0 LFlowCacheLimitKb:1048576 RawClusterSubnets:10.128.0.0/14/23 ClusterSubnets:[]} Logging:{File: CNIFile: Level:4 LogFileMaxSize:100 LogFileMaxBackups:5 LogFileMaxAge:5 ACLLoggingRateLimit:20} Monitoring:{RawNetFlowTargets: RawSFlowTargets: RawIPFIXTargets: NetFlowTargets:[] SFlowTargets:[] IPFIXTargets:[]} IPFIX:{Sampling:400 CacheActiveTimeout:60 CacheMaxFlows:0} CNI:{ConfDir:/etc/cni/net.d Plugin:ovn-k8s-cni-overlay} OVNKubernetesFeature:{EnableEgressIP:true EnableEgressFirewall:true} Kubernetes:{Kubeconfig: CACert: CAData:[] APIServer:https://api-int.pok-99.ocptest.pok.stglabs.ibm.com:6443 Token: CompatServiceCIDR: RawServiceCIDRs:172.30.0.0/16 
ServiceCIDRs:[] OVNConfigNamespace:openshift-ovn-kubernetes MetricsBindAddress: OVNMetricsBindAddress: MetricsEnablePprof:false OVNEmptyLbEvents:false PodIP: RawNoHostSubnetNodes: NoHostSubnetNodes:nil HostNetworkNamespace:openshift-host-network PlatformType:None} OvnNorth:{Address: PrivKey: Cert: CACert: CertCommonName: Scheme: ElectionTimer:0 northbound:false exec:<nil>} OvnSouth:{Address: PrivKey: Cert: CACert: CertCommonName: Scheme: ElectionTimer:0 northbound:false exec:<nil>} Gateway:{Mode:shared Interface: EgressGWInterface: NextHop: VLANID:0 NodeportEnable:true DisableSNATMultipleGWs:false V4JoinSubnet:100.64.0.0/16 V6JoinSubnet:fd98::/64 DisablePacketMTUCheck:false RouterSubnet:} MasterHA:{ElectionLeaseDuration:60 ElectionRenewDeadline:30 ElectionRetryPeriod:20} HybridOverlay:{Enabled:false RawClusterSubnets: ClusterSubnets:[] VXLANPort:4789} OvnKubeNode:{Mode:full MgmtPortNetdev: DisableOVNIfaceIdVer:false}} I0103 17:18:48.398965 186388 node.go:330] OVN Kube Node initialization, Mode: full I0103 17:18:48.399217 186388 reflector.go:219] Starting reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399234 186388 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399246 186388 reflector.go:219] Starting reflector *v1.Node (0s) from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399260 186388 reflector.go:255] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399270 186388 reflector.go:219] Starting reflector *v1.Service (0s) from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399280 186388 reflector.go:255] Listing and watching *v1.Service from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399975 186388 reflector.go:219] Starting reflector *v1.Endpoints (0s) from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.399992 186388 reflector.go:255] Listing and watching *v1.Endpoints from k8s.io/client-go/informers/factory.go:134 I0103 17:18:48.499337 186388 shared_informer.go:270] caches populated I0103 17:18:48.499371 186388 shared_informer.go:270] caches populated I0103 17:18:48.499378 186388 shared_informer.go:270] caches populated I0103 17:18:48.499384 186388 shared_informer.go:270] caches populated I0103 17:18:48.519823 186388 config.go:1216] exec: /usr/bin/ovs-vsctl --timeout=15 del-ssl I0103 17:18:48.545175 186388 config.go:1216] exec: /usr/bin/ovs-vsctl --timeout=15 set-ssl /ovn-cert/tls.key /ovn-cert/tls.crt /ovn-ca/ca-bundle.crt I0103 17:18:48.568785 186388 config.go:1216] exec: /usr/bin/ovs-vsctl --timeout=15 set Open_vSwitch . external_ids:ovn-remote="ssl:10.20.116.211:9642,ssl:10.20.116.212:9642,ssl:10.20.116.213:9642" I0103 17:18:48.573402 186388 ovs.go:204] exec(1): /usr/bin/ovs-vsctl --timeout=15 set Open_vSwitch . 
external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=10.20.116.214 external_ids:ovn-remote-probe-interval=180000 external_ids:ovn-openflow-probe-interval=180 external_ids:hostname="worker-0.pok-99.ocptest.pok.stglabs.ibm.com" external_ids:ovn-monitor-all=true external_ids:ovn-enable-lflow-cache=true external_ids:ovn-limit-lflow-cache-kb=1048576 I0103 17:18:48.576954 186388 ovs.go:207] exec(1): stdout: "" I0103 17:18:48.576985 186388 ovs.go:208] exec(1): stderr: "" I0103 17:18:48.576999 186388 ovs.go:204] exec(2): /usr/bin/ovs-vsctl --timeout=15 -- clear bridge br-int netflow -- clear bridge br-int sflow -- clear bridge br-int ipfix I0103 17:18:48.584825 186388 ovs.go:207] exec(2): stdout: "" I0103 17:18:48.584836 186388 ovs.go:208] exec(2): stderr: "" I0103 17:18:48.594692 186388 node.go:386] Node worker-0.pok-99.ocptest.pok.stglabs.ibm.com ready for ovn initialization with subnet 10.128.2.0/23 I0103 17:18:48.594714 186388 ovs.go:204] exec(3): /usr/bin/ovn-sbctl --private-key=/ovn-cert/tls.key --certificate=/ovn-cert/tls.crt --bootstrap-ca-cert=/ovn-ca/ca-bundle.crt --db=ssl:10.20.116.211:9642,ssl:10.20.116.212:9642,ssl:10.20.116.213:9642 --timeout=15 --columns=up list Port_Binding I0103 17:18:48.656502 186388 ovs.go:207] exec(3): stdout: "up : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n\nup : false\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : 
false\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : false\n" I0103 17:18:48.656635 186388 ovs.go:208] exec(3): stderr: "" I0103 17:18:48.656650 186388 node.go:315] Detected support for port binding with external IDs I0103 17:18:48.656751 186388 ovs.go:204] exec(4): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-worker-0.po -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=1400 external-ids:iface-id=k8s-worker-0.pok-99.ocptest.pok.stglabs.ibm.com I0103 17:18:48.662638 186388 ovs.go:207] exec(4): stdout: "" I0103 17:18:48.662650 186388 ovs.go:208] exec(4): stderr: "" I0103 17:18:48.662658 186388 ovs.go:204] exec(5): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use I0103 17:18:48.666108 186388 ovs.go:207] exec(5): stdout: "\"66:46:54:36:9e:40\"\n" I0103 17:18:48.666118 186388 ovs.go:208] exec(5): stderr: "" I0103 17:18:48.666129 186388 ovs.go:204] exec(6): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=66\:46\:54\:36\:9e\:40 I0103 17:18:48.670471 186388 ovs.go:207] exec(6): stdout: "" I0103 17:18:48.670484 186388 ovs.go:208] exec(6): stderr: "" I0103 17:18:48.719025 186388 gateway_init.go:261] Initializing Gateway Functionality I0103 17:18:48.719276 186388 gateway_localnet.go:131] Node local addresses initialized to: map[10.128.2.2:{10.128.2.0 fffffe00} 10.20.116.214:{10.20.116.0 ffffff00} 127.0.0.1:{127.0.0.0 ff000000} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::6446:54ff:fe36:9e40:{fe80:: ffffffffffffffff0000000000000000} fe80::d0a8:b5ff:fe0f:5ad3:{fe80:: ffffffffffffffff0000000000000000}] I0103 17:18:48.719426 186388 helper_linux.go:74] Found default gateway interface enc2e0 10.20.116.247 F0103 17:18:48.719469 186388 ovnkube.go:133] could not find IP addresses: failed to lookup link br-ex: Link not found #
It's failing to set up br-ex:

F0103 17:18:48.719469 186388 ovnkube.go:133] could not find IP addresses: failed to lookup link br-ex: Link not found

Kyle, can you log into the nodes which exhibit this problem and check this:

systemctl status ovs-configuration

and then restart this service:

systemctl restart ovs-configuration

Once you restart this, can you delete the ovnkube-node pod? It should then get recreated properly. I got these instructions from a slack thread exhibiting similar symptoms. Let me know if this works and we can escalate to the networking team. Also, for debugging, could you grab the whole journalctl output?

Thanks
Prashanth
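Putting the suggested recovery steps together as one sketch to run on an affected node (the br-ex check and the example pod name are illustrative additions; substitute the crashing ovnkube-node pod on that node):

systemctl restart ovs-configuration
systemctl status ovs-configuration
ip link show br-ex                                             # the bridge should exist once the service succeeds
journalctl -b -u ovs-configuration --no-pager                  # capture the full service log for debugging
oc delete pod -n openshift-ovn-kubernetes ovnkube-node-2fhwv   # example name; delete the crashing pod on this node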
Prashanth, Thanks. Please see the requested information below. Thank you, Kyle [core@worker-0 ~]$ sudo bash [systemd] Failed Units: 1 ovs-configuration.service [root@worker-0 core]# systemctl status ovs-configuration ● ovs-configuration.service - Configures OVS with proper host networking configuration Loaded: loaded (/etc/systemd/system/ovs-configuration.service; enabled; vendor preset: disabled) Active: failed (Result: exit-code) since Mon 2022-01-03 11:08:08 UTC; 1 day 6h ago Main PID: 1694 (code=exited, status=1/FAILURE) CPU: 806ms Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: default via 10.20.116.247 dev enc2e0 proto static metric 100 Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: 10.20.116.0/24 dev enc2e0 proto kernel scope link src 10.20.116.214 metric 100 Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + ip -6 route show Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: ::1 dev lo proto kernel metric 256 pref medium Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: fe80::/64 dev genev_sys_6081 proto kernel metric 256 pref medium Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + exit 1 Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: ovs-configuration.service: Main process exited, code=exited, status=1/FAILURE Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: ovs-configuration.service: Failed with result 'exit-code'. Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: Failed to start Configures OVS with proper host networking configuration. Jan 03 11:08:08 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: ovs-configuration.service: Consumed 806ms CPU time [root@worker-0 core]# systemctl restart ovs-configuration Job for ovs-configuration.service failed because the control process exited with error code. See "systemctl status ovs-configuration.service" and "journalctl -xe" for details. [root@worker-0 core]# [root@worker-0 core]# journalctl -xe Jan 04 18:08:01 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: crio-conmon-e709273713b3386247e178bacbeba6f923c932b9e6858052b18bf45bbb8dcec5.scope: Succeeded. -- Subject: Unit succeeded -- Defined-By: systemd -- Support: https://access.redhat.com/support -- -- The unit crio-conmon-e709273713b3386247e178bacbeba6f923c932b9e6858052b18bf45bbb8dcec5.scope has successfully entered the 'dead' state. Jan 04 18:08:01 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: crio-conmon-e709273713b3386247e178bacbeba6f923c932b9e6858052b18bf45bbb8dcec5.scope: Consumed 36ms CPU time -- Subject: Resources consumed by unit runtime -- Defined-By: systemd -- Support: https://access.redhat.com/support -- -- The unit crio-conmon-e709273713b3386247e178bacbeba6f923c932b9e6858052b18bf45bbb8dcec5.scope completed and consumed the indicated resources. 
Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.133053 2896 logs.go:319] "Finished parsing log file" path="/var/log/pods/openshift-ovn-ku> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.134238 2896 logs.go:319] "Finished parsing log file" path="/var/log/pods/openshift-ovn-ku> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.135942 2896 generic.go:296] "Generic (PLEG): container finished" podID=a73bf963-c4b4-4525> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.135993 2896 kubelet.go:2115] "SyncLoop (PLEG): event for pod" pod="openshift-ovn-kubernet> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.136033 2896 scope.go:110] "RemoveContainer" containerID="c2ef48dbfd18701667ee459252beaf10> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:02.137687 2896 scope.go:110] "RemoveContainer" containerID="e709273713b3386247e178bacbeba6f9> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: E0104 18:08:02.139536 2896 pod_workers.go:836] "Error syncing pod, skipping" err="failed to \"StartConta> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com crio[2848]: time="2022-01-04 18:08:02.140576557Z" level=info msg="Removing container: c2ef48dbfd18701667ee459252beaf101bf25c> Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: var-lib-containers-storage-overlay-a12fbb526ac9d51b062500745cdf962ad0a5867a832b8003023d822d8dff34d0-merged.mount> -- Subject: Unit succeeded -- Defined-By: systemd -- Support: https://access.redhat.com/support -- -- The unit var-lib-containers-storage-overlay-a12fbb526ac9d51b062500745cdf962ad0a5867a832b8003023d822d8dff34d0-merged.mount has successfully entered the 'dead' state. Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[910833]: var-lib-containers-storage-overlay-a12fbb526ac9d51b062500745cdf962ad0a5867a832b8003023d822d8dff34d0-merged.> -- Subject: Unit succeeded -- Defined-By: systemd -- Support: https://access.redhat.com/support -- -- The unit UNIT has successfully entered the 'dead' state. Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com systemd[1]: var-lib-containers-storage-overlay-a12fbb526ac9d51b062500745cdf962ad0a5867a832b8003023d822d8dff34d0-merged.mount> -- Subject: Resources consumed by unit runtime -- Defined-By: systemd -- Support: https://access.redhat.com/support -- -- The unit var-lib-containers-storage-overlay-a12fbb526ac9d51b062500745cdf962ad0a5867a832b8003023d822d8dff34d0-merged.mount completed and consumed the indicated resources. Jan 04 18:08:02 worker-0.pok-99.ocptest.pok.stglabs.ibm.com crio[2848]: time="2022-01-04 18:08:02.451982390Z" level=info msg="Removed container c2ef48dbfd18701667ee459252beaf101bf25cf5> Jan 04 18:08:03 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:03.142584 2896 logs.go:319] "Finished parsing log file" path="/var/log/pods/openshift-ovn-ku> Jan 04 18:08:03 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: I0104 18:08:03.149015 2896 scope.go:110] "RemoveContainer" containerID="e709273713b3386247e178bacbeba6f9> Jan 04 18:08:03 worker-0.pok-99.ocptest.pok.stglabs.ibm.com hyperkube[2896]: E0104 18:08:03.163965 2896 pod_workers.go:836] "Error syncing pod, skipping" err="failed to \"StartConta> [root@worker-0 core]#
Thanks Kyle. I think the network manager journal output might be more helpful: journalctl -b -u NetworkManager also could you follow the steps i mentioned above to see if that resolves the problem? Also is this problem intermittent or is it happening on every install/upgrade? Thanks Prashanth
Prashanth, Upon deleting 1 of the 2 ovnkube pods in CrashLoopBackOff loop mode, the pod recreated and proceeded looping CrashLoopBackOff again. Thank you, Kyle [root@ospbmgr7 ~]# oc get pods -A | grep ovnkube openshift-ovn-kubernetes ovnkube-master-dmvkt 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-hdrvp 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-xbcrk 6/6 Running 6 31h openshift-ovn-kubernetes ovnkube-node-2r4v6 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-hrlgv 4/5 CrashLoopBackOff 373 (4m39s ago) 31h openshift-ovn-kubernetes ovnkube-node-r2pvj 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-wmm6c 4/5 CrashLoopBackOff 373 (3m15s ago) 31h openshift-ovn-kubernetes ovnkube-node-xmzjn 5/5 Running 0 31h [root@ospbmgr7 ~]# oc delete pod ovnkube-node-hrlgv -n openshift-ovn-kubernetes pod "ovnkube-node-hrlgv" deleted [root@ospbmgr7 ~]# oc get pods -A | grep ovnkube openshift-ovn-kubernetes ovnkube-master-dmvkt 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-hdrvp 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-xbcrk 6/6 Running 6 31h openshift-ovn-kubernetes ovnkube-node-2r4v6 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-mwlm4 4/5 Error 1 (2s ago) 6s openshift-ovn-kubernetes ovnkube-node-r2pvj 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-wmm6c 4/5 CrashLoopBackOff 373 (4m27s ago) 31h openshift-ovn-kubernetes ovnkube-node-xmzjn 5/5 Running 0 31h [root@ospbmgr7 ~]# oc get pods -A | grep ovnkube openshift-ovn-kubernetes ovnkube-master-dmvkt 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-hdrvp 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-xbcrk 6/6 Running 6 31h openshift-ovn-kubernetes ovnkube-node-2r4v6 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-mwlm4 4/5 CrashLoopBackOff 1 (3s ago) 8s openshift-ovn-kubernetes ovnkube-node-r2pvj 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-wmm6c 4/5 CrashLoopBackOff 373 (4m29s ago) 31h openshift-ovn-kubernetes ovnkube-node-xmzjn 5/5 Running 0 31h [root@ospbmgr7 ~]# oc get pods -A | grep ovnkube openshift-ovn-kubernetes ovnkube-master-dmvkt 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-hdrvp 6/6 Running 0 31h openshift-ovn-kubernetes ovnkube-master-xbcrk 6/6 Running 6 31h openshift-ovn-kubernetes ovnkube-node-2r4v6 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-mwlm4 4/5 CrashLoopBackOff 4 (58s ago) 2m30s openshift-ovn-kubernetes ovnkube-node-r2pvj 5/5 Running 0 31h openshift-ovn-kubernetes ovnkube-node-wmm6c 4/5 CrashLoopBackOff 374 (107s ago) 31h openshift-ovn-kubernetes ovnkube-node-xmzjn 5/5 Running 0 31h [root@ospbmgr7 ~]#
Kyle, did you restart the ovs-configuration service on the node, make sure it succeeds and that the br-ex interface is up, and then try killing the pod? Prashanth
Prashanth, Please see comment #4, where the ovs-configuration service restart (which sets up the br-ex interface) does not succeed. Thank you, Kyle
Prashanth, 1. This consistently occurs for every OCP 4.9.11 and 4.9.12 upgrade to OCP 4.10 starting with the 4.10.0-0.nightly-s390x-2021-12-18-034912 nightly build. 2. Installs of these OCP 4.10 nightly builds succeed (barring any other issues). 3. Working to provide the requested "journalctl -b -u NetworkManager" information. Thank you, Kyle
thanks Kyle. Could you also get the ovs-configuration logs? journalctl -b -u ovs-configuration
Created attachment 1848922 [details] worker-0 journalctl -b -u NetworkManager output Prashanth, This attachment contains the requested "journalctl -b -u NetworkManager" for worker-0. Thank you, Kyle
Created attachment 1848927 [details] worker-0 journalctl -b -u ovs-configuration output
I see this in the ovs-configuration logs:

Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: ipv4.method: manual
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + echo 'Static IP addressing detected on default gateway connection: 5a5fed82-e1bd-4caa-ba14-3dbc812edc26'
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: Static IP addressing detected on default gateway connection: 5a5fed82-e1bd-4caa-ba14-3dbc812edc26
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + egrep -l '^uuid=5a5fed82-e1bd-4caa-ba14-3dbc812edc26' '/etc/NetworkManager/systemConnectionsMerged/*'
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: grep: /etc/NetworkManager/systemConnectionsMerged/*: No such file or directory
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + echo 'WARN: unable to find NM configuration file for conn: 5a5fed82-e1bd-4caa-ba14-3dbc812edc26. Attempting to clone conn'
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: WARN: unable to find NM configuration file for conn: 5a5fed82-e1bd-4caa-ba14-3dbc812edc26. Attempting to clone conn
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + nmcli conn clone 5a5fed82-e1bd-4caa-ba14-3dbc812edc26 5a5fed82-e1bd-4caa-ba14-3dbc812edc26-clone
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: Wired Connection (5a5fed82-e1bd-4caa-ba14-3dbc812edc26) cloned as 5a5fed82-e1bd-4caa-ba14-3dbc812edc26-clone (666a52d4-0c19-4659-8099-bd2b753d0a5a).
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + shopt -s nullglob
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + old_conn_files=(${NM_CONN_PATH}/"${old_conn}"-clone*)
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + shopt -u nullglob
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + '[' 0 -ne 1 ']'
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + echo 'ERROR: unable to locate cloned conn file for 5a5fed82-e1bd-4caa-ba14-3dbc812edc26-clone'
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: ERROR: unable to locate cloned conn file for 5a5fed82-e1bd-4caa-ba14-3dbc812edc26-clone
Jan 03 11:08:07 worker-0.pok-99.ocptest.pok.stglabs.ibm.com configure-ovs.sh[1694]: + exit 1

Looks like it couldn't find the cloned file to restore the config after boot. In 4.10 the systemConnectionsMerged directory has been removed through https://github.com/openshift/machine-config-operator/pull/2742.

Kyle, could you also get the output of journalctl -b -u ovs-configuration on a node where the ovnkube-node pod successfully deployed? Thanks
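To make the failure mode above easier to follow, here is a simplified sketch of the script logic reconstructed from this trace (not the literal configure-ovs.sh source; the UUID is the one from the log):

# Sketch reconstructed from the trace above, not the literal script
NM_CONN_PATH="/etc/NetworkManager/systemConnectionsMerged"   # overlay dir, no longer present in 4.10 (MCO PR 2742)
old_conn="5a5fed82-e1bd-4caa-ba14-3dbc812edc26"              # default gateway connection UUID from the log
if ! egrep -l "^uuid=${old_conn}" "${NM_CONN_PATH}"/*; then
  echo "WARN: unable to find NM configuration file for conn: ${old_conn}. Attempting to clone conn"
  nmcli conn clone "${old_conn}" "${old_conn}-clone"         # clone is written elsewhere (typically system-connections)
  shopt -s nullglob
  old_conn_files=( "${NM_CONN_PATH}/${old_conn}"-clone* )    # globs against the missing overlay dir, so it stays empty
  shopt -u nullglob
  if [ "${#old_conn_files[@]}" -ne 1 ]; then
    echo "ERROR: unable to locate cloned conn file for ${old_conn}-clone"
    exit 1                                                   # the exit 1 seen in the journal and in systemctl status
  fi
fi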
Created attachment 1848952 [details] master-0 journalctl -b -u ovs-configuration output Prashanth, Here's the journalctl -b -u ovs-configuration output for master-0, which is successful for its ovnkube pod operation. Thank you, Kyle
Created attachment 1848953 [details] worker-1 journalctl -b -u ovs-configuration output Prashanth, Here's the journalctl -b -u ovs-configuration output for worker-1, which is successful for its ovnkube pod operation. Thank you, Kyle
Hmm... that looks like the old script... looks like worker-1 hasn't started updating yet, while worker-0 was the first to upgrade and encountered the error. Which was the last 4.10 build to succeed in upgrading with OVNKubernetes? Could you let me know the exact version so I can look at the changes?
Hi Kyle,

Could you also try making a slight modification to the OVS configuration script (/usr/local/bin/configure-ovs.sh) on the node to see if it works? This is the modification:

Replace this section at the top of the script:

NM_CONN_OVERLAY="/etc/NetworkManager/systemConnectionsMerged"
NM_CONN_UNDERLAY="/etc/NetworkManager/system-connections"
if [ -d "$NM_CONN_OVERLAY" ]; then
  NM_CONN_PATH="$NM_CONN_OVERLAY"
else
  NM_CONN_PATH="$NM_CONN_UNDERLAY"
fi

with:

NM_CONN_UNDERLAY="/etc/NetworkManager/system-connections"
NM_CONN_PATH="$NM_CONN_UNDERLAY"

and then run the script to see if it succeeds?

Thanks
Prashanth
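If trying this, a minimal way to apply and verify the change on the node might be (a sketch; restarting the service is an alternative to running the script directly, and the pod name is a placeholder):

vi /usr/local/bin/configure-ovs.sh                           # apply the NM_CONN_PATH change above
systemctl restart ovs-configuration                          # or run the script directly, as suggested
ip link show br-ex                                           # br-ex should exist if the configuration succeeded
oc -n openshift-ovn-kubernetes delete pod <crashing ovnkube-node pod on this node>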
Prashanth, 1. The last OCP 4.10 nightly build for which the OCP 4.9.11 and 4.9.12 upgrades worked for network type OVNKubernetes is 4.10.0-0.nightly-s390x-2021-12-16-185334. 2. The OCP 4.9.11 and 4.9.12 upgrades to OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2021-12-17-144433 are broken with the (different) issue documented in bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=2036571. 3. Given this (different) issue with the OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2021-12-17-144433, the first OCP 4.10 nightly build that we see this OVNKubernetes network type upgrade issue is 4.10.0-0.nightly-s390x-2021-12-18-034912. Thank you, Kyle
Hi Prashanth, can we assign this bug to you since you have already started the conversation with Kyle?
(In reply to Dan Li from comment #19) > Hi Prashanth, can we assign this bug to you since you have already started > the conversation with Kyle? sounds good Dan
Thanks Prashanth! Changing the assignee. Would you provide or set a "Priority" level for this bug as a part of the triage process? Also adding the reviewed-in-sprint flag, as we are still investigating and it seems unlikely (if there are any PRs) that this will be resolved before the end of this sprint.
(In reply to Prashanth Sundararaman from comment #17)
> hi Kyle,
>
> could you also try making a slight modification to the ovs configuration
> script(/usr/local/bin/configure-ovs.sh) on the node to see if it works? this
> is the modification:
>
> replace this section at the top of the script:
>
> NM_CONN_OVERLAY="/etc/NetworkManager/systemConnectionsMerged"
> NM_CONN_UNDERLAY="/etc/NetworkManager/system-connections"
> if [ -d "$NM_CONN_OVERLAY" ]; then
> NM_CONN_PATH="$NM_CONN_OVERLAY"
> else
> NM_CONN_PATH="$NM_CONN_UNDERLAY"
> fi
>
> with:
>
> NM_CONN_UNDERLAY="/etc/NetworkManager/system-connections"
> NM_CONN_PATH="$NM_CONN_UNDERLAY"
>
> and then run the script to see if it succeeds?
>
> Thanks
> Prashanth

This is good information, Kyle. The commit diff between the build on the 16th and the one on the 18th includes this PR, which is probably causing this issue: https://github.com/openshift/machine-config-operator/pull/2864. Could you try the workaround mentioned in comment#17? Thanks!
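For tracing this kind of regression, the machine-config-operator commit shipped in each nightly can be read from the release payload; a sketch (the pullspecs are placeholders for the two nightly release images):

oc adm release info --commits <pullspec for 4.10.0-0.nightly-s390x-2021-12-16-185334> | grep machine-config-operator
oc adm release info --commits <pullspec for 4.10.0-0.nightly-s390x-2021-12-18-034912> | grep machine-config-operator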
Prashanth, Thanks for the update. This is good news. I tried the workaround mentioned in comment#17 last night and it did not seem to work; I will try again today and provide an update. Thank you, Kyle
Kyle, when you have some time today can you reach out to me on slack so we can try debugging this together? The zVM setups we have here do not support OVN Kubernetes as they are not vxlan aware. Thanks Prashanth
Pranshath, Thanks for all the information we exchanged on slack. Here's some additional information and logs to help with debug of this issue. 1. Attempted to upgrade from OCP 4.9.13 to 4.10.0-0.nightly-s390x-2022-01-07-024817. 2. Same network operator ovnkube CrashLoopBackOff issues seen when at "machine-config" stage of upgrade from the "oc get co" command output. Here is the "oc get nodes" output for the master-2 node's network co: DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-927lm is in CrashLoopBackOff State... 3. Here are the 2 CrashLoopBackOff ovnkube-node pods (I had already deleted the worker-1 ovnkube-node pod for recreate purposes in the output below): [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 55m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 53m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 57m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 7 (4m32s ago) 16m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 59m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 58m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-wq6hx 4/5 CrashLoopBackOff 8 (34s ago) 17m 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 59m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> 4. Here is the requested modified start of the worker-1 /usr/local/bin/configure-ovs.sh script: [root@worker-1 ~]# cd /usr/local/bin [root@worker-1 bin]# [root@worker-1 bin]# ls -al total 44 drwxr-xr-x. 2 root root 79 Jan 7 11:30 . drwxr-xr-x. 11 root root 114 Jan 7 03:32 .. -rwxr-xr-x. 1 root root 20023 Jan 7 11:21 configure-ovs.sh -rwxr-xr-x. 1 root root 20161 Jan 7 10:55 configure-ovs.sh.save -rwxr-xr-x. 1 root root 2275 Jan 7 10:55 mco-hostname [root@worker-1 bin]# [root@worker-1 bin]# more configure-ovs.sh #!/bin/bash set -eux # This file is not needed anymore in 4.7+, but when rolling back to 4.6 # the ovs pod needs it to know ovs is running on the host. touch /var/run/ovs-config-executed NM_CONN_UNDERLAY="/etc/NetworkManager/system-connections" NM_CONN_PATH="$NM_CONN_UNDERLAY" MANAGED_NM_CONN_SUFFIX="-slave-ovs-clone" # Workaround to ensure OVS is installed due to bug in systemd Requires: # https://bugzilla.redhat.com/show_bug.cgi?id=1888017 copy_nm_conn_files() { local src_path="$NM_CONN_PATH" local dst_path="$1" if [ "$src_path" = "$dst_path" ]; then echo "No need to copy configuration files, source and destination are the same" return fi if [ -d "$src_path" ]; then echo "$src_path exists" local files=("${MANAGED_NM_CONN_FILES[@]}") shopt -s nullglob files+=($src_path/*${MANAGED_NM_CONN_SUFFIX}.nmconnection $src_path/*${MANAGED_NM_CONN_SUFFIX}) shopt -u nullglob for file in "${files[@]}"; do file="$(basename $file)" if [ -f "$src_path/$file" ]; then if [ ! -f "$dst_path/$file" ]; then echo "Copying configuration $file" cp "$src_path/$file" "$dst_path/$file" elif ! 
cmp --silent "$src_path/$file" "$dst_path/$file"; then echo "Copying updated configuration $file" cp -f "$src_path/$file" "$dst_path/$file" else echo "Skipping $file since it's equal at destination" fi else [root@worker-1 bin]# 5. After updating the worker-1 /usr/local/bin/configure-ovs.sh script with the requested change, it returns an "exit 1". I'll be attaching a log for this. 6. After deleting the worker-1 ovnkube-node pod, ovnkube-node-wq6hx, and then waiting for it's recreate, we see the following: [root@ospbmgr7 bin]# oc delete pod ovnkube-node-wq6hx -n openshift-ovn-kubernetes pod "ovnkube-node-wq6hx" deleted [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 58m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (34s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 0/5 ContainerCreating 0 3s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 58m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (42s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Running 1 (3s ago) 11s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (45s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 
4/5 Running 1 (6s ago) 14s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (47s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Error 1 (8s ago) 16s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (49s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Error 1 (10s ago) 18s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (51s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Error 1 (12s ago) 20s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 
master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (55s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Error 1 (16s ago) 24s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 56m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (58s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 Error 1 (19s ago) 27s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> [root@ospbmgr7 bin]# oc get pods -o wide -n openshift-ovn-kubernetes NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ovnkube-master-bmz85 6/6 Running 0 57m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-r58v2 6/6 Running 6 54m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-master-vbsdb 6/6 Running 0 59m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-927lm 4/5 CrashLoopBackOff 8 (60s ago) 17m 10.20.116.213 master-2.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-cmqmd 5/5 Running 0 60m 10.20.116.214 worker-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-jbn9q 5/5 Running 0 60m 10.20.116.212 master-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-tkzvg 4/5 CrashLoopBackOff 1 (16s ago) 29s 10.20.116.215 worker-1.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> ovnkube-node-xvvjf 5/5 Running 0 61m 10.20.116.211 master-0.pok-99.ocptest.pok.stglabs.ibm.com <none> <none> 7. I'll be attaching a log of the "journalctl -b -u ovs-configuration" command from worker-1. Thank you, Kyle
Created attachment 1849426 [details] worker-1 journalctl -b -u ovs-configuration output Per comment 25, here is the "journalctl -b -u ovs-configuration" command output from worker-1.
Created attachment 1849427 [details] worker-1 /usr/local/bin/configure-ovs.sh output Per comment 25, here is the /usr/local/bin/configure-ovs.sh updated script command output from worker-1.
Prashanth, Please see the pending attachments for the requested OCP 4.9.13 master-0 node contents of these 2 directories: 1. /etc/NetworkManager/system-connections 2. /etc/NetworkManager/systemConnectionsMerged Thank you, Kyle
Created attachment 1849428 [details] OCP 4.9.13 master-0 node /etc/NetworkManager/system-connections output Per comment 28, here is the /etc/NetworkManager/system-connections requested output.
Created attachment 1849429 [details] OCP 4.9.13 master-0 node /etc/NetworkManager/systemConnectionsMerged output Per comment 28, here is the /etc/NetworkManager/systemConnectionsMerged requested output.
Issue is reproducible on Power as well. Upgrade from 4.9.12 --> 4.10.0-0.nightly-ppc64le-2022-01-07-115230 # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.9.12 True True 106m Unable to apply 4.10.0-0.nightly-ppc64le-2022-01-07-115230: an unknown error has occurred: MultipleErrors # oc get pods -A -owide | grep ovn openshift-ovn-kubernetes ovnkube-master-crxjh 6/6 Running 0 75m 9.114.97.83 master-1 <none> <none> openshift-ovn-kubernetes ovnkube-master-hkrzt 6/6 Running 2 (80m ago) 80m 9.114.97.99 master-0 <none> <none> openshift-ovn-kubernetes ovnkube-master-qptwx 6/6 Running 6 77m 9.114.97.88 master-2 <none> <none> openshift-ovn-kubernetes ovnkube-node-5sgrc 5/5 Running 0 81m 9.114.97.99 master-0 <none> <none> openshift-ovn-kubernetes ovnkube-node-blwbn 5/5 Running 0 80m 9.114.97.96 worker-1 <none> <none> openshift-ovn-kubernetes ovnkube-node-h4fmt 4/5 CrashLoopBackOff 21 (2m44s ago) 81m 9.114.97.100 worker-0 <none> <none> openshift-ovn-kubernetes ovnkube-node-k8xmx 4/5 CrashLoopBackOff 20 (88s ago) 81m 9.114.97.88 master-2 <none> <none> openshift-ovn-kubernetes ovnkube-node-zxqwm 5/5 Running 0 81m 9.114.97.83 master-1 <none> <none> State: Waiting Reason: CrashLoopBackOff Last State: Terminated Reason: Error Message: : true\n\nup : true\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n\nup : false\n\nup : false\n\nup : true\n" I0107 14:05:48.106171 62247 ovs.go:208] exec(3): stderr: "" I0107 14:05:48.106185 62247 node.go:315] Detected support for port binding with external IDs I0107 14:05:48.106300 62247 ovs.go:204] exec(4): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-master-2 -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=1400 external-ids:iface-id=k8s-master-2 I0107 14:05:48.113384 62247 ovs.go:207] exec(4): stdout: "" I0107 14:05:48.113401 62247 ovs.go:208] exec(4): stderr: "" I0107 14:05:48.113424 62247 ovs.go:204] exec(5): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use I0107 14:05:48.118643 62247 ovs.go:207] exec(5): stdout: "\"62:d7:b8:1f:c3:42\"\n" I0107 14:05:48.118663 62247 ovs.go:208] exec(5): stderr: "" I0107 14:05:48.118702 62247 ovs.go:204] exec(6): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=62\:d7\:b8\:1f\:c3\:42 I0107 14:05:48.124704 62247 ovs.go:207] exec(6): stdout: "" I0107 14:05:48.124728 62247 ovs.go:208] exec(6): stderr: "" I0107 14:05:48.172487 62247 gateway_init.go:261] Initializing Gateway Functionality I0107 14:05:48.172720 62247 gateway_localnet.go:131] Node local addresses initialized to: map[10.129.0.2:{10.129.0.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} 9.114.97.88:{9.114.96.0 fffffc00} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::60d7:b8ff:fe1f:c342:{fe80:: ffffffffffffffff0000000000000000} fe80::bc4a:e1ff:fec1:62fa:{fe80:: ffffffffffffffff0000000000000000}] I0107 14:05:48.172868 62247 helper_linux.go:74] Found default gateway interface env32 9.114.96.1 F0107 14:05:48.172917 62247 ovnkube.go:133] could not find IP addresses: failed to lookup link br-ex: Link not found Exit Code: 1
Prashanth,

Thanks for the OCP 4.10 on Z build with the fix and your assistance yesterday. Just an update since we spoke on slack yesterday: in addition to OCP 4.9.12 on Z, OCP 4.9.11 and 4.9.13 on Z also successfully upgrade to your fix build based on OCP 4.10.0-0.nightly-s390x-2022-01-07-024817. Specifically:

1. OCP 4.9.11 on Z successfully upgrades to the fix build based on OCP 4.10.0-0.nightly-s390x-2022-01-07-024817 on Z.
2. OCP 4.9.12 on Z successfully upgrades to the fix build based on OCP 4.10.0-0.nightly-s390x-2022-01-07-024817 on Z.
3. OCP 4.9.13 on Z successfully upgrades to the fix build based on OCP 4.10.0-0.nightly-s390x-2022-01-07-024817 on Z.

Thank you,
Kyle
Kyle, The latest nightly has the fix: https://mirror.openshift.com/pub/openshift-v4/s390x/clients/ocp-dev-preview/4.10.0-0.nightly-s390x-2022-01-12-163931/. If you could test that and confirm it works, we can close this. Thanks Prashanth
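For reference, moving a cluster to a specific nightly generally looks something like the following (a sketch; the release image pullspec is a placeholder, and --allow-explicit-upgrade/--force are needed because a nightly is outside the update graph and its payload is unsigned):

oc adm upgrade --allow-explicit-upgrade --force \
  --to-image=<release image pullspec for 4.10.0-0.nightly-s390x-2022-01-12-163931>
oc get clusterversion          # watch until the new version reports as the cluster version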
Prashanth, Thanks for the updated build. I'll test today and provide an update. Thank you, Kyle
Sorry Kyle, but could you actually test with this build too: https://mirror.openshift.com/pub/openshift-v4/s390x/clients/ocp-dev-preview/4.10.0-0.nightly-s390x-2022-01-12-163931. It has Kubernetes bumped to 1.23 and that would be good to test as well.
Prashanth, Thanks for the update. Yes, will do. Thank you, Kyle
Verified upgrade from 4.9.12 to 4.10.0-0.nightly-ppc64le-2022-01-13-022003 on Power. # oc version Client Version: 4.9.12 Server Version: 4.10.0-0.nightly-ppc64le-2022-01-13-022003 Kubernetes Version: v1.23.0+50f645e # oc get network.config/cluster -o jsonpath='{.status.networkType}{"\n"}' OVNKubernetes # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False 18m Cluster version is 4.10.0-0.nightly-ppc64le-2022-01-13-022003 # oc get co NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE authentication 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 29m baremetal 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h cloud-controller-manager 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h cloud-credential 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h cluster-autoscaler 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h config-operator 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h console 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 38m csi-snapshot-controller 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h dns 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h etcd 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h image-registry 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 39m ingress 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 39m insights 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h kube-apiserver 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h kube-controller-manager 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h kube-scheduler 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h kube-storage-version-migrator 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 39m machine-api 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h machine-approver 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h machine-config 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 109m marketplace 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h monitoring 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h network 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h node-tuning 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 72m openshift-apiserver 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h openshift-controller-manager 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 73m openshift-samples 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 73m operator-lifecycle-manager 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h operator-lifecycle-manager-catalog 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h operator-lifecycle-manager-packageserver 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h service-ca 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h storage 4.10.0-0.nightly-ppc64le-2022-01-13-022003 True False False 11h
Prashanth, Thanks for the updates and builds.

1. For the OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2022-01-12-163931, with Kubernetes 1.23.0, the upgrade from OCP 4.9.14 was successful. OCP 4.9.14 had previously been upgraded from OCP 4.9.13.

# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.10.0-0.nightly-s390x-2022-01-12-163931 True False 38m Cluster version is 4.10.0-0.nightly-s390x-2022-01-12-163931

# oc get nodes
NAME STATUS ROLES AGE VERSION
master-0.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h40m v1.22.1+6859754
master-1.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h44m v1.22.1+6859754
master-2.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h43m v1.22.1+6859754
worker-0.pok-25.ocptest.pok.stglabs.ibm.com Ready worker 4h29m v1.22.1+6859754
worker-1.pok-25.ocptest.pok.stglabs.ibm.com Ready worker 4h29m v1.22.1+6859754

# oc get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-0.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h41m v1.22.1+6859754 10.20.116.11 <none> Red Hat Enterprise Linux CoreOS 410.84.202201120003-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
master-1.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h45m v1.22.1+6859754 10.20.116.12 <none> Red Hat Enterprise Linux CoreOS 410.84.202201120003-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
master-2.pok-25.ocptest.pok.stglabs.ibm.com Ready master 4h44m v1.22.1+6859754 10.20.116.13 <none> Red Hat Enterprise Linux CoreOS 410.84.202201120003-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
worker-0.pok-25.ocptest.pok.stglabs.ibm.com Ready worker 4h30m v1.22.1+6859754 10.20.116.14 <none> Red Hat Enterprise Linux CoreOS 410.84.202201120003-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
worker-1.pok-25.ocptest.pok.stglabs.ibm.com Ready worker 4h30m v1.22.1+6859754 10.20.116.15 <none> Red Hat Enterprise Linux CoreOS 410.84.202201120003-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8

2. Regarding the Kubernetes versions reported by the OCP CLI "oc get nodes" command, please note the following (see the cross-check sketch after this comment):
   1. When upgrading from OCP 4.9.13 to OCP 4.9.14, the "oc get nodes" command correctly reports the Kubernetes version as 1.22.3, which matches the Kubernetes version 1.22.3 listed in the OCP 4.9.14 build's release.txt file.
   2. When upgrading from OCP 4.9.14 to OCP 4.10.0-0.nightly-s390x-2022-01-12-163931, the "oc get nodes" command incorrectly reports the Kubernetes version as 1.22.1 instead of the 1.23.0 version listed in that build's release.txt file. The "oc get nodes -o wide" command does report the container runtime as "cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8".

3. We are conducting additional OCP 4.9.x to OCP 4.10 upgrade tests for the new builds with Kubernetes 1.23.0, including the currently latest available OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2022-01-13-022003, and will post the results here, including the "oc version", "oc get clusterversion", "oc get nodes", "oc get nodes -o wide", and "oc get co" output.

4. OCP 4.9.14 was released yesterday, with OCP 4.9.15 released several hours later the same day.

Thank you, Kyle
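As a cross-check for the version-reporting discrepancy noted in point 2 above, a minimal sketch (RELEASE_IMAGE is a placeholder for the 4.10 nightly release image pullspec, not a value from this report):

List each node's kubelet version as stored in the API:
# oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'
Show the Kubernetes version bundled in the target release payload:
# oc adm release info "$RELEASE_IMAGE" | grep -i kubernetes
Check whether the machine config pools have finished rolling out the new configuration; an out-of-date kubelet version on a node can simply mean its pool has not updated yet:
# oc get mcp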
Prashanth,

1. The upgrade from OCP 4.9.14 to OCP 4.10 nightly build 4.10.0-0.nightly-s390x-2022-01-13-022003 was successful.

# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.10.0-0.nightly-s390x-2022-01-13-022003 True False 3m49s Cluster version is 4.10.0-0.nightly-s390x-2022-01-13-022003

# oc get nodes
NAME STATUS ROLES AGE VERSION
master-0.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e
master-1.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e
master-2.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e
worker-0.pok-99.ocptest.pok.stglabs.ibm.com Ready worker 128m v1.23.0+50f645e
worker-1.pok-99.ocptest.pok.stglabs.ibm.com Ready worker 128m v1.23.0+50f645e

# oc get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-0.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e 10.20.116.211 <none> Red Hat Enterprise Linux CoreOS 410.84.202201121602-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
master-1.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e 10.20.116.212 <none> Red Hat Enterprise Linux CoreOS 410.84.202201121602-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
master-2.pok-99.ocptest.pok.stglabs.ibm.com Ready master 143m v1.23.0+50f645e 10.20.116.213 <none> Red Hat Enterprise Linux CoreOS 410.84.202201121602-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
worker-0.pok-99.ocptest.pok.stglabs.ibm.com Ready worker 128m v1.23.0+50f645e 10.20.116.214 <none> Red Hat Enterprise Linux CoreOS 410.84.202201121602-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8
worker-1.pok-99.ocptest.pok.stglabs.ibm.com Ready worker 128m v1.23.0+50f645e 10.20.116.215 <none> Red Hat Enterprise Linux CoreOS 410.84.202201121602-0 (Ootpa) 4.18.0-305.30.1.el8_4.s390x cri-o://1.23.0-100.rhaos4.10.git77d20b2.el8

# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 16m
baremetal 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
cloud-controller-manager 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 143m
cloud-credential 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 143m
cluster-autoscaler 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
config-operator 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 138m
console 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 19m
csi-snapshot-controller 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 137m
dns 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
etcd 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
image-registry 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 130m
ingress 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 22m
insights 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 131m
kube-apiserver 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 133m
kube-controller-manager 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 135m
kube-scheduler 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 135m
kube-storage-version-migrator 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 23m
machine-api 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
machine-approver 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 136m
machine-config 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 135m
marketplace 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 137m
monitoring 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 124m
network 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 138m
node-tuning 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 126m
openshift-apiserver 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 131m
openshift-controller-manager 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 134m
openshift-samples 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 82m
operator-lifecycle-manager 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 137m
operator-lifecycle-manager-catalog 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 137m
operator-lifecycle-manager-packageserver 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 131m
service-ca 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 138m
storage 4.10.0-0.nightly-s390x-2022-01-13-022003 True False False 139m

# oc version
Client Version: 4.9.14
Server Version: 4.10.0-0.nightly-s390x-2022-01-13-022003
Kubernetes Version: v1.23.0+50f645e

2. The OCP CLI "oc get nodes" and "oc get nodes -o wide" commands correctly report the Kubernetes version 1.23.0, as shown in the output above (a scripted verification of the cluster operators is sketched after this comment).

3. We are conducting some additional OCP 4.9.x to OCP 4.10 upgrade tests for the new builds with Kubernetes 1.23.0, and will post the results here.

Thank you, Kyle
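The per-operator checks above can also be scripted; a minimal sketch, assuming a logged-in oc client (the jsonpath filter is standard kubectl/oc jsonpath syntax):

Wait until every cluster operator reports Available=True and Degraded=False:
# oc wait clusteroperators --all --for=condition=Available=True --timeout=30m
# oc wait clusteroperators --all --for=condition=Degraded=False --timeout=30m
Print each operator with the version it reports, to confirm everything is on the target release:
# oc get clusteroperators -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.versions[?(@.name=="operator")].version}{"\n"}{end}'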
This was shipped in the GA advisory, but was not automatically transitioned because it lacked the new subcomponent field. Closing manually.