Version:
./openshift-install 4.10.0-0.nightly-2022-02-17-234353
built from commit 5349373764f2957b75448b17005bcc1c1b9a9e8e
release image registry.ci.openshift.org/ocp/release@sha256:5a958e2cea284e33c391dd15383821dd4cfefa747a0fd811f1ea702f1d147870
release architecture amd64

Platform: alibabacloud

Please specify:
* IPI

What happened?
After scaling up with 2 RHEL compute nodes and setting the ingress replicas to 4, the 2 new pods failed to be scheduled and stayed Pending.

What did you expect to happen?
The 2 new router-default pods should become Running and be scheduled onto the 2 RHEL compute nodes, which should be added to the vserver groups of the ingress load balancer.

How to reproduce it (as minimally and precisely as possible)?
Always reproducible.

Anything else we need to know?
> FYI the flexy-install job and the rhel-scaleup job:
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/78364/
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp4-rhel-scaleup/13679/

> After IPI installation:
$ oc get nodes
NAME                                      STATUS   ROLES    AGE   VERSION
jiwei-103-9b2qw-master-0                  Ready    master   30m   v1.23.3+5642c2c
jiwei-103-9b2qw-master-1                  Ready    master   32m   v1.23.3+5642c2c
jiwei-103-9b2qw-master-2                  Ready    master   30m   v1.23.3+5642c2c
jiwei-103-9b2qw-worker-us-east-1a-kqjrz   Ready    worker   19m   v1.23.3+5642c2c
jiwei-103-9b2qw-worker-us-east-1b-ntwlp   Ready    worker   20m   v1.23.3+5642c2c
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-02-17-234353   True        False         9m4s    Cluster version is 4.10.0-0.nightly-2022-02-17-234353
$ oc -n openshift-ingress get service router-default
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
router-default   LoadBalancer   172.30.168.75   47.253.103.23   80:31988/TCP,443:31045/TCP   29m
$ oc -n openshift-ingress get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                      NOMINATED NODE   READINESS GATES
router-default-bbb87465-fsn7z   1/1     Running   0          29m   10.128.2.5    jiwei-103-9b2qw-worker-us-east-1a-kqjrz   <none>           <none>
router-default-bbb87465-vr6wf   1/1     Running   0          29m   10.131.0.11   jiwei-103-9b2qw-worker-us-east-1b-ntwlp   <none>           <none>
$

> After scaleup with 2 RHEL nodes:
$ oc get nodes
NAME                                      STATUS   ROLES    AGE     VERSION
jiwei-103-9b2qw-master-0                  Ready    master   42m     v1.23.3+5642c2c
jiwei-103-9b2qw-master-1                  Ready    master   43m     v1.23.3+5642c2c
jiwei-103-9b2qw-master-2                  Ready    master   42m     v1.23.3+5642c2c
jiwei-103-9b2qw-rhel-worker-0             Ready    worker   5m      v1.23.3+5642c2c
jiwei-103-9b2qw-rhel-worker-1             Ready    worker   4m59s   v1.23.3+5642c2c
jiwei-103-9b2qw-worker-us-east-1a-kqjrz   Ready    worker   31m     v1.23.3+5642c2c
jiwei-103-9b2qw-worker-us-east-1b-ntwlp   Ready    worker   32m     v1.23.3+5642c2c
$ oc -n openshift-ingress get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                      NOMINATED NODE   READINESS GATES
router-default-bbb87465-fsn7z   1/1     Running   0          40m   10.128.2.5    jiwei-103-9b2qw-worker-us-east-1a-kqjrz   <none>           <none>
router-default-bbb87465-vr6wf   1/1     Running   0          40m   10.131.0.11   jiwei-103-9b2qw-worker-us-east-1b-ntwlp   <none>           <none>
$ oc get -o yaml deployment/router-default -n openshift-ingress | grep replicas
  replicas: 2
  replicas: 2
$

> After setting ingress replicas to 4, the 2 new pods stay Pending:
$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 4}}' --type=merge
ingresscontroller.operator.openshift.io/default patched
$ oc get -o yaml deployment/router-default -n openshift-ingress | grep replicas
  replicas: 4
  replicas: 4
$ oc -n openshift-ingress get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE                                      NOMINATED NODE   READINESS GATES
router-default-bbb87465-fsn7z   1/1     Running   0          45m     10.128.2.5    jiwei-103-9b2qw-worker-us-east-1a-kqjrz   <none>           <none>
router-default-bbb87465-tf6rs   0/1     Pending   0          3m44s   <none>        <none>                                    <none>           <none>
router-default-bbb87465-tg5z8   0/1     Pending   0          3m44s   <none>        <none>                                    <none>           <none>
router-default-bbb87465-vr6wf   1/1     Running   0          45m     10.131.0.11   jiwei-103-9b2qw-worker-us-east-1b-ntwlp   <none>           <none>
$
$ oc -n openshift-ingress get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP            NODE                                      NOMINATED NODE   READINESS GATES
router-default-bbb87465-fsn7z   1/1     Running   0          111m   10.128.2.5    jiwei-103-9b2qw-worker-us-east-1a-kqjrz   <none>           <none>
router-default-bbb87465-tf6rs   0/1     Pending   0          69m    <none>        <none>                                    <none>           <none>
router-default-bbb87465-tg5z8   0/1     Pending   0          69m    <none>        <none>                                    <none>           <none>
router-default-bbb87465-vr6wf   1/1     Running   0          111m   10.131.0.11   jiwei-103-9b2qw-worker-us-east-1b-ntwlp   <none>           <none>
$ oc -n openshift-ingress describe pod router-default-bbb87465-tf6rs
......
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  69m                 default-scheduler  0/7 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  12m (x65 over 68m)  default-scheduler  0/7 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
$

> FYI: if scaling up using a MachineSet with an RHCOS compute node, the new pod does get scheduled onto the new node as expected:
$ oc get nodes
NAME                                      STATUS   ROLES    AGE   VERSION
jiwei-102-8zh8g-master-0                  Ready    master   34m   v1.23.3+5642c2c
jiwei-102-8zh8g-master-1                  Ready    master   34m   v1.23.3+5642c2c
jiwei-102-8zh8g-master-2                  Ready    master   33m   v1.23.3+5642c2c
jiwei-102-8zh8g-worker-us-east-1a-lswcd   Ready    worker   23m   v1.23.3+5642c2c
jiwei-102-8zh8g-worker-us-east-1b-qvtml   Ready    worker   20m   v1.23.3+5642c2c
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                      NOMINATED NODE   READINESS GATES
router-default-6bd4b85466-9nqkc   1/1     Running   0          31m   10.131.0.7   jiwei-102-8zh8g-worker-us-east-1a-lswcd   <none>           <none>
router-default-6bd4b85466-vtzfp   1/1     Running   0          31m   10.128.2.6   jiwei-102-8zh8g-worker-us-east-1b-qvtml   <none>           <none>
$ oc scale machineset jiwei-102-8zh8g-worker-us-east-1a --replicas=2 -n openshift-machine-api
machineset.machine.openshift.io/jiwei-102-8zh8g-worker-us-east-1a scaled
$ oc get machineset -n openshift-machine-api
NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
jiwei-102-8zh8g-worker-us-east-1a   2         2         2       2           45m
jiwei-102-8zh8g-worker-us-east-1b   1         1         1       1           45m
$ oc get nodes
NAME                                      STATUS   ROLES    AGE   VERSION
jiwei-102-8zh8g-master-0                  Ready    master   44m   v1.23.3+5642c2c
jiwei-102-8zh8g-master-1                  Ready    master   44m   v1.23.3+5642c2c
jiwei-102-8zh8g-master-2                  Ready    master   43m   v1.23.3+5642c2c
jiwei-102-8zh8g-worker-us-east-1a-2lmvp   Ready    worker   50s   v1.23.3+5642c2c
jiwei-102-8zh8g-worker-us-east-1a-lswcd   Ready    worker   33m   v1.23.3+5642c2c
jiwei-102-8zh8g-worker-us-east-1b-qvtml   Ready    worker   30m   v1.23.3+5642c2c
$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge
ingresscontroller.operator.openshift.io/default patched
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                      NOMINATED NODE   READINESS GATES
router-default-6bd4b85466-9nqkc   1/1     Running   0          43m   10.131.0.7   jiwei-102-8zh8g-worker-us-east-1a-lswcd   <none>           <none>
router-default-6bd4b85466-vtzfp   1/1     Running   0          43m   10.128.2.6   jiwei-102-8zh8g-worker-us-east-1b-qvtml   <none>           <none>
router-default-6bd4b85466-z9nsn   1/1     Running   0          26s   10.129.2.6   jiwei-102-8zh8g-worker-us-east-1a-2lmvp   <none>           <none>
$
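For anyone triaging this, a quick way to confirm which nodes still carry the node.cloudprovider.kubernetes.io/uninitialized taint mentioned in the scheduler events (a minimal sketch; any equivalent jsonpath or custom-columns query works):

$ oc get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
# Nodes the cloud provider has not initialized yet will list
# node.cloudprovider.kubernetes.io/uninitialized among their taint keys.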
# wrong providerID
$ kubectl get node jiwei-509-bvl6f-rhel-0 -oyaml|grep -i providerid
  providerID: alicloud://

# correct providerID
$ kubectl get node jiwei-509-bvl6f-worker-us-east-1a-q4d4s -oyaml|grep -i providerid
  providerID: alicloud://us-east-1.i-0xi4lwm4mnibodrbga84

The RHEL nodes have a wrong providerID; the expected format is like the one on node jiwei-509-bvl6f-worker-us-east-1a-q4d4s.
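For comparison across the whole cluster, the providerID of every node can be listed in one command (a minimal sketch using a custom-columns query):

$ oc get nodes -o custom-columns='NAME:.metadata.name,PROVIDERID:.spec.providerID'
# RHCOS workers show the expected alicloud://<region-id>.<instance-id> form,
# while the RHEL scaleup nodes show a bare "alicloud://".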
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-03-03-115552   True        False         47m     Error while reconciling 4.10.0-0.nightly-2022-03-03-115552: the cluster operator ingress has not yet successfully rolled out
$ oc get nodes
NAME                                      STATUS   ROLES    AGE   VERSION
jiwei-509-bvl6f-master-0                  Ready    master   66m   v1.23.3+e419edf
jiwei-509-bvl6f-master-1                  Ready    master   65m   v1.23.3+e419edf
jiwei-509-bvl6f-master-2                  Ready    master   65m   v1.23.3+e419edf
jiwei-509-bvl6f-rhel-0                    Ready    worker   31m   v1.23.3+e419edf
jiwei-509-bvl6f-rhel-1                    Ready    worker   31m   v1.23.3+e419edf
jiwei-509-bvl6f-worker-us-east-1a-q4d4s   Ready    worker   56m   v1.23.3+e419edf
jiwei-509-bvl6f-worker-us-east-1b-lr22m   Ready    worker   53m   v1.23.3+e419edf
$ oc get -o yaml deployment/router-default -n openshift-ingress | grep replicas
  replicas: 4
  replicas: 4
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE                                      NOMINATED NODE   READINESS GATES
router-default-5d5c4466b8-4sc9t   1/1     Running   0          63m   10.128.2.5    jiwei-509-bvl6f-worker-us-east-1b-lr22m   <none>           <none>
router-default-5d5c4466b8-bkskv   0/1     Pending   0          28m   <none>        <none>                                    <none>           <none>
router-default-5d5c4466b8-j8ktf   1/1     Running   0          63m   10.131.0.11   jiwei-509-bvl6f-worker-us-east-1a-q4d4s   <none>           <none>
router-default-5d5c4466b8-n2rmx   0/1     Pending   0          28m   <none>        <none>                                    <none>           <none>
$ oc get node jiwei-509-bvl6f-worker-us-east-1a-q4d4s -oyaml | grep -i providerid
  providerID: alicloud://us-east-1.i-0xi4lwm4mnibodrbga84
$ oc get node jiwei-509-bvl6f-rhel-0 -oyaml | grep -i providerid
  providerID: alicloud://
$

FYI the QE flexy jobs:
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/81943/
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp4-rhel-scaleup/13874/
I'm not quite sure what you expect to happen here. If the problem is that ingress did not bump the replicas after adding the nodes, then that's a question for the networking team as to why that happened. If you're asking about the scheduling issues, this warning provides the reason:

Warning FailedScheduling 12m (x65 over 68m) default-scheduler 0/7 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

It looks like the new nodes were not properly initialized.
(In reply to Maciej Szulik from comment #3)
> I'm not quite sure what you expect to happen here. If the problem is that
> ingress did not bump the replicas after adding the nodes, then that's a
> question for the networking team as to why that happened. If you're asking
> about the scheduling issues, this warning provides the reason:
>
> Warning FailedScheduling 12m (x65 over 68m) default-scheduler 0/7 nodes
> are available: 2 node(s) didn't match pod anti-affinity rules, 2 node(s) had
> taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod
> didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: },
> that the pod didn't tolerate.
>
> It looks like the new nodes were not properly initialized.

The expectation is that the 2 new pods should be scheduled onto the 2 new RHEL compute nodes and become Running within a reasonable time.
(In reply to Jianli Wei from comment #4)
> The expectation is that the 2 new pods should be scheduled onto the 2 new
> RHEL compute nodes and become Running within a reasonable time.

In that case you'd need to figure out why the 2 new RHEL nodes had the node.cloudprovider.kubernetes.io/uninitialized taint set to true, as seen in this message:

Warning FailedScheduling 12m (x65 over 68m) default-scheduler 0/7 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 2 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

My guess is that it has something to do with cloud provider initialization. I'm moving this over to the cloud team, but they will certainly ask you for logs from the cloud controller.
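For background on that taint: a node registered with an external cloud provider starts out with node.cloudprovider.kubernetes.io/uninitialized, and it is the cloud controller manager that removes it once it has initialized the node (set its providerID, addresses, etc.). A minimal sketch for checking whether the RHEL nodes are still carrying it:

$ oc get node jiwei-103-9b2qw-rhel-worker-0 -o yaml | grep -B1 -A2 uninitialized
# While the node is still uninitialized, .spec.taints contains roughly:
#   - effect: NoSchedule
#     key: node.cloudprovider.kubernetes.io/uninitialized
#     value: "true"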
I have pinged the Alibaba CCM team to take a look at this issue; it seems that the CCM is not working as expected. Do we have CCM logs that we can share with the Alibaba team?
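For whoever collects them, a minimal sketch of gathering the cloud controller manager logs (this assumes the external CCM pods run in the openshift-cloud-controller-manager namespace, as they do for Alibaba Cloud clusters; pod names vary per cluster):

$ oc -n openshift-cloud-controller-manager get pods
$ for p in $(oc -n openshift-cloud-controller-manager get pods -o name); do
      oc -n openshift-cloud-controller-manager logs "$p" --all-containers > "${p##*/}.log"
  done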
Setting this to blocker+ right now, as it appears we can't add workers to clusters after bootstrapping.
@jspeed Curious if there is an update here; also, should this be assigned to an Alibaba engineer?
@gausingh Could you please help us get the right eyes on this bug? It seems it will need some attention from both Alibaba and maybe MCO/Splat.
@brlu Can you please take a look at this bug? I am assigning it to you.
(In reply to jigu from comment #1)
> # wrong providerID
> $ kubectl get node jiwei-509-bvl6f-rhel-0 -oyaml|grep -i providerid
>   providerID: alicloud://
>
> # correct providerID
> $ kubectl get node jiwei-509-bvl6f-worker-us-east-1a-q4d4s -oyaml|grep -i providerid
>   providerID: alicloud://us-east-1.i-0xi4lwm4mnibodrbga84
>
> The RHEL nodes have a wrong providerID; the expected format is like the one
> on node jiwei-509-bvl6f-worker-us-east-1a-q4d4s.

The RHEL nodes have a wrong providerID. The node providerID is set by kubelet.service; see these PRs:
https://github.com/openshift/machine-config-operator/pull/2777
https://github.com/openshift/machine-config-operator/pull/2814

It seems that the providerID is not set correctly on the RHEL nodes. @
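For context on what those MCO PRs do: the kubelet is given the provider ID via its --provider-id flag, built from the Alibaba Cloud instance metadata service. A rough sketch of that flow is below; the env file path and variable name are illustrative assumptions, not the exact ones the MCO uses:

# Query the Alibaba Cloud ECS metadata service for the two pieces of the providerID.
REGION_ID="$(curl -s http://100.100.100.200/latest/meta-data/region-id)"
INSTANCE_ID="$(curl -s http://100.100.100.200/latest/meta-data/instance-id)"
# Hypothetical env file consumed by a kubelet.service drop-in via EnvironmentFile=
echo "KUBELET_PROVIDERID=alicloud://${REGION_ID}.${INSTANCE_ID}" > /etc/kubernetes/kubelet-providerid
# kubelet is then started with: --provider-id=${KUBELET_PROVIDERID}
# If this value ends up empty on the RHEL nodes, the node registers with a bare
# "alicloud://" providerID, which matches what we see above.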
@jiwei Could you please help us create a cluster with RHEL nodes so that we can debug? It would be good to reproduce this and either share a kubeconfig or gather a must-gather and SOS reports from the broken nodes, so that we can inspect the state of the instances and check the files on disk.
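For reference, a minimal sketch of collecting that data (standard commands; <broken-rhel-node> is a placeholder for the affected node's name, and the sos package may need to be installed on a RHEL worker first):

# Cluster-level must-gather
$ oc adm must-gather --dest-dir=./must-gather

# SOS report from a broken node via a debug pod
$ oc debug node/<broken-rhel-node>
sh-4.4# chroot /host
sh-4.4# sos report --batch    # older releases use `sosreport`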
(In reply to Joel Speed from comment #13)
> @jiwei Could you please help us create a cluster with RHEL nodes so that we
> can debug? It would be good to reproduce this and either share a kubeconfig
> or gather a must-gather and SOS reports from the broken nodes, so that we
> can inspect the state of the instances and check the files on disk.

Sorry for the late reply.
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/112560/artifact/workdir/install-dir/auth/kubeconfig

$ oc get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
jiwei-0616-12-56g6k-master-0                  Ready    master   89m   v1.24.0+cb71478
jiwei-0616-12-56g6k-master-1                  Ready    master   89m   v1.24.0+cb71478
jiwei-0616-12-56g6k-master-2                  Ready    master   89m   v1.24.0+cb71478
jiwei-0616-12-56g6k-rhel-0                    Ready    worker   25m   v1.24.0+25f9057
jiwei-0616-12-56g6k-worker-us-east-1a-9fq8h   Ready    worker   80m   v1.24.0+cb71478
jiwei-0616-12-56g6k-worker-us-east-1b-8srfw   Ready    worker   78m   v1.24.0+cb71478
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE                                          NOMINATED NODE   READINESS GATES
router-default-78c788bf44-2tzbt   1/1     Running   0          86m     10.131.0.5   jiwei-0616-12-56g6k-worker-us-east-1a-9fq8h   <none>           <none>
router-default-78c788bf44-vfhjm   0/1     Pending   0          5m10s   <none>       <none>                                        <none>           <none>
router-default-78c788bf44-xbd6l   1/1     Running   0          86m     10.128.2.7   jiwei-0616-12-56g6k-worker-us-east-1b-8srfw   <none>           <none>
$ oc -n openshift-ingress describe pod router-default-78c788bf44-vfhjm | grep Warning
  Warning  FailedScheduling  5m1s  default-scheduler  0/6 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 4 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  43s   default-scheduler  0/6 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}, 2 node(s) didn't match pod anti-affinity rules, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 2 node(s) didn't match pod anti-affinity rules, 4 Preemption is not helpful for scheduling.
$
FYI
$ oc get node jiwei-0616-12-56g6k-worker-us-east-1a-9fq8h -oyaml | grep -i providerid
  providerID: alicloud://us-east-1.i-0xiffvfapqsq6b2pknlz
$ oc get node jiwei-0616-12-56g6k-rhel-0 | grep -i providerid
$
Sorry, please ignore comment #15 (the second command there was missing -oyaml); see below instead, thanks.

$ oc get node jiwei-0616-12-56g6k-worker-us-east-1a-9fq8h -oyaml | grep -i providerid
  providerID: alicloud://us-east-1.i-0xiffvfapqsq6b2pknlz
$ oc get node jiwei-0616-12-56g6k-rhel-0 -oyaml | grep -i providerid
  providerID: alicloud://
$
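One way to check directly on the RHEL node whether kubelet was ever handed a provider ID (a minimal sketch; the greps simply look for any --provider-id flag or KUBELET_PROVIDERID setting in the unit, its drop-ins, and the running process):

$ oc debug node/jiwei-0616-12-56g6k-rhel-0
sh-4.4# chroot /host
sh-4.4# systemctl cat kubelet.service | grep -i provider
sh-4.4# ps -ef | grep -o -- '--provider-id=[^ ]*'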
We talked about this issue in our team standup today, and we are curious about the relationship to RHEL and whether we need someone from the node or RHEL team to join this conversation. We aren't quite sure why there would be a difference between RHEL and RHCOS.
@brlu @gausingh, the cloud team is discussing this bug and it seems like we have some new data, but we aren't sure whether this requires a change on the Red Hat side or the Alibaba side. Any guidance?
Fixed in https://github.com/openshift/machine-config-operator/pull/3338
Tested with a build containing the PR (see https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws-2-modern/1575677383736823808), and it works well.

$ oc get clusterversion
NAME      VERSION                                                    AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.0-0.ci.test-2022-09-30-025100-ci-ln-sqd1vnb-latest    True        False         25m     Cluster version is 4.12.0-0.ci.test-2022-09-30-025100-ci-ln-sqd1vnb-latest
$
$ oc get nodes
NAME                                          STATUS   ROLES                  AGE   VERSION
jiwei-0930-01-dck5b-master-0                  Ready    control-plane,master   66m   v1.24.0+8c7c967
jiwei-0930-01-dck5b-master-1                  Ready    control-plane,master   62m   v1.24.0+8c7c967
jiwei-0930-01-dck5b-master-2                  Ready    control-plane,master   65m   v1.24.0+8c7c967
jiwei-0930-01-dck5b-rhel-0                    Ready    worker                 13m   v1.24.0+8c7c967
jiwei-0930-01-dck5b-worker-us-east-1a-82zdp   Ready    worker                 32m   v1.24.0+8c7c967
jiwei-0930-01-dck5b-worker-us-east-1b-fz98j   Ready    worker                 39m   v1.24.0+8c7c967
$
$ oc get nodes jiwei-0930-01-dck5b-rhel-0 -oyaml | grep -i providerid
  providerID: alicloud://us-east-1.i-0xi9e42kn4hjz1j3353t
$
$ oc get nodes jiwei-0930-01-dck5b-worker-us-east-1a-82zdp -oyaml | grep -i providerid
  providerID: alicloud://us-east-1.i-0xi9e42kn4hjypowet86
$
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                          NOMINATED NODE   READINESS GATES
router-default-5fb74f8b7b-5m87k   1/1     Running   0          53m   10.131.0.8   jiwei-0930-01-dck5b-worker-us-east-1b-fz98j   <none>           <none>
router-default-5fb74f8b7b-kwtrk   1/1     Running   0          53m   10.128.2.8   jiwei-0930-01-dck5b-worker-us-east-1a-82zdp   <none>           <none>
$
$ oc get -o yaml deployment/router-default -n openshift-ingress | grep replicas
  replicas: 2
  replicas: 2
$
$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge
ingresscontroller.operator.openshift.io/default patched
$ oc get -o yaml deployment/router-default -n openshift-ingress | grep replicas
  replicas: 3
  replicas: 3
$
$ oc -n openshift-ingress get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE                                          NOMINATED NODE   READINESS GATES
router-default-5fb74f8b7b-4h6cg   1/1     Running   0          20s   10.129.2.6   jiwei-0930-01-dck5b-rhel-0                    <none>           <none>
router-default-5fb74f8b7b-5m87k   1/1     Running   0          54m   10.131.0.8   jiwei-0930-01-dck5b-worker-us-east-1b-fz98j   <none>           <none>
router-default-5fb74f8b7b-kwtrk   1/1     Running   0          54m   10.128.2.8   jiwei-0930-01-dck5b-worker-us-east-1a-82zdp   <none>           <none>
$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days