Description of problem:

This was not an issue on OCP 4.3. On OCP 4.4, the issue occurs when adding a new workload node for automation by scaling an existing machineset on an AWS IPI cluster, with the providerSpec value publicIp set to true.

oc debug node/<newly_added_node> fails with:

# oc debug node/ip-10-0-7-124.us-west-2.compute.internal
Starting pod/ip-10-0-7-124us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.7.124
If you don't see a command prompt, try pressing enter.

Removing debug pod ...
Error from server: error dialing backend: remote error: tls: internal error

# oc logs -n openshift-cluster-machine-approver machine-approver-7b9ffbdbd5-67k4x -c machine-approver-controller | grep "ip-10-0-7-124.us-west-2.compute.internal"
I0128 18:02:41.296779 1 csr_check.go:418] retrieving serving cert from ip-10-0-7-124.us-west-2.compute.internal (10.0.7.124:10250)
I0128 18:02:41.300586 1 csr_check.go:183] Falling back to machine-api authorization for ip-10-0-7-124.us-west-2.compute.internal
I0128 18:02:41.300600 1 main.go:181] CSR csr-7z7f6 not authorized: DNS name 'ec2-54-203-167-77.us-west-2.compute.amazonaws.com' not in machine names: ip-10-0-7-124.us-west-2.compute.internal ip-10-0-7-124.us-west-2.compute.internal
I0128 18:02:41.300609 1 main.go:217] Error syncing csr csr-7z7f6: DNS name 'ec2-54-203-167-77.us-west-2.compute.amazonaws.com' not in machine names: ip-10-0-7-124.us-west-2.compute.internal ip-10-0-7-124.us-west-2.compute.internal

Manually approving the pending certs restores oc debug node functionality on that cluster:

oc adm certificate approve certificatesigningrequest.certificates.k8s.io/csr-zhp5l

After the certs are approved manually, `oc debug node/<node_ip>` succeeds.

Version-Release number of selected component (if applicable):

# oc version
Client Version: 4.4.0-0.nightly-2020-01-24-141203
Server Version: 4.4.0-0.nightly-2020-01-24-141203
Kubernetes Version: v1.17.1

How reproducible:

Always on OCP 4.4

Steps to Reproduce:

1. Install an AWS IPI cluster (3 master, 3 worker nodes) of OCP 4.4 with the 4.4.0-0.nightly-2020-01-24-141203 payload
2. oc get machineset -n openshift-machine-api -o yaml > first_worker_node_machineset.yaml
3. cp first_worker_node_machineset.yaml new_workload_node_machineset.yaml
4. vim new_workload_node_machineset.yaml. Edit the machineset name and labels, and ensure that the providerSpec value has publicIp: true
5. oc create -f new_workload_node_machineset.yaml

Actual results:

Several CSRs are pending approval, and `oc debug node/<new_workload_node_ip>` fails with:

Error from server: error dialing backend: remote error: tls: internal error

Expected results:

CSRs should be approved automatically when scaling the cluster and adding a new workload node for automation.

Additional info:

Links to must-gather logs, the machineset, and machine-approver logs will be provided in the next comment.
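
The manual workaround from the description can be sketched as a small filter over `oc get csr` output. This is only an illustration: `pending_csrs` is a hypothetical helper name, and it assumes the CONDITION column is the last column of the table, as in the `oc get csr` output elsewhere in this bug. Approving requests in bulk skips the check the machine-approver would normally perform, so review the list before approving.

```shell
#!/bin/sh
# pending_csrs (hypothetical helper): read `oc get csr` table output on stdin
# and print the NAME of every request whose last column is exactly "Pending".
# Assumption: CONDITION is the last column of the table.
pending_csrs() {
    # NR > 1 skips the header row; $NF is the last whitespace-separated field.
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Example use against a live cluster (not executed here):
#   oc get csr | pending_csrs | while read -r name; do
#       oc adm certificate approve "$name"
#   done
```

The loop form approves one request per invocation and does nothing when no request is pending.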
This was a regression introduced by:
https://github.com/openshift/cluster-api-provider-aws/pull/285

It has been fixed with:
https://github.com/openshift/cluster-api-provider-aws/pull/288
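
For readers following the approver log in the description: the CSR was rejected because one of its DNS names was not among the names recorded for the machine. The sketch below models only that kind of membership check. It is not the machine-approver's code, and `unlisted_dns_names` is a hypothetical name introduced for illustration.

```shell
#!/bin/sh
# unlisted_dns_names MACHINE_NAMES CSR_DNS_NAMES   (hypothetical helper)
# Both arguments are whitespace-separated lists of names.
# Prints every CSR DNS name that is not one of the machine's names.
# Empty output means every requested name is known for the machine.
unlisted_dns_names() {
    machine_names=$1
    csr_dns_names=$2
    # Word splitting on the unquoted lists is intentional here.
    for dns in $csr_dns_names; do
        found=0
        for name in $machine_names; do
            if [ "$dns" = "$name" ]; then
                found=1
                break
            fi
        done
        if [ "$found" -eq 0 ]; then
            printf '%s\n' "$dns"
        fi
    done
    return 0
}

# Example with the names from the log in the description:
#   unlisted_dns_names \
#       "ip-10-0-7-124.us-west-2.compute.internal" \
#       "ec2-54-203-167-77.us-west-2.compute.amazonaws.com"
```

With the names from the failing log, the external `ec2-...amazonaws.com` name is reported as unlisted. Once a machine's names include its external DNS name, the same call prints nothing.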
Verified in 4.4.0-0.nightly-2020-02-05-181112

A machine with publicIp set to true is provisioned and approved.

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    machine.openshift.io/instance-state: running
  creationTimestamp: "2020-02-06T05:38:39Z"
  finalizers:
  - machine.machine.openshift.io
  generateName: qe-jhou06-f8dr2-worker-us-east-2a-publicip-
  generation: 2
  labels:
    machine.openshift.io/cluster-api-cluster: qe-jhou06-f8dr2
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
    machine.openshift.io/cluster-api-machineset: qe-jhou06-f8dr2-worker-us-east-2a
    machine.openshift.io/instance-type: m4.large
    machine.openshift.io/region: us-east-2
    machine.openshift.io/zone: us-east-2a
  name: qe-jhou06-f8dr2-worker-us-east-2a-publicip-ltrpk
  namespace: openshift-machine-api
  ownerReferences:
  - apiVersion: machine.openshift.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: MachineSet
    name: qe-jhou06-f8dr2-worker-us-east-2a-publicip
    uid: cbf4296b-948a-43ca-bc74-2e1c69b6ea3a
  resourceVersion: "58548"
  selfLink: /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machines/qe-jhou06-f8dr2-worker-us-east-2a-publicip-ltrpk
  uid: b2c395ac-b0e5-45c7-819b-1056b34c8c39
spec:
  metadata:
    creationTimestamp: null
  providerID: aws:///us-east-2a/i-002b70030a8d0af6c
  providerSpec:
    value:
      ami:
        id: ami-0a8ba019bc9d4bd64
      apiVersion: awsproviderconfig.openshift.io/v1beta1
      blockDevices:
      - ebs:
          iops: 0
          volumeSize: 120
          volumeType: gp2
      credentialsSecret:
        name: aws-cloud-credentials
      deviceIndex: 0
      iamInstanceProfile:
        id: qe-jhou06-f8dr2-worker-profile
      instanceType: m4.large
      kind: AWSMachineProviderConfig
      metadata:
        creationTimestamp: null
      placement:
        availabilityZone: us-east-2a
        region: us-east-2
      publicIp: true
      securityGroups:
      - filters:
        - name: tag:Name
          values:
          - qe-jhou06-f8dr2-worker-sg
      subnet:
        filters:
        - name: tag:Name
          values:
          - qe-jhou06-f8dr2-private-us-east-2a
      tags:
      - name: kubernetes.io/cluster/qe-jhou06-f8dr2
        value: owned
      userDataSecret:
        name: worker-user-data
status:
  addresses:
  - address: 10.0.131.133
    type: InternalIP
  - address: 3.135.218.125
    type: ExternalIP
  - address: ip-10-0-131-133.us-east-2.compute.internal
    type: InternalDNS
  - address: ip-10-0-131-133.us-east-2.compute.internal
    type: Hostname
  - address: ec2-3-135-218-125.us-east-2.compute.amazonaws.com
    type: ExternalDNS
  lastUpdated: "2020-02-06T05:43:21Z"
  nodeRef:
    kind: Node
    name: ip-10-0-131-133.us-east-2.compute.internal
    uid: cecd777b-8291-46fd-8a43-a41a77b3a24a
  phase: Running
  providerStatus:
    apiVersion: awsproviderconfig.openshift.io/v1beta1
    conditions:
    - lastProbeTime: "2020-02-06T05:38:41Z"
      lastTransitionTime: "2020-02-06T05:38:41Z"
      message: machine successfully created
      reason: MachineCreationSucceeded
      status: "True"
      type: MachineCreation
    instanceId: i-002b70030a8d0af6c
    instanceState: running
    kind: AWSMachineProviderStatus

oc get csr
NAME        AGE     REQUESTOR                                                                   CONDITION
csr-hxw7z   9m35s   system:node:ip-10-0-131-133.us-east-2.compute.internal                      Approved,Issued
csr-vmk49   9m48s   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued