Description of problem:

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-11-24-111327

How reproducible:
Always

Steps to Reproduce:
1. Drop the internet connection from the private subnets in this VPC.
2. Launch a proxy server in the public subnets of this VPC.
3. Trigger a UPI install on AWS with proxy enabled.

Actual results:
Installation failed.

$ ./openshift-install wait-for install-complete --dir '/home/installer2/workspace/Launch Environment Flexy/workdir/install-dir'
level=info msg="Waiting up to 30m0s for the cluster at https://api.jialiu425.qe.devcluster.openshift.com:6443 to initialize..."
level=fatal msg="failed to initialize the cluster: Cluster operator machine-config is reporting a failure: Failed to resync 4.2.0-0.nightly-2019-11-24-111327 because: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty: pool is degraded because nodes fail with \"3 nodes are reporting degraded status on sync\": \"Node ip-10-0-72-7.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\\\\\" not found\\\", Node ip-10-0-61-75.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\\\\\" not found\\\", Node ip-10-0-56-33.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\\\\\" not found\\\"\", retrying"

After the installation failed, I checked the clusteroperators; only machine-config is in the Degraded state.
# oc describe co machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-11-26T09:42:04Z
  Generation:          1
  Resource Version:    29724
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:                 00c49323-1031-11ea-9aab-02a0248741b0
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-11-26T09:42:04Z
    Message:               Cluster not available for 4.2.0-0.nightly-2019-11-24-111327
    Status:                False
    Type:                  Available
    Last Transition Time:  2019-11-26T09:42:04Z
    Message:               Cluster is bootstrapping 4.2.0-0.nightly-2019-11-24-111327
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2019-11-26T09:52:30Z
    Message:               Failed to resync 4.2.0-0.nightly-2019-11-24-111327 because: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty: pool is degraded because nodes fail with "3 nodes are reporting degraded status on sync": "Node ip-10-0-72-7.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\", Node ip-10-0-61-75.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\", Node ip-10-0-56-33.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\"", retrying
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2019-11-26T09:52:30Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:
    Last Sync Error:  pool master has not progressed to latest configuration: configuration status for pool master is empty: pool is degraded because nodes fail with "3 nodes are reporting degraded status on sync": "Node ip-10-0-72-7.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\", Node ip-10-0-61-75.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\", Node ip-10-0-56-33.us-east-2.compute.internal is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-a3be962e3ac25c82a501d894dc950be5\\\" not found\"", retrying
    Worker:           all 2 nodes are at latest configuration rendered-worker-711e924795d1c0192f461a1c551f621f
  Related Objects:
    Group:
    Name:      openshift-machine-config-operator
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:      master
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:      worker
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:      machine-config-controller
    Resource:  controllerconfigs
  Versions:
    Name:     operator
    Version:  4.2.0-0.nightly-2019-11-24-111327
Events:       <none>

Expected results:
Installation should succeed.

Additional info:
1. The released version 4.2.8 + proxy works well.
2. Nightly build 4.2.0-0.nightly-2019-11-24-111327 + proxy: failed.
3. Nightly build 4.2.0-0.nightly-2019-11-24-111327 + no proxy: passed.
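
To check which rendered MachineConfig objects the operator status refers to, and whether they actually exist on the cluster, something like the following might help. This is only a sketch: the helper name rendered_configs is hypothetical, and the pattern assumes rendered config names always look like rendered-<pool>-<32 hex characters>, as in the output above, for the master and worker pools only.

```shell
# Extract the unique rendered MachineConfig names found in text read from stdin.
# Assumption: names follow the rendered-<pool>-<32 hex chars> pattern seen in this report.
rendered_configs() {
  grep -oE 'rendered-(master|worker)-[0-9a-f]{32}' | sort -u
}

# Possible usage against a live cluster (needs oc and a valid kubeconfig):
#   oc describe co machine-config | rendered_configs |
#     while read -r name; do
#       oc get machineconfig "$name" >/dev/null 2>&1 || echo "missing: $name"
#     done
```

In this report, the name to look for would be rendered-master-a3be962e3ac25c82a501d894dc950be5, which all three master nodes report as not found.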
The MCO didn't change between the nightlies that you mentioned:

12:19:01 [~] oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-11-24-111327 | grep machine-config-operator
  machine-config-operator  https://github.com/openshift/machine-config-operator  d780d197a9c5848ba786982c0c4aaa7487297046
12:19:13 [~] oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-11-24-111327 | grep machine-config-operator
  machine-config-operator  https://github.com/openshift/machine-config-operator  d780d197a9c5848ba786982c0c4aaa7487297046
(In reply to Antonio Murdaca from comment #3)
> The MCO didn't change between the nightlies that you mentioned:
>
> 12:19:01 [~] oc adm release info --commits
> registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-11-24-111327
> | grep machine-config-operator
>   machine-config-operator
> https://github.com/openshift/machine-config-operator
> d780d197a9c5848ba786982c0c4aaa7487297046
> 12:19:13 [~] oc adm release info --commits
> registry.svc.ci.openshift.org/ocp/release:4.2.0-0.nightly-2019-11-24-111327
> | grep machine-config-operator
>   machine-config-operator
> https://github.com/openshift/machine-config-operator
> d780d197a9c5848ba786982c0c4aaa7487297046

ok, forget that, didn't read "no proxy"
Sounds like a side effect introduced by Bug 1770223.

In the fix for 1770223, api.jialiu425.qe.devcluster.openshift.com is removed; inside the cluster, api-int should be used instead of api.
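
If the removal refers to the proxy's noProxy list (an assumption; this comment does not say where the hostname was removed from), the effect can be sketched as follows. The helper in_no_proxy and the sample list are hypothetical, the api-int hostname is inferred from the usual api-int.<cluster domain> naming, and the matching rule (exact host, or domain suffix for entries starting with a dot) is a simplification of what real clients do.

```shell
# Return 0 if host $1 is covered by the comma-separated no-proxy list $2.
# Runs in a subshell so the IFS and globbing changes stay local.
in_no_proxy() (
  host=$1
  IFS=,
  set -f   # disable globbing while the list is split on commas
  for entry in $2; do
    case $entry in
      .*) case $host in *"$entry") exit 0 ;; esac ;;   # domain suffix match
      *)  if [ "$host" = "$entry" ]; then exit 0; fi ;; # exact host match
    esac
  done
  exit 1
)

# Hypothetical list after the fix: api-int is listed, api is not.
NO_PROXY_LIST='.cluster.local,.svc,api-int.jialiu425.qe.devcluster.openshift.com'

for h in api.jialiu425.qe.devcluster.openshift.com \
         api-int.jialiu425.qe.devcluster.openshift.com; do
  if in_no_proxy "$h" "$NO_PROXY_LIST"; then
    echo "$h: direct"
  else
    echo "$h: via proxy"
  fi
done
```

Under these assumptions, any in-cluster component still pointed at the api hostname would be sent through the proxy, while one using api-int would connect directly.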
(In reply to Johnny Liu from comment #6)
> Sound like some side effect introduced by Bug 1770223.
>
>
> In the fix of 1770223, api.jialiu425.qe.devcluster.openshift.com is removed,
> inside cluster, should use api-int to instead api.

likely, yeah