Description of problem:
When setting up a cluster behind a proxy, installation fails because the machine-config operator never becomes ready. Commands that need the API server to connect to a node (oc logs, oc exec, oc port-forward) also fail with a proxyconnect error.

After investigation, we found that https://github.com/openshift/installer/pull/2829 was merged recently, but the Network Operator is not aware of that change, so it overwrites noProxy with a list that omits machineNetwork:
https://github.com/openshift/cluster-network-operator/blob/master/pkg/util/proxyconfig/no_proxy.go#L29

Known effects: all API-to-node networking is blocked, including the following:
1. Installation never succeeds.
2. All commands that need the API server to connect to nodes fail, such as oc logs, oc exec, etc.
3. Some node metrics targets are expected to show as RED.

Version-Release number of the following components:
./openshift-install 4.4.0-0.nightly-2020-02-03-224632
built from commit 725b71dce1d41c98e368ad9277e14c7ce9a9cb25
release image registry.svc.ci.openshift.org/ocp/release@sha256:5a51afee81638f559a92a7a1d910c24af8c4f458ea5baf8075fc3d81cf35f6fe

How reproducible:
Always

Steps to Reproduce:
1. Set up an IPI cluster with a proxy configured in install-config.yaml
2. Try to run oc logs

Actual results:
$ oc -n openshift-machine-config-operator logs -f machine-config-controller-6965dbc744-bpt98
Error from server: Get https://192.168.0.20:10250/containerLogs/openshift-machine-config-operator/machine-config-controller-6965dbc744-bpt98/machine-config-controller?follow=true: proxyconnect tcp: x509: certificate signed by unknown authority

Expected results:
Should not get such an error.

Additional info:
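To illustrate the behavior this report expects, here is a minimal Go sketch of building the effective noProxy list so that the machineNetwork CIDRs are included. It is not the Network Operator's actual code: the function name `mergeNoProxy`, its parameters, and the idea of passing the default entries as an `extra` slice are all assumptions made for the example. The sample values are taken from the noProxy lists shown in this bug.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mergeNoProxy is a hypothetical helper, not the operator's real function.
// It unions the user-supplied noProxy entries with the cluster, service and
// machine network CIDRs plus any extra defaults, and returns a sorted,
// comma-separated list without duplicates.
func mergeNoProxy(userNoProxy string, clusterNetworks, serviceNetworks, machineNetworks, extra []string) string {
	set := map[string]struct{}{}
	add := func(items ...string) {
		for _, item := range items {
			item = strings.TrimSpace(item)
			if item != "" {
				set[item] = struct{}{}
			}
		}
	}

	add(strings.Split(userNoProxy, ",")...)
	add(clusterNetworks...)
	add(serviceNetworks...)
	add(machineNetworks...) // the entry that was dropped in this bug
	add(extra...)

	merged := make([]string, 0, len(set))
	for entry := range set {
		merged = append(merged, entry)
	}
	sort.Strings(merged)
	return strings.Join(merged, ",")
}

func main() {
	fmt.Println(mergeNoProxy(
		"test.no-proxy.com",
		[]string{"10.128.0.0/14"},
		[]string{"172.30.0.0/16"},
		[]string{"10.0.0.0/16"},
		[]string{"127.0.0.1", "localhost", ".svc", ".cluster.local", "169.254.169.254"},
	))
}
```

With the machineNetwork argument left empty, the result loses `10.0.0.0/16`, which matches the difference between the bootstrap and cluster lists reported in this bug.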
Also hit this issue in a UPI-on-AWS install with proxy enabled.

`machineNetwork` field in install-config.yaml:

proxy:
  httpProxy: http://proxy-user1:xxx@QE_PROXY_PLACEHOLDER:3128
  httpsProxy: http://proxy-user1:xxx@QE_PROXY_PLACEHOLDER:3128
  noProxy: test.no-proxy.com
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/16
  networkType: OVNKubernetes
  machineNetwork:
  - cidr: 10.0.0.0/16

Triggered the installation; it failed.

$ ./openshift-install wait-for install-complete --dir '/home/installer3/workspace/Launch Environment Flexy/workdir/install-dir'
level=info msg="Waiting up to 30m0s for the cluster at https://api.jialiu-25822.qe.devcluster.openshift.com:6443 to initialize..."
level=info msg="Cluster operator insights Disabled is False with : "
level=info msg="Cluster operator machine-config Available is False with : Cluster not available for 4.4.0-0.nightly-2020-02-03-081920"
level=error msg="Cluster operator machine-config Degraded is True with RequiredPoolsFailed: Failed to resync 4.4.0-0.nightly-2020-02-03-081920 because: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty: pool is degraded because nodes fail with \"3 nodes are reporting degraded status on sync\": \"Node ip-10-0-61-87.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\", Node ip-10-0-59-238.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\", Node ip-10-0-70-4.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\"\", retrying"
level=fatal msg="failed to initialize the cluster: Cluster operator machine-config is reporting a failure: Failed to resync 4.4.0-0.nightly-2020-02-03-081920 because: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: configuration status for pool master is empty: pool is degraded because nodes fail with \"3 nodes are reporting degraded status on sync\": \"Node ip-10-0-61-87.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\", Node ip-10-0-59-238.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\", Node ip-10-0-70-4.us-east-2.compute.internal is reporting: \\\"machineconfig.machineconfiguration.openshift.io \\\\\\\"rendered-master-fdb913d94892563827998728eb2d3557\\\\\\\" not found\\\"\", retrying"

After the installation failure, I compared the noProxy list on the bootstrap node with the one in the cluster and found a difference: 10.0.0.0/16 (the machineNetwork CIDR) is present only in the bootstrap list.

# sdiff b.log c.log
.cluster.local                                      .cluster.local
.svc                                                .svc
.us-east-2.compute.internal                         .us-east-2.compute.internal
10.0.0.0/16                                       <
10.128.0.0/14                                       10.128.0.0/14
127.0.0.1                                           127.0.0.1
169.254.169.254                                     169.254.169.254
172.30.0.0/16                                       172.30.0.0/16
api-int.jialiu-25822.qe.devcluster.openshift.com    api-int.jialiu-25822.qe.devcluster.openshift.com
etcd-0.jialiu-25822.qe.devcluster.openshift.com     etcd-0.jialiu-25822.qe.devcluster.openshift.com
etcd-1.jialiu-25822.qe.devcluster.openshift.com     etcd-1.jialiu-25822.qe.devcluster.openshift.com
etcd-2.jialiu-25822.qe.devcluster.openshift.com     etcd-2.jialiu-25822.qe.devcluster.openshift.com
localhost                                           localhost
test.no-proxy.com                                   test.no-proxy.com

b.log is the noProxy list captured by running `env |grep -i proxy` on the bootstrap node; c.log is the noProxy list captured by running `oc get proxy cluster -o yaml`.
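The same comparison can be scripted. The following is a rough Go sketch, assuming both noProxy lists have already been split into one entry per element; the function name `missingEntries` is hypothetical, and the sample entries are a subset of the two lists compared above.

```go
package main

import "fmt"

// missingEntries returns the entries of reference that do not appear in
// actual, preserving the order of reference. Hypothetical helper for
// comparing the bootstrap noProxy list against the cluster one.
func missingEntries(reference, actual []string) []string {
	seen := make(map[string]struct{}, len(actual))
	for _, entry := range actual {
		seen[entry] = struct{}{}
	}

	var missing []string
	for _, entry := range reference {
		if _, ok := seen[entry]; !ok {
			missing = append(missing, entry)
		}
	}
	return missing
}

func main() {
	// Subset of the bootstrap list (b.log) from this report.
	bootstrap := []string{".cluster.local", ".svc", "10.0.0.0/16", "10.128.0.0/14", "172.30.0.0/16"}
	// Subset of the cluster list (c.log) from this report.
	cluster := []string{".cluster.local", ".svc", "10.128.0.0/14", "172.30.0.0/16"}

	fmt.Println(missingEntries(bootstrap, cluster)) // prints [10.0.0.0/16]
}
```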
Verified this bug on 4.5.0-0.nightly-2020-03-06-190457

# oc get cm cluster-config-v1 -n kube-system -o yaml | grep cidr -A 2
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:

# oc get proxy cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2020-03-09T06:40:15Z"
  generation: 1
  name: cluster
  resourceVersion: "680"
  selfLink: /apis/config.openshift.io/v1/proxies/cluster
  uid: 6d53c4fd-ddc3-4ad1-a6d0-3b3f4f83d5fc
spec:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-12-160-4.us-east-2.compute.amazonaws.com:3128
  httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-12-160-4.us-east-2.compute.amazonaws.com:3128
  noProxy: test.no-proxy.com
  trustedCA:
    name: ""
status:
  httpProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-12-160-4.us-east-2.compute.amazonaws.com:3128
  httpsProxy: http://proxy-user1:JYgU8qRZV4DY4PXJbxJK@ec2-3-12-160-4.us-east-2.compute.amazonaws.com:3128
  noProxy: .cluster.local,.svc,.us-east-2.compute.internal,10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16,api-int.zzhao45.qe.devcluster.openshift.com,etcd-0.zzhao45.qe.devcluster.openshift.com,etcd-1.zzhao45.qe.devcluster.openshift.com,etcd-2.zzhao45.qe.devcluster.o
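A quick way to sanity-check a noProxy value like the one in status is to test whether a node address falls inside one of its CIDR entries. The Go sketch below is a simplified check under stated assumptions: the function name `ipBypassesProxy` is hypothetical, it only handles plain IP and CIDR entries (not the domain-suffix entries or the full matching rules a real HTTP client applies), and the address 10.0.61.87 is only an illustrative node IP inside the 10.0.0.0/16 machineNetwork.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// ipBypassesProxy reports whether addr matches an IP or CIDR entry in the
// comma-separated noProxy list. Hypothetical, simplified check: hostname
// and domain-suffix entries are ignored.
func ipBypassesProxy(noProxy, addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false
	}
	for _, entry := range strings.Split(noProxy, ",") {
		entry = strings.TrimSpace(entry)
		if _, cidr, err := net.ParseCIDR(entry); err == nil {
			if cidr.Contains(ip) {
				return true
			}
			continue
		}
		if other := net.ParseIP(entry); other != nil && other.Equal(ip) {
			return true
		}
	}
	return false
}

func main() {
	// IP and CIDR entries from the verified status.noProxy.
	fixed := "10.0.0.0/16,10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16"
	// The same list without the machineNetwork CIDR, as seen before the fix.
	broken := "10.128.0.0/14,127.0.0.1,169.254.169.254,172.30.0.0/16"

	node := "10.0.61.87" // illustrative node address in machineNetwork
	fmt.Println(ipBypassesProxy(fixed, node))  // prints true
	fmt.Println(ipBypassesProxy(broken, node)) // prints false
}
```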
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409