Description of problem:
Used an OCP 4.7 build. Observed that the machine-config cluster operator is in a Degraded=True state.

Version-Release number of selected component (if applicable):

# oc version
Client Version: 4.7.0-0.nightly-ppc64le-2020-12-08-141649
Server Version: 4.7.0-0.nightly-ppc64le-2020-12-08-141649
Kubernetes Version: v1.19.2+ad738ba

# oc get co
NAME                                       VERSION                                     AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         True       4h17m
baremetal                                  4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
cloud-credential                           4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      18h
cluster-autoscaler                         4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
config-operator                            4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
console                                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h25m
csi-snapshot-controller                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      16h
dns                                        4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
etcd                                       4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
image-registry                             4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
ingress                                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h30m
insights                                   4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
kube-apiserver                             4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
kube-controller-manager                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
kube-scheduler                             4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
kube-storage-version-migrator              4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h30m
machine-api                                4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
machine-approver                           4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
machine-config                                                                         False       True          True       17h
marketplace                                4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      16h
monitoring                                 4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h27m
network                                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h29m
node-tuning                                4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
openshift-apiserver                        4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      16h
openshift-controller-manager               4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        True          False      17h
openshift-samples                          4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
operator-lifecycle-manager                 4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
operator-lifecycle-manager-catalog         4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
operator-lifecycle-manager-packageserver   4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      4h23m
service-ca                                 4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h
storage                                    4.7.0-0.nightly-ppc64le-2020-12-08-141649   True        False         False      17h

# oc get pods --all-namespaces | grep -v "Running\|Completed"
NAMESPACE                  NAME                      READY   STATUS                 RESTARTS   AGE
openshift-kube-apiserver   installer-4-master-2      0/1     CreateContainerError   0          17h
openshift-kube-apiserver   kube-apiserver-master-1   0/5     Init:0/1               0          34s

# oc describe co machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
              include.release.openshift.io/self-managed-high-availability: true
              include.release.openshift.io/single-node-developer: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-12-09T14:31:30Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:exclude.release.openshift.io/internal-openshift-hosted:
          f:include.release.openshift.io/self-managed-high-availability:
          f:include.release.openshift.io/single-node-developer:
      f:spec:
      f:status:
        .:
        f:versions:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-12-09T14:31:30Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:extension:
          .:
          f:master:
          f:worker:
        f:relatedObjects:
    Manager:         machine-config-operator
    Operation:       Update
    Time:            2020-12-10T09:19:27Z
  Resource Version:  456067
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:               f050562f-9ca7-4047-81aa-334885cbcc0b
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-12-09T15:38:12Z
    Message:               Working towards 4.7.0-0.nightly-ppc64le-2020-12-08-141649
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2020-12-09T15:41:44Z
    Reason:                One or more machine config pool is degraded, please see `oc get mcp` for further details and resolve before upgrading
    Status:                False
    Type:                  Upgradeable
    Last Transition Time:  2020-12-09T15:52:21Z
    Message:               Unable to apply 4.7.0-0.nightly-ppc64le-2020-12-08-141649: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-12-09T15:52:21Z
    Message:               Cluster not available for 4.7.0-0.nightly-ppc64le-2020-12-08-141649
    Status:                False
    Type:                  Available
  Extension:
    Master:  pool is degraded because nodes fail with "3 nodes are reporting degraded status on sync": "Node master-0 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-f577f6a0e84fd5116925f21e284d2d3b\\\" not found\", Node master-2 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-f577f6a0e84fd5116925f21e284d2d3b\\\" not found\", Node master-1 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-f577f6a0e84fd5116925f21e284d2d3b\\\" not found\""
    Worker:  all 2 nodes are at latest configuration rendered-worker-40b7876d3bc63acdabb74b34a23b4cf2
  Related Objects:
    Group:
    Name:      openshift-machine-config-operator
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  controllerconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  kubeletconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  containerruntimeconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigs
    Group:
    Name:
    Resource:  nodes
Events:        <none>
The detail provided above is not enough to analyze what's going on. Can you please provide a must-gather log?
Getting the following error with the must-gather output:

# oc adm must-gather
[must-gather      ] OUT Using must-gather plug-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e18b2922002f09bd3a367ec760fa8974625adbf30e2338339a2fb7d2c4030d37
[must-gather      ] OUT namespace/openshift-must-gather-d9dfn created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-kqbxb created
[must-gather      ] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e18b2922002f09bd3a367ec760fa8974625adbf30e2338339a2fb7d2c4030d37 created
[must-gather-n979w] OUT gather did not start: timed out waiting for the condition
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-kqbxb deleted
[must-gather      ] OUT namespace/openshift-must-gather-d9dfn deleted
error: gather did not start for pod must-gather-n979w: timed out waiting for the condition

Let me know if any specific logs are required.
Hi Sinny,

I am getting the same behavior as mentioned above (machine-config cluster operator in a degraded state).

# oc version
Client Version: 4.7.0-0.nightly-ppc64le-2020-12-04-050650
Server Version: 4.7.0-0.nightly-ppc64le-2020-12-04-050650
Kubernetes Version: v1.19.2+ad738ba

# oc get pods --all-namespaces | grep -v "Running\|Completed"
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE

# oc describe co machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  exclude.release.openshift.io/internal-openshift-hosted: true
              include.release.openshift.io/self-managed-high-availability: true
              include.release.openshift.io/single-node-developer: true
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-12-10T13:46:14Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:exclude.release.openshift.io/internal-openshift-hosted:
          f:include.release.openshift.io/self-managed-high-availability:
          f:include.release.openshift.io/single-node-developer:
      f:spec:
      f:status:
        .:
        f:versions:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-12-10T13:46:15Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:extension:
          .:
          f:master:
          f:worker:
        f:relatedObjects:
    Manager:         machine-config-operator
    Operation:       Update
    Time:            2020-12-11T10:17:51Z
  Resource Version:  506473
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:               dcc0400e-e48a-47e6-af2c-5d4bd37e96f9
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-12-10T13:55:00Z
    Message:               Working towards 4.7.0-0.nightly-ppc64le-2020-12-04-050650
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2020-12-10T13:56:59Z
    Reason:                One or more machine config pool is degraded, please see `oc get mcp` for further details and resolve before upgrading
    Status:                False
    Type:                  Upgradeable
    Last Transition Time:  2020-12-10T14:19:03Z
    Message:               Unable to apply 4.7.0-0.nightly-ppc64le-2020-12-04-050650: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2020-12-10T14:19:03Z
    Message:               Cluster not available for 4.7.0-0.nightly-ppc64le-2020-12-04-050650
    Status:                False
    Type:                  Available
  Extension:
    Master:  pool is degraded because nodes fail with "3 nodes are reporting degraded status on sync": "Node master-0 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-4815cfdfa836bd807b316b48bbd134a6\\\" not found\", Node master-1 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-4815cfdfa836bd807b316b48bbd134a6\\\" not found\", Node master-2 is reporting: \"machineconfig.machineconfiguration.openshift.io \\\"rendered-master-4815cfdfa836bd807b316b48bbd134a6\\\" not found\""
    Worker:  all 2 nodes are at latest configuration rendered-worker-516f195712733d60a1a38afba9946dbe
  Related Objects:
    Group:
    Name:      openshift-machine-config-operator
    Resource:  namespaces
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigpools
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  controllerconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  kubeletconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  containerruntimeconfigs
    Group:     machineconfiguration.openshift.io
    Name:
    Resource:  machineconfigs
    Group:
    Name:
    Resource:  nodes
Events:        <none>

Must-gather logs are shared here: https://drive.google.com/file/d/1JGO1nYWqrdDj87Vk5PXS4XCm1a29-sDi/view?usp=sharing (shared with both sinny and alisha). Let me know if anything else is required to debug this issue.

Regards,
Amit
Hi, Could you give general view access to that must-gather? I've also requested access to help take a look. For rendered-xxx not found issues, we also need the "day 1 config" on the nodes to check for the mismatch. Could you follow the steps in https://github.com/openshift/machine-config-operator/issues/2114#issuecomment-700122866 and gather that for us as well?
Hi, I have added you to the must-gather logs, and access has been given to the day-1 config logs from all my cluster nodes: https://drive.google.com/drive/folders/1W_5kN_NAcbWmlHzFoTE3OFw38TqBCpVo?usp=sharing
Possibly another data point. I installed 4.7.0-0.nightly-ppc64le-2020-12-14-080110, and got `machine-config` operator Degraded, while `authentication` is fine (not Degraded). If you want something from this cluster, let me know.
@Amit Can you please update the must-gather to have general viewer permission?

I just launched a cluster successfully using 4.7.0-0.nightly-2020-12-14-165231, so it's unclear to me whether this is a persistent problem or not.

Looking at the nightly page, the builds reported here aren't available. I'm not super familiar with 4.7.0-0.nightly-ppc64le; they seem to be specific builds, and all of the reports here seem to be using them. Where are those images coming from? How do they differ from what's in https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.7.0-0.nightly ? What type of cluster is this?
Based on the Slack thread https://coreos.slack.com/archives/CH76YSYSC/p1608039447264800 this seems to be something called Power, and the failures seem specific to that. Is there someone who owns it?
To supplement a bit, your cluster looks very odd. For one, there are three rendered-master configs being referenced in various places, when there is only one on the system. This is the first time I've seen this.

So in your operator status (as well as in all 3 daemon pods on the masters), it's looking for something called "rendered-master-4815cfdfa836bd807b316b48bbd134a6", which should have been the rendered config these masters booted with, generated by the bootstrap machine-config-controller. This is what's supposed to be in "/etc/mcs-machine-config-content.json", BUT if you look at the file contents you gave for /etc/machine-config-content.json, you'll see instead it's rendered-master-7aaca65a4c6888af4ad9b13674624cd4, something different. And finally, the one on the system generated by the master MCC is named rendered-master-29d0c4d6ab430c17be92453ecc000935, yet another config. So I don't even know what would have been in rendered-master-4815cfdfa836bd807b316b48bbd134a6.

On top of that, if you compare the contents of the two machineconfigs, you'll see that there is quite a big difference. The one on the system:
1. has the CA (which the bootstrap one doesn't)
2. is missing /etc/kubernetes/apiserver-url.env
3. is missing /etc/systemd/system/kubelet.service.d/20-logging.conf

So depending on what added those, that might be a good place to start checking.

Again, to reiterate:
1. rendered-master-4815cfdfa836bd807b316b48bbd134a6 -> what the master node thinks its desired config is. Did you modify that yourself?
2. rendered-master-7aaca65a4c6888af4ad9b13674624cd4 -> what the master was served (its config at the beginning) -> we have this, but it's not referenced anywhere
3. rendered-master-29d0c4d6ab430c17be92453ecc000935 -> the actual running config, which is missing the above stanzas

I feel like how you're deploying the MCO is probably causing the above drift.
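The kind of content drift described above can be checked mechanically. Below is a minimal sketch, not tied to the actual dumps from this cluster: the file lists are illustrative stand-ins, and with real data you would load each rendered MachineConfig (e.g. from `oc get machineconfig <name> -o yaml`) and compare `spec.config.storage.files[].path` between the bootstrap-rendered and in-cluster-rendered configs:

```python
# Sketch: diff the Ignition-managed file paths of two rendered MachineConfigs.
# The dicts below are hypothetical stand-ins shaped like MachineConfig specs,
# not real dumps from the cluster in this bug.

def ignition_paths(machineconfig: dict) -> set:
    """Collect the Ignition file paths carried by a MachineConfig dict."""
    files = (machineconfig.get("spec", {})
                          .get("config", {})
                          .get("storage", {})
                          .get("files", []))
    return {f["path"] for f in files}

bootstrap_mc = {"spec": {"config": {"storage": {"files": [
    {"path": "/etc/kubernetes/apiserver-url.env"},
    {"path": "/etc/systemd/system/kubelet.service.d/20-logging.conf"},
]}}}}

in_cluster_mc = {"spec": {"config": {"storage": {"files": [
    {"path": "/etc/kubernetes/ca.crt"},  # stand-in for "has the CA"
]}}}}

only_bootstrap = ignition_paths(bootstrap_mc) - ignition_paths(in_cluster_mc)
only_in_cluster = ignition_paths(in_cluster_mc) - ignition_paths(bootstrap_mc)
print("only in bootstrap config: ", sorted(only_bootstrap))
print("only in in-cluster config:", sorted(only_in_cluster))
```

Any non-empty difference would point at which component added (or failed to add) those stanzas.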
Ah, I just noticed something:

in-cluster: "machineconfiguration.openshift.io/generated-by-controller-version": "bb2630cee2ee42a39638e968da9601c726467494"
bootstrap:  "machineconfiguration.openshift.io/generated-by-controller-version": "34d1cf050f528b220e95a32b747c48a5004ce1f0"

Normally these are the same. It sounds like you have a different controller version deployed at bootstrap than in-cluster?
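This class of mismatch is easy to flag by comparing the annotation across rendered configs. A rough sketch (the helper name is mine; the two hashes are the ones quoted in this comment, used here as sample metadata):

```python
# Sketch: detect a bootstrap vs. in-cluster machine-config-controller mismatch
# by comparing the generated-by-controller-version annotation.
ANNOTATION = "machineconfiguration.openshift.io/generated-by-controller-version"

def controller_versions(rendered_configs: dict) -> dict:
    """Map each rendered config's label to the MCC commit that generated it."""
    return {name: meta.get("annotations", {}).get(ANNOTATION)
            for name, meta in rendered_configs.items()}

# Sample metadata using the two hashes observed on this cluster.
configs = {
    "in-cluster": {"annotations": {ANNOTATION: "bb2630cee2ee42a39638e968da9601c726467494"}},
    "bootstrap":  {"annotations": {ANNOTATION: "34d1cf050f528b220e95a32b747c48a5004ce1f0"}},
}

versions = controller_versions(configs)
# Normally every rendered config carries the same controller version;
# more than one distinct value indicates drifted deployments.
if len(set(versions.values())) > 1:
    print("controller version mismatch:", versions)
```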
Noting that this is not a standard deployment:

  uid: 2cccd45e-232a-474c-ad82-5be1c6a2d58a
spec:
  cloudConfig:
    name: ""
  platformSpec:
    type: None
status:
  ...
  platform: None
  platformStatus:
    type: None

Aside from Yu Qi's points above:
- Can you please give us info about these deployments? How does an install for this differ from an OCP install?
- Can you give us steps to reproduce?
- When did you start seeing this failure? How often is it failing?

Poking around, I found some jobs that seem to use the same nightlies (4.7.0-0.nightly-ppc64le...): https://testgrid.k8s.io/redhat-openshift-ocp-release-4.7-informing#release-openshift-origin-installer-e2e-remote-libvirt - but I'm not seeing the MCP failure reflected there (the master pools are not degraded). Could someone explain why?

Moving over to multi-arch to get more info about this.
We don't think this bug will be resolved before the end of the current sprint (Dec 26th). So I'm adding UpcomingSprint for this sprint.
Just adding that, after a conversation with Hiro, this is only seen on a PowerVS/VM deployment; we have not seen it in a libvirt environment or a plain bare-metal env.
My cluster doesn't show `authentication` operator as Degraded, but still shows the same (no version and Degraded) for `machine-config` operator. And given #10, there may be something that Power automation does differently than x86 during install? Yussuf @yshaikh, can you take a look?
(Sorry, new to BZ, reposting my previous comment with better link/cc, etc..) My cluster doesn't show `authentication` operator as Degraded, but still shows the same (no version and Degraded) for `machine-config` operator. And given https://bugzilla.redhat.com/show_bug.cgi?id=1906321#c10, there may be something that Power automation does differently than x86 during install? Yussuf @yshaikh, can you take a look?
I do not think anything changed between these builds in the automation we have. Also, I think new Power VM deploys have been working fine for the past couple of days. Maybe @aprabhu can confirm this?
After running some more tests on PowerVM, we see that this issue occurs only when a cluster proxy is used for the installation. Let me know if you need any additional logs for this.

# oc get co machine-config
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
machine-config             False       True          True       4d20h

# oc get proxy cluster -oyaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2020-12-17T07:39:41Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
        f:noProxy: {}
        f:trustedCA:
          .: {}
          f:name: {}
      f:status:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-12-17T07:39:41Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:noProxy: {}
    manager: cluster-network-operator
    operation: Update
    time: "2020-12-17T07:48:15Z"
  name: cluster
  resourceVersion: "3248"
  uid: 0eaa8cae-cf1f-4564-8e79-c0015ad9c5dd
spec:
  httpProxy: http://pravind-47test-bastion-0:3128
  httpsProxy: http://pravind-47test-bastion-0:3128
  noProxy: .pravind-47test.redhat.com,192.168.26.0/24
  trustedCA:
    name: ""
status:
  httpProxy: http://pravind-47test-bastion-0:3128
  httpsProxy: http://pravind-47test-bastion-0:3128
  noProxy: .cluster.local,.pravind-47test.redhat.com,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.26.0/24,api-int.pravind-47test.redhat.com,etcd-0.,etcd-1.,etcd-2.,localhost
After chatting with the Power team, it was determined that this bug is a "partial" blocker, as it blocks the team's testing when an external proxy is enabled. Therefore, I'm changing the "blocker" flag to "Blocker+".
Relaying a message from Prashanth. Hi all, could someone check if this bug is the same issue as BZ 1901034?
Yes, it does appear to be similar. The details below are from a cluster on PowerVM and are identical to the findings in BZ 1901034.

1. The value for etcdDiscoveryDomain is empty:

# oc get infrastructures.config.openshift.io cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  creationTimestamp: "2020-12-31T06:51:25Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:cloudConfig:
          .: {}
          f:name: {}
        f:platformSpec:
          .: {}
          f:type: {}
      f:status:
        .: {}
        f:apiServerInternalURI: {}
        f:apiServerURL: {}
        f:etcdDiscoveryDomain: {}
        f:infrastructureName: {}
        f:platform: {}
        f:platformStatus:
          .: {}
          f:type: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-12-31T06:51:25Z"
  name: cluster
  resourceVersion: "541"
  uid: ef112996-3e44-4cae-bcd6-e1ffa941d0b9
spec:
  cloudConfig:
    name: ""
  platformSpec:
    type: None
status:
  apiServerInternalURI: https://api-int.satwsin-latest.169.48.22.245.nip.io:6443
  apiServerURL: https://api.satwsin-latest.169.48.22.245.nip.io:6443
  etcdDiscoveryDomain: ""
  infrastructureName: satwsin-latest-97p77
  platform: None
  platformStatus:
    type: None

2. The etcd FQDN entries for noProxy are missing the domain:

# oc get proxies.config.openshift.io cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  creationTimestamp: "2020-12-31T06:51:26Z"
  generation: 1
  managedFields:
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
        f:noProxy: {}
        f:trustedCA:
          .: {}
          f:name: {}
      f:status:
        .: {}
        f:httpProxy: {}
        f:httpsProxy: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-12-31T06:51:26Z"
  - apiVersion: config.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:noProxy: {}
    manager: cluster-network-operator
    operation: Update
    time: "2020-12-31T07:01:01Z"
  name: cluster
  resourceVersion: "3409"
  uid: 2cae1f6c-0576-4cf1-87c8-1a76b426a7ac
spec:
  httpProxy: http://satwsin-latest-bastion-0:3128
  httpsProxy: http://satwsin-latest-bastion-0:3128
  noProxy: .satwsin-latest.169.48.22.245.nip.io,192.168.25.0/24
  trustedCA:
    name: ""
status:
  httpProxy: http://satwsin-latest-bastion-0:3128
  httpsProxy: http://satwsin-latest-bastion-0:3128
  noProxy: .cluster.local,.satwsin-latest.169.48.22.245.nip.io,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.25.0/24,api-int.satwsin-latest.169.48.22.245.nip.io,etcd-0.,etcd-1.,etcd-2.,localhost
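The malformed entries (etcd-0., etcd-1., etcd-2. are bare host labels with a trailing dot and no domain, consistent with an empty etcdDiscoveryDomain) can be spotted with a small check; a sketch over the status.noProxy string from this cluster:

```python
# Flag noProxy entries that end in a bare dot, i.e. a hostname whose
# domain suffix was never appended (what an empty etcdDiscoveryDomain
# would produce for the etcd-N. entries).
no_proxy = (".cluster.local,.satwsin-latest.169.48.22.245.nip.io,.svc,"
            "10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,"
            "192.168.25.0/24,api-int.satwsin-latest.169.48.22.245.nip.io,"
            "etcd-0.,etcd-1.,etcd-2.,localhost")

malformed = [entry for entry in no_proxy.split(",")
             if entry.endswith(".") and entry != "."]
print(malformed)  # -> ['etcd-0.', 'etcd-1.', 'etcd-2.']
```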
Closing this bug: as evident in Comment 22, it shares identical findings with BZ 1901034. The explanation of the regression that caused this bug can be found here: https://bugzilla.redhat.com/show_bug.cgi?id=1909502#c3

Please feel free to re-open if the issue still occurs after the other bug has been fixed, or if a similar bug with variant errors occurs.

*** This bug has been marked as a duplicate of bug 1901034 ***
Verified the bug on 4.7.0-0.nightly-ppc64le-2021-01-24-004926.

For a fresh install with global proxy enabled on 4.7.0-0.nightly-ppc64le-2021-01-24-004926, after installation completed successfully, checked the noProxy list set in proxy/cluster:

CO status:
---
machine-config   4.7.0-0.nightly-ppc64le-2021-01-24-004926   True   False   False   143m

# oc get proxy cluster -o yaml
---
status:
  noProxy: .cluster.local,.satwsin1-proxy.redhat.com,.svc,10.0.0.0/16,10.128.0.0/14,127.0.0.1,172.30.0.0/16,9.114.96.0/22,api-int.satwsin1-proxy.redhat.com,localhost
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days