Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2085335

Summary: Update from 4.8.39 to 4.9.31 is failing on OCP with dualstack cluster network
Product: OpenShift Container Platform
Reporter: Ashish Vyawahare <avyawahare87>
Component: Etcd
Assignee: Dean West <dwest>
Status: CLOSED DUPLICATE
QA Contact: ge liu <geliu>
Severity: high
Priority: medium
Version: 4.8
CC: smerrow, tjungblu
Hardware: All
OS: Linux
Type: Bug
Last Closed: 2022-09-08 14:14:33 UTC
Attachments:
- ClusterUpdateError
- must-gather data part1 (namespaces from assisted-installer to openshift-kni-infra)
- must-gather part2 (namespaces openshift-kube-apiserver and openshift-kube-apiserver-operator)
- must-gather part3 (namespaces from openshift-kube-controller-manager to openshift-vsphere-infra)

Description Ashish Vyawahare 2022-05-13 04:12:14 UTC
Created attachment 1879295 [details]
ClusterUpdateError

Description of problem:
We created an OCP cluster (version 4.8.39) with a dual-stack cluster network.
The VM interface on every node (3 master nodes and 2 worker nodes) is configured with both IPv4 and IPv6 addresses.
Cluster installation works fine.

However, the cluster update from 4.8.39 to 4.9.31 fails.
The update is stuck in the Partial state with the following failure reason:

"
EtcdCertSignerControllerDegraded: [x509: certificate is valid for 10.30.1.4, not 2101::4, x509: certificate is valid for ::1, 10.30.1.4, 127.0.0.1, ::1, not 2101::4]
"

[core@ocp-avyaw-nyu1mo-ctrl-3 ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.39    True        True          12h     Unable to apply 4.9.31: wait has exceeded 40 minutes for these operators: etcd
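The EtcdCertSignerControllerDegraded message above is a standard Go x509 hostname-verification failure: the etcd serving certificate lists only the node's IPv4 address and loopback in its IP SANs, so any client connecting to the node's IPv6 address 2101::4 rejects the certificate. The mismatch can be illustrated locally with openssl (a self-contained sketch using a throwaway certificate whose SANs mirror the failing one, not the cluster's actual secrets; requires OpenSSL 1.1.1+):

```shell
# Generate a throwaway certificate whose IP SANs mirror the failing one
# in this bug: IPv4 node address and loopback only, no IPv6 node address.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/etcd-test-key.pem -out /tmp/etcd-test-cert.pem \
  -subj "/CN=etcd-serving-test" \
  -addext "subjectAltName=IP:10.30.1.4,IP:127.0.0.1"

# List the IP SANs: 2101::4 is absent, which is why verification of a
# connection to that address fails with "certificate is valid for ...,
# not 2101::4".
openssl x509 -in /tmp/etcd-test-cert.pem -noout -ext subjectAltName
```

On a live cluster, the same `openssl x509 -noout -ext subjectAltName` step can be pointed at the etcd serving certificates (stored as secrets in the openshift-etcd namespace) to confirm whether they carry both the IPv4 and IPv6 node addresses; a dual-stack-ready certificate must list both.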

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create an OCP cluster (4.8.39) with a dual-stack network.
2. Update the cluster to 4.9.31.

Actual results:
Cluster update is stuck in the Partial state with the error EtcdCertSignerControllerDegraded: [x509: certificate is valid for 10.30.1.4, not 2101::4, x509: certificate is valid for ::1, 10.30.1.4, 127.0.0.1, ::1, not 2101::4]


Expected results:

The cluster update from 4.8.39 to 4.9.31 completes successfully.

Additional info:

ClusterID: 9ff77ed0-e858-4b07-b30d-ab5f4692dddf
ClusterVersion: Updating to "4.9.31" from "4.8.39" for 13 hours: Working towards 4.9.31: 71 of 738 done (9% complete)
ClusterOperators:
	clusteroperator/authentication is degraded because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()
OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication ()
	clusteroperator/etcd is degraded because EtcdCertSignerControllerDegraded: [x509: certificate is valid for 10.30.1.4, not 2101::4, x509: certificate is valid for ::1, 10.30.1.4, 127.0.0.1, ::1, not 2101::4]
	clusteroperator/machine-config is degraded because Unable to apply 4.9.31: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 1, updated: 1, unavailable: 1)
	clusteroperator/openshift-apiserver is degraded because APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()



[core@ocp-avyaw-nyu1mo-ctrl-3 ~]$ oc describe network
Name:         cluster
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         Network
Metadata:
  Creation Timestamp:  2022-05-12T13:19:32Z
  Generation:          2
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:clusterNetwork:
        f:externalIP:
          .:
          f:policy:
        f:networkType:
        f:serviceNetwork:
      f:status:
    Manager:      cluster-bootstrap
    Operation:    Update
    Time:         2022-05-12T13:19:32Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:clusterNetwork:
        f:networkType:
        f:serviceNetwork:
    Manager:         cluster-network-operator
    Operation:       Update
    Time:            2022-05-12T13:21:11Z
  Resource Version:  3087
  UID:               a21bdaa7-de95-4b7f-8b40-ed84946fe11d
Spec:
  Cluster Network:
    Cidr:         10.128.0.0/14
    Host Prefix:  23
    Cidr:         2001::/60
    Host Prefix:  64
  External IP:
    Policy:
  Network Type:  Contrail
  Service Network:
    172.30.0.0/16
    2222::/108
Status:
  Cluster Network:
    Cidr:         10.128.0.0/14
    Host Prefix:  23
    Cidr:         2001::/60
    Host Prefix:  64
  Network Type:   Contrail
  Service Network:
    172.30.0.0/16
    2222::/108
Events:  <none>

Comment 1 Ashish Vyawahare 2022-05-13 06:15:04 UTC
Created attachment 1879313 [details]
must-gather data part1 --> namespace included from assisted-installer to openshift-kni-infra

Comment 2 Ashish Vyawahare 2022-05-13 06:33:13 UTC
Created attachment 1879315 [details]
must-gather part2 --> Included namespaces openshift-kube-apiserver and openshift-kube-apiserver-operator

Comment 3 Ashish Vyawahare 2022-05-13 06:37:17 UTC
Created attachment 1879316 [details]
must-gather part3 --> Included namespaces from openshift-kube-controller-manager to openshift-vsphere-infra

Comment 12 Ashish Vyawahare 2022-06-28 06:27:35 UTC
Hi,
Any update on this bug?

I see there is a similar bug: https://bugzilla.redhat.com/show_bug.cgi?id=2046335