Bug 2032559

Summary: CNO allows migration to dual-stack in unsupported configurations
Product: OpenShift Container Platform Reporter: Dan Winship <danw>
Component: Networking Assignee: Andreas Karis <akaris>
Networking sub component: ovn-kubernetes QA Contact: Ross Brattain <rbrattai>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: low CC: bpickard, rbrattai
Version: 4.11   
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-10 10:40:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Dan Winship 2021-12-14 17:28:24 UTC
CNO will (try to) let you migrate from single-stack to dual-stack regardless of what platform you are running on; it ought to only allow it on bare metal or "none".

(That is, the `infrastructure.config.openshift.io` object named "cluster" should have a `.Status.PlatformStatus.Type` of `configv1.BareMetalPlatformType` or `configv1.NonePlatformType`.)
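A minimal sketch of what such a platform gate could look like, assuming nothing about the actual CNO code: `PlatformType` and its constants are local stand-ins for the `configv1` types so the example compiles on its own, and `validateDualStackPlatform` is a hypothetical helper name.

```go
package main

import "fmt"

// PlatformType is a local stand-in for configv1.PlatformType from
// github.com/openshift/api/config/v1, so this sketch has no dependencies.
type PlatformType string

const (
	BareMetalPlatformType PlatformType = "BareMetal"
	NonePlatformType      PlatformType = "None"
	AWSPlatformType       PlatformType = "AWS"
	VSpherePlatformType   PlatformType = "VSphere"
)

// validateDualStackPlatform is a hypothetical helper (not the CNO function).
// It returns nil only for the two platform types on which a migration to
// dual-stack should be allowed.
func validateDualStackPlatform(platform PlatformType) error {
	switch platform {
	case BareMetalPlatformType, NonePlatformType:
		return nil
	default:
		return fmt.Errorf("dual-stack is not supported on platform type %q", platform)
	}
}

func main() {
	platforms := []PlatformType{
		BareMetalPlatformType, NonePlatformType, AWSPlatformType, VSpherePlatformType,
	}
	for _, p := range platforms {
		if err := validateDualStackPlatform(p); err != nil {
			fmt.Printf("%s: rejected: %v\n", p, err)
			continue
		}
		fmt.Printf("%s: allowed\n", p)
	}
}
```

In the operator, the platform value would presumably be read from the "cluster" infrastructure object's status rather than passed in directly.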

(At least one customer has apparently tried to migrate a vSphere cluster to dual-stack.)


Also, there's a FIXME in pkg/network/render.go:isNetworkChangeSafe():

	// FIXME: the errors here currently do not actually mention dual-stack since it's
	// not supported yet.

i.e., all the errors just say "cannot change ClusterNetwork" / "cannot change ServiceNetwork", but now that we're no longer hiding the existence of dual-stack, we should give clearer errors, such as "unsupported change to ClusterNetwork" or "cannot change primary IP family when migrating to dual-stack".
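As a rough illustration of the clearer errors, here is a sketch under simplifying assumptions: it only handles the case of one existing CIDR gaining a second CIDR of the other family, and `checkDualStackMigration` and `isIPv4` are hypothetical names, not the existing `isNetworkChangeSafe()` logic.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// isIPv4 reports whether the given CIDR string is an IPv4 prefix.
func isIPv4(cidr string) (bool, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, fmt.Errorf("invalid CIDR %q: %w", cidr, err)
	}
	return p.Addr().Is4(), nil
}

// checkDualStackMigration is a hypothetical check for a single field
// (for example "ClusterNetwork" or "ServiceNetwork"). Simplifying
// assumption: the old list has exactly one CIDR and the new list adds
// exactly one CIDR of the other IP family after it.
func checkDualStackMigration(field string, oldCIDRs, newCIDRs []string) error {
	if len(oldCIDRs) != 1 || len(newCIDRs) != 2 {
		return fmt.Errorf("unsupported change to %s", field)
	}
	oldV4, err := isIPv4(oldCIDRs[0])
	if err != nil {
		return err
	}
	firstV4, err := isIPv4(newCIDRs[0])
	if err != nil {
		return err
	}
	secondV4, err := isIPv4(newCIDRs[1])
	if err != nil {
		return err
	}
	// The first entry decides the primary IP family; it must not flip.
	if firstV4 != oldV4 {
		return errors.New("cannot change primary IP family when migrating to dual-stack")
	}
	// The original entry must be kept as-is, and the new entry must
	// belong to the other IP family.
	if newCIDRs[0] != oldCIDRs[0] || secondV4 == firstV4 {
		return fmt.Errorf("unsupported change to %s", field)
	}
	return nil
}

func main() {
	old := []string{"172.30.0.0/16"}
	fmt.Println(checkDualStackMigration("ServiceNetwork", old, []string{"172.30.0.0/16", "fd02::/112"}))
	fmt.Println(checkDualStackMigration("ServiceNetwork", old, []string{"fd02::/112", "172.30.0.0/16"}))
	fmt.Println(checkDualStackMigration("ServiceNetwork", old, []string{"172.30.0.0/16", "10.0.0.0/16"}))
}
```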

Comment 2 Dan Winship 2022-01-12 12:58:23 UTC
(commented on PR)

Comment 4 Andreas Karis 2022-02-14 13:16:25 UTC
Verification steps:

Create a single-stack cluster on AWS.

Once the cluster is up, apply this patch to start a migration to dual-stack:

~~~
cat <<'EOF' > patch.yaml 
- op: add
  path: /spec/clusterNetwork/-
  value: 
    cidr: fd01::/48
    hostPrefix: 64
- op: add
  path: /spec/serviceNetwork/-
  value: fd02::/112 
EOF
~~~

~~~
oc patch network.config.openshift.io cluster \
  --type='json' --patch-file patch.yaml
~~~

Make sure that you get the following error message:
~~~
$ oc get co | grep network
network                                    4.10.0-0.nightly-2022-01-18-044014   True        True          True       8h      The cluster configuration is invalid (DualStack deployments are allowed only for the BareMetal Platform type or the None Platform type). Use 'oc edit network.config.openshift.io cluster' to fix.
~~~

Make sure that migration from single-stack to dual-stack works on a bare-metal cluster.

Comment 7 Ross Brattain 2022-02-22 12:18:50 UTC
Migration to dualstack prevented on AWS.

network                                    4.11.0-0.nightly-2022-02-18-121223   True        False         True       29m     The cluster configuration is invalid (DualStack deployments are allowed only for the BareMetal Platform type or the None Platform type). Use 'oc edit network.config.openshift.io cluster' to fix.


  status:
    conditions:
    - lastTransitionTime: "2022-02-22T11:41:13Z"
      status: "False"
      type: ManagementStateDegraded
    - lastTransitionTime: "2022-02-22T12:13:28Z"
      message: The cluster configuration is invalid (DualStack deployments are allowed
        only for the BareMetal Platform type or the None Platform type). Use 'oc edit
        network.config.openshift.io cluster' to fix.

Testing baremetal migration.

Comment 8 Andreas Karis 2022-03-23 14:25:31 UTC
Hi Ross! Could you get to testing the baremetal migration? Thanks!

Comment 9 Ross Brattain 2022-03-31 20:52:42 UTC
Verified dual-stack conversion allowed on baremetal 4.11.0-0.nightly-2022-03-29-152521

    platform:
      baremetal:
        apiVIP: 192.168.123.5
        bootstrapProvisioningIP: 172.22.0.2
        clusterProvisioningIP: 172.22.0.3
        externalBridge: baremetal-0
        hosts:
        - bmc:
            address: redfish://192.168.123.1:8000/redfish/v1/Systems/5ec73a70-55a9-4730-9a3e-98aff549566f
            disableCertificateVerification: true


  spec:
    clusterNetwork:
    - cidr: 10.128.0.0/14
      hostPrefix: 23
    - cidr: fd01::/48
      hostPrefix: 64
    defaultNetwork:
      ovnKubernetesConfig:
        gatewayConfig:
          routingViaHost: false
        genevePort: 6081
        mtu: 1400
        policyAuditConfig:
          destination: "null"
          maxFileSize: 50
          rateLimit: 20
          syslogFacility: local0
      type: OVNKubernetes
    deployKubeProxy: false
    disableMultiNetwork: false
    disableNetworkDiagnostics: false
    logLevel: Normal
    managementState: Managed
    observedConfig: null
    operatorLogLevel: Normal
    serviceNetwork:
    - 172.30.0.0/16
    - fd02::/112
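
One way to sanity-check a spec like the one above is to confirm that each network list contains both an IPv4 and an IPv6 CIDR. The following is only a sketch: `isDualStack` is a hypothetical helper, and the CIDR values are copied from the spec shown in this comment.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isDualStack is a hypothetical helper that reports whether the list
// contains at least one IPv4 prefix and at least one IPv6 prefix.
func isDualStack(cidrs []string) (bool, error) {
	var hasV4, hasV6 bool
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return false, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		if p.Addr().Is4() {
			hasV4 = true
		} else {
			hasV6 = true
		}
	}
	return hasV4 && hasV6, nil
}

func main() {
	// Values taken from the spec in this comment.
	networks := map[string][]string{
		"clusterNetwork": {"10.128.0.0/14", "fd01::/48"},
		"serviceNetwork": {"172.30.0.0/16", "fd02::/112"},
	}
	for name, cidrs := range networks {
		ok, err := isDualStack(cidrs)
		if err != nil {
			fmt.Printf("%s: error: %v\n", name, err)
			continue
		}
		fmt.Printf("%s dual-stack: %t\n", name, ok)
	}
}
```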

Comment 11 errata-xmlrpc 2022-08-10 10:40:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069