Bug 2005440
| Summary: | Dual-stack KubeAPI multi-node cluster with single Machine Network does not fail validation | | |
|---|---|---|---|
| Product: | Red Hat Advanced Cluster Management for Kubernetes | Reporter: | Mat Kowalski <mko> |
| Component: | Infrastructure Operator | Assignee: | Mat Kowalski <mko> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | Christopher Dawson <cdawson> |
| Priority: | unspecified | | |
| Version: | rhacm-2.4 | CC: | asegurap, ccrum, fpercoco, mfilanov, trwest, yfirst |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | rhacm-2.5 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | AI-Team-Platform | | |
| Fixed In Version: | OCP-Metal-v1.0.27.0 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2009760 | Environment: | |
| Last Closed: | 2022-10-03 20:18:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2009760, 2013207 | | |
Description
Mat Kowalski
2021-09-17 16:28:32 UTC
There seem to be two scenarios for this validator, and only one of them fails. It looks like the cluster creation flow bypasses the validator function, while the cluster update flow is correct.
+++ Scenario 1
Initial ACI contains
* 2 Cluster Networks (IPv4 + IPv6)
* 2 Service Networks (IPv4 + IPv6)
* 1 Machine Network (IPv4)
In this case the validation does not fail and the cluster ends up with the following conditions:
```
- lastProbeTime: "2021-10-08T12:19:37Z"
  lastTransitionTime: "2021-10-08T12:19:37Z"
  message: SyncOK
  reason: SyncOK
  status: "True"
  type: SpecSynced
- lastProbeTime: "2021-10-08T12:19:37Z"
  lastTransitionTime: "2021-10-08T12:19:37Z"
  message: 'The cluster''s validations are failing: Clusters must have exactly 3
    dedicated masters. Please either add hosts, or disable the worker host,Hosts
    have not been discovered yet,Hosts have not been discovered yet,Hosts have not
    been discovered yet'
  reason: ValidationsFailing
  status: "False"
  type: Validated
```
+++ Scenario 2
Initial ACI contains
* 2 Cluster Networks (IPv4 + IPv6)
* 2 Service Networks (IPv4 + IPv6)
* 2 Machine Networks (IPv4 + IPv6)
An updated ACI contains
* 2 Cluster Networks (IPv4 + IPv6)
* 2 Service Networks (IPv4 + IPv6)
* 1 Machine Network (IPv4)
In this case the validation fails and we see:
```
- lastProbeTime: "2021-10-08T12:16:37Z"
  lastTransitionTime: "2021-10-08T12:16:37Z"
  message: 'The Spec could not be synced due to an input error: Expected 2 machine
    networks, found 1'
  reason: InputError
  status: "False"
  type: SpecSynced
```
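The rule that the update flow enforces in Scenario 2 can be sketched roughly as below. This is only an illustration, not the assisted-service code: the function names (`isDualStack`, `validateDualStackMachineNetworks`) are hypothetical, the sketch assumes the rule applies only when both the cluster networks and the service networks are dual-stack, and the CIDR values in `main` are illustrative examples.

```go
package main

import (
	"fmt"
	"net"
)

// isDualStack reports whether cidrs contains at least one IPv4 and one
// IPv6 network. Hypothetical helper, not part of assisted-service.
func isDualStack(cidrs []string) (bool, error) {
	var hasV4, hasV6 bool
	for _, c := range cidrs {
		ip, _, err := net.ParseCIDR(c)
		if err != nil {
			return false, err
		}
		if ip.To4() != nil {
			hasV4 = true
		} else {
			hasV6 = true
		}
	}
	return hasV4 && hasV6, nil
}

// validateDualStackMachineNetworks is a hypothetical validator: when the
// cluster and service networks are both dual-stack (an assumption of this
// sketch), exactly two machine networks are expected.
func validateDualStackMachineNetworks(clusterNets, serviceNets, machineNets []string) error {
	for _, nets := range [][]string{clusterNets, serviceNets} {
		dual, err := isDualStack(nets)
		if err != nil {
			return err
		}
		if !dual {
			return nil // not a dual-stack request, so the rule does not apply
		}
	}
	if len(machineNets) != 2 {
		return fmt.Errorf("Expected 2 machine networks, found %d", len(machineNets))
	}
	return nil
}

func main() {
	// Illustrative values only.
	cluster := []string{"10.128.0.0/14", "fd01::/48"}
	service := []string{"172.30.0.0/16", "fd02::/112"}
	machine := []string{"192.168.111.0/24"}
	fmt.Println(validateDualStackMachineNetworks(cluster, service, machine))
}
```

For the behaviour to be consistent, a check of this kind would presumably need to run on both the creation and the update path, so that Scenario 1 is rejected the same way as Scenario 2.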
Creating a cluster via ACI does not include Machine Networks in `params.NewClusterParams` (note `MachineNetworks:[]` in the first log line below). The object looks like this:
```
time="2021-10-08T12:58:12Z" level=info msg="CHOCOBOMB: Creating cluster with params: &{AdditionalNtpSource:<nil> BaseDNSDomain:hive.example.com ClusterNetworkCidr:<nil> Clu
sterNetworkHostPrefix:0 ClusterNetworks:[0xc0025f9830 0xc0025f9860] CPUArchitecture: DiskEncryption:<nil> HighAvailabilityMode:<nil> HTTPProxy:<nil> HTTPSProxy:<nil> Hypert
hreading:<nil> IngressVip:192.168.111.101 MachineNetworks:[] Name:0xc001fcd710 NetworkType:<nil> NoProxy:<nil> OcpReleaseImage: OlmOperators:[] OpenshiftVersion:0xc001fcd72
0 Platform:<nil> PullSecret:0xc001fcd730 SchedulableMasters:<nil> ServiceNetworkCidr:<nil> ServiceNetworks:[0xc0015f2ae0 0xc0015f2b00] SSHPublicKey:ssh-rsa AAAAB3NzaC1yc2EA
AAADAQABAAABgQC1b/IibQkel9sU5OYuNkoL3qda0vzgx2Sb2lmF5hFsZ3L2D+w+Ixkwjw1g0jQAsQ+00rlKYgdxVmUWYpGE2ZKLQ75kHzs4qChupTMb1rJL5YH8xVeKuCN86WkW2rn5vT7gY8r+m/odCBkL4WQDxGVXdHcevhO6
klehsb2PdhqKkbm+xNMrHSOWOnxbV2O7U4VdWgHMcPt9vlSf4ewNHMNer0cTmmqIIg9Lqbp5p8zcM20uSdMQBjar+A2PHu29CyjqVMczu7S6G/DLbTG4GnovcPJwOiNUgOLEt13kNLRbODXl610DmESS4Si4bAZvi555fXmoAgrW
4uLCZ8zOEgMaz+G6yhcMqJ47WjznhbJRJeWmqz3pjd+252SCrznAmXrbD/mpjYZulDLPIejENJzd7LRBp3DBDQtgrWeP+04CosNYD2vXWV+Xlofd/uSdVzyY+kKkuatGx7R13PHK+WlgxW3albEPEgz8T+3IRKNNfDmwtEem6R0K
AhTuC0volGk= root.lab.eng.rdu2.redhat.com UserManagedNetworking:0xc0018d5ba9 VipDhcpAllocation:0xc0018d5ba8}" func="github.com/openshift/assisted-service/inter
nal/bminventory.(*bareMetalInventory).RegisterClusterInternal" file="/go/src/github.com/openshift/origin/internal/bminventory/inventory.go:477" cluster_id=ba34e502-a244-440
6-961e-a9fd78a3c0fa go-id=592 pkg=Inventory request_id=f5508ad2-c4a7-4734-b036-cf9fffd8db9d
[...]
time="2021-10-08T12:58:12Z" level=info msg="ClusterDeployment Reconcile started" func="github.com/openshift/assisted-service/internal/controller/controllers.(*ClusterDeplo$
mentsReconciler).Reconcile" file="/go/src/github.com/openshift/origin/internal/controller/controllers/clusterdeployments_controller.go:117" cluster_deployment=dual-aci clu$
ter_deployment_namespace=assisted-installer-2 go-id=592 request_id=ed803828-2f16-476a-a868-f1bab0f5864e
time="2021-10-08T12:58:12Z" level=info msg="update cluster ba34e502-a244-4406-961e-a9fd78a3c0fa with params: &{AdditionalNtpSource:<nil> APIVip:0xc001b2a820 APIVipDNSName:$
nil> BaseDNSDomain:<nil> ClusterNetworkCidr:<nil> ClusterNetworkHostPrefix:<nil> ClusterNetworks:[] DiskEncryption:<nil> HTTPProxy:<nil> HTTPSProxy:<nil> Hyperthreading:<ni
l> IngressVip:<nil> MachineNetworkCidr:<nil> MachineNetworks:[0xc001c09dc0] Name:<nil> NetworkType:0xc001b2a810 NoProxy:<nil> OlmOperators:[] Platform:<nil> PullSecret:<nil
> SchedulableMasters:<nil> ServiceNetworkCidr:<nil> ServiceNetworks:[] SSHPublicKey:<nil> UserManagedNetworking:<nil> VipDhcpAllocation:<nil>}" func="github.com/openshift/a
ssisted-service/internal/bminventory.(*bareMetalInventory).v2UpdateClusterInternal" file="/go/src/github.com/openshift/origin/internal/bminventory/inventory.go:2375" go-id=
592 pkg=Inventory request_id=ed803828-2f16-476a-a868-f1bab0f5864e
time="2021-10-08T12:58:12Z" level=info msg="CHOCOBOMB: Updating cluster with params: &{AdditionalNtpSource:<nil> APIVip:0xc001b2a820 APIVipDNSName:<nil> BaseDNSDomain:<nil>
ClusterNetworkCidr:<nil> ClusterNetworkHostPrefix:<nil> ClusterNetworks:[] DiskEncryption:<nil> HTTPProxy:<nil> HTTPSProxy:<nil> Hyperthreading:<nil> IngressVip:<nil> Mach
ineNetworkCidr:<nil> MachineNetworks:[0xc001c09dc0] Name:<nil> NetworkType:0xc001b2a810 NoProxy:<nil> OlmOperators:[] Platform:<nil> PullSecret:<nil> SchedulableMasters:<ni
l> ServiceNetworkCidr:<nil> ServiceNetworks:[] SSHPublicKey:<nil> UserManagedNetworking:<nil> VipDhcpAllocation:<nil>}" func="github.com/openshift/assisted-service/internal
/bminventory.(*bareMetalInventory).validateAndUpdateClusterParams" file="/go/src/github.com/openshift/origin/internal/bminventory/inventory.go:2153" go-id=592 pkg=Inventory
request_id=ed803828-2f16-476a-a868-f1bab0f5864e
time="2021-10-08T12:58:12Z" level=info msg="Updated clusterDeployment assisted-installer-2/dual-aci" func="github.com/openshift/assisted-service/internal/controller/control
lers.(*ClusterDeploymentsReconciler).updateIfNeeded" file="/go/src/github.com/openshift/origin/internal/controller/controllers/clusterdeployments_controller.go:739" agent_c
luster_install=dual-aci agent_cluster_install_namespace=assisted-installer-2 cluster_deployment=dual-aci cluster_deployment_namespace=assisted-installer-2 go-id=592 request
_id=ed803828-2f16-476a-a868-f1bab0f5864e
time="2021-10-08T12:58:12Z" level=info msg="ClusterDeployment Reconcile ended" func="github.com/openshift/assisted-service/internal/controller/controllers.(*ClusterDeployme
ntsReconciler).Reconcile.func1" file="/go/src/github.com/openshift/origin/internal/controller/controllers/clusterdeployments_controller.go:114" agent_cluster_install=dual-a
ci agent_cluster_install_namespace=assisted-installer-2 cluster_deployment=dual-aci cluster_deployment_namespace=assisted-installer-2 go-id=592 request_id=ed803828-2f16-476
a-a868-f1bab0f5864e
```
G2Bsync 941430821 comment CrystalChun Tue, 12 Oct 2021 20:04:17 UTC G2Bsync
PRs are already merged:
https://github.com/openshift/assisted-service/pull/2731
https://github.com/openshift/assisted-service/pull/2660

Verified using ACM 2.5.0-DOWNSTREAM-2022-04-11-09-21-38
on creation of a dual-stack ACI with a single machineNetwork:
oc get agentclusterinstalls.extensions.hive.openshift.io spoke-0 -o json | jq '.spec.networking'
{
"clusterNetwork": [
{
"cidr": "10.128.0.0/14",
"hostPrefix": 23
},
{
"cidr": "fd01::/48",
"hostPrefix": 64
}
],
"machineNetwork": [
{
"cidr": "fd2e:6f44:5dd8:5::/64"
}
],
"serviceNetwork": [
"172.30.0.0/16",
"fd02::/112"
]
}
The spec is not synced, with a clear message:
oc get agentclusterinstalls.extensions.hive.openshift.io spoke-0 -o json | jq '.status.conditions | map(select(.type=="SpecSynced"))'
[
{
"lastProbeTime": "2022-04-13T07:08:04Z",
"lastTransitionTime": "2022-04-13T07:08:04Z",
"message": "The Spec could not be synced due to an input error: Expected 2 machine networks, found 1",
"reason": "InputError",
"status": "False",
"type": "SpecSynced"
}
]
Patching in the missing machine CIDR allows the ACI to sync.
Updating the ACI to remove the second machine CIDR brings the ACI back to the not-synced state, so this flow did not regress.
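If this check were automated, the `jq` filter on `SpecSynced` could be mirrored in Go along the following lines. This is a minimal sketch under stated assumptions: the `condition` type and `findCondition` helper are hypothetical names, and the input is assumed to be the JSON array from `.status.conditions`, already fetched by some other means.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition holds the condition fields that the verification looks at.
// Hypothetical type for this sketch only.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// findCondition returns the first condition of the given type from a JSON
// array of conditions, or an error if parsing fails or none matches.
func findCondition(raw []byte, condType string) (*condition, error) {
	var conds []condition
	if err := json.Unmarshal(raw, &conds); err != nil {
		return nil, err
	}
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i], nil
		}
	}
	return nil, fmt.Errorf("condition %q not found", condType)
}

func main() {
	// Sample input modelled on the SpecSynced condition shown above.
	raw := []byte(`[{
		"message": "The Spec could not be synced due to an input error: Expected 2 machine networks, found 1",
		"reason": "InputError",
		"status": "False",
		"type": "SpecSynced"
	}]`)
	c, err := findCondition(raw, "SpecSynced")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s status=%s reason=%s\n", c.Type, c.Status, c.Reason)
}
```

A regression test could then assert that `Status` is `"False"` and `Reason` is `"InputError"` after creating a dual-stack ACI with a single machine network.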