Bug 1889620 - [Azure] - Machineset not scaling when publicIP:true in disconnected Azure environment
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.7.0
Assignee: Danil Grigorev
QA Contact: Milind Yadav
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-10-20 08:44 UTC by Milind Yadav
Modified: 2020-11-19 11:04 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
jspeed: needinfo-


Links
System ID Priority Status Summary Last Updated
Github openshift machine-api-operator pull 746 None closed Bug 1889620: Azure disconnected reject publicIP setting 2020-11-17 02:25:11 UTC
Github openshift machine-api-operator pull 749 None open Bug 1889620: Warn MachineSet with publicIp set in disconnected install 2020-11-17 02:24:50 UTC

Description Milind Yadav 2020-10-20 08:44:52 UTC
Description: Machines not scaling when publicIP:true in disconnected Azure environment

4.6.0-0.nightly-2020-10-15-121733

Always Reproducible:

Steps:
1. Copy a valid machineset and set publicIP: true in machineset.yaml, then create it.
The machineset is created successfully.

2. Scale the machineset.
Expected: The machineset should scale successfully.
Actual: The machines move to the Failed state with the error below:

message: 'failed to create nic machineset-clone-27609-4ck7t-nic for machine machineset-clone-27609-4ck7t: unable to create VM network interface: failed to create network interface machineset-clone-27609-4ck7t-nic in resource group qeci-9755-t59xc-rg: network.InterfacesClient#CreateOrUpdate: Failure sending request: StatusCode=400 – Original Error: Code="NicWithPublicIpCannotReferencePoolWithOutboundRule" Message="OutboundRules for VMs with public IpConfigurations (instance level publicIPs) /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/qeci-9755-t59xc-rg/providers/Microsoft.Network/networkInterfaces/machineset-clone-27609-4ck7t-nic/ipConfigurations/pipConfig are not supported
 
Additional Info: This happens only on Azure disconnected private clusters. Other configurations, for example AWS and GCP disconnected private clusters, work as expected.
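
For anyone reproducing this, step 1 (copy a valid machineset and enable the public IP) can be sketched roughly as below. This is only an illustration operating on a manifest already loaded into a dict. The helper name `clone_with_public_ip` and the trimmed sample manifest are made up for the example, and the `publicIP` location under `providerSpec.value` is assumed from the Azure provider spec rather than taken from this report.

```python
import copy

# Label that ties a MachineSet's selector to the Machines it owns.
MACHINESET_LABEL = "machine.openshift.io/cluster-api-machineset"


def clone_with_public_ip(machineset, new_name):
    """Return a renamed copy of a MachineSet manifest with publicIP enabled.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    clone = copy.deepcopy(machineset)
    clone.pop("status", None)  # status is populated by the server
    old_meta = clone["metadata"]
    # Keep only fields that are safe to resubmit (drops uid, resourceVersion, ...).
    clone["metadata"] = {
        "name": new_name,
        "namespace": old_meta.get("namespace", "openshift-machine-api"),
        "labels": old_meta.get("labels", {}),
    }
    spec = clone["spec"]
    spec["replicas"] = 0  # create first, scale afterwards (step 2)
    spec["selector"]["matchLabels"][MACHINESET_LABEL] = new_name
    template = spec["template"]
    template["metadata"]["labels"][MACHINESET_LABEL] = new_name
    # Assumed field location for the Azure provider spec.
    template["spec"]["providerSpec"]["value"]["publicIP"] = True
    return clone


# Trimmed, made-up manifest used only to exercise the helper.
original = {
    "apiVersion": "machine.openshift.io/v1beta1",
    "kind": "MachineSet",
    "metadata": {"name": "worker-a", "namespace": "openshift-machine-api", "uid": "abc"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {MACHINESET_LABEL: "worker-a"}},
        "template": {
            "metadata": {"labels": {MACHINESET_LABEL: "worker-a"}},
            "spec": {"providerSpec": {"value": {"publicIP": False}}},
        },
    },
}
clone = clone_with_public_ip(original, "machineset-clone")
```

The clone starts at zero replicas so that creation and scaling stay separate steps, matching the reproduction above.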

Comment 1 Joel Speed 2020-10-20 09:14:18 UTC
This seems like a reasonable case to add a validation in the webhooks that prevents users from committing this configuration.

I don't think I've ever set up a disconnected Azure install, and I'm not sure we have the ability to do that. Perhaps @miyadav@redhat.com can suggest how we can do it?

Is there anything in the cluster that indicates a disconnected install, which we could check in order to inform the user of this misconfiguration?

Comment 3 Danil Grigorev 2020-11-02 11:44:54 UTC
Hey @miyadav@redhat.com. I can see the template, but the job was destroyed. Could you describe the main configuration differences between a standard and a disconnected IPI install on Azure? I can't find a documented path to install such a cluster. Does the disconnected installation set anything specific on the Infrastructure resource? Otherwise it would probably not be simple to identify this kind of setup. Is it enough to set `publish: Internal` in the Install Config to get such a cluster? Or does that only disable provisioning of public IPs for Routes?

Comment 4 Danil Grigorev 2020-11-02 12:17:00 UTC
Looking at the flexy template, the disconnected install always has a proxy setting. Could you please provide example Proxy, Infrastructure and Network resources from the disconnected cluster, if they look any different from the default configuration?

Comment 6 Danil Grigorev 2020-11-09 12:06:41 UTC
Looking into the cluster resources, there is one that differentiates a disconnected install from a basic cluster: the DNS config.

```
apiVersion: config.openshift.io/v1
kind: DNS
metadata:
  name: cluster
spec:
  baseDomain: ci-ln-phy0rrb-002ac.ci.azure.devcluster.openshift.com
  privateZone:
    id: /subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a/resourceGroups/ci-ln-phy0rrb-002ac-68m5d-rg/providers/Microsoft.Network/privateDnsZones/ci-ln-phy0rrb-002ac.ci.azure.devcluster.openshift.com
  publicZone: # <- this field is absent in a disconnected install
    id: /subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a/resourceGroups/os4-common/providers/Microsoft.Network/dnszones/ci.azure.devcluster.openshift.com
status: {}
```

The fix for this BZ is currently blocked on the implementation of resource checks in webhooks (https://github.com/openshift/machine-api-operator/pull/673). @jspeed@redhat.com, we need to merge that first.
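
To make the proposed check concrete, here is a minimal sketch of the validation logic, written in Python for brevity. It is not the actual webhook code from machine-api-operator. It assumes the DNS `cluster` resource and the Azure provider spec have already been loaded into dicts, and the function names `is_disconnected_azure` and `validate_public_ip` are hypothetical. The heuristic is the one described in this comment: a missing `spec.publicZone` is treated as a disconnected install.

```python
FORBIDDEN_MSG = (
    "providerSpec.publicIP: Forbidden: "
    "publicIP is not allowed in Azure disconnected installation"
)


def is_disconnected_azure(dns_config):
    """Heuristic from this comment: no spec.publicZone on the DNS 'cluster' resource."""
    spec = dns_config.get("spec") or {}
    return not spec.get("publicZone")


def validate_public_ip(provider_spec, dns_config):
    """Return a list of validation errors (empty when the spec is acceptable)."""
    errors = []
    if provider_spec.get("publicIP") and is_disconnected_azure(dns_config):
        errors.append(FORBIDDEN_MSG)
    return errors


# Made-up inputs: the zone ids are placeholders, not real resources.
connected_dns = {
    "spec": {
        "baseDomain": "example.com",
        "privateZone": {"id": "private-zone-id"},
        "publicZone": {"id": "public-zone-id"},
    }
}
disconnected_dns = {
    "spec": {
        "baseDomain": "example.com",
        "privateZone": {"id": "private-zone-id"},
    }
}

ok = validate_public_ip({"publicIP": True}, connected_dns)
denied = validate_public_ip({"publicIP": True}, disconnected_dns)
```

Returning a list of errors instead of raising keeps the check easy to combine with other validations. Whether the real webhook is structured this way is an assumption.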

Comment 8 Milind Yadav 2020-11-19 11:04:41 UTC
Validated at:

4.7.0-0.nightly-2020-11-18-203317

Steps:

Create a machineset with publicIP: true

Result:

      Error from server (providerSpec.publicIP: Forbidden: publicIP is not allowed in Azure disconnected installation): error when creating "STDIN": admission webhook "validation.machineset.machine.openshift.io" denied the request: providerSpec.publicIP: Forbidden: publicIP is not allowed in Azure disconnected installation


Additional Info:
https://mastern-jenkins-csb-openshift-qe.cloud.paas.psi.redhat.com/job/Runner-v3/209885/console

Moving to VERIFIED

