Bug 1613546
| Summary: | Azure load balancer cannot be created | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Ryan Cook <rcook> |
| Component: | Installer | Assignee: | Ryan Cook <rcook> |
| Status: | CLOSED ERRATA | QA Contact: | Wenkai Shi <weshi> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.10.0 | CC: | akostadi, aos-bugs, jokerman, mmccomas, rcook, weshi |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | 3.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Without a fully defined azure.conf file, when a load balancer service was requested through OpenShift, the load balancer would never fully register and provide the external IP address. Now, an azure.conf with all the required variables allows the load balancer to be deployed and to provide the external IP address. | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 1615903 (view as bug list) | Environment: | |
| Last Closed: | 2018-10-11 07:24:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1615903 | | |
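The fix hinges on a fully populated azure.conf. As a rough sketch, assuming the key names from the Kubernetes Azure cloud-provider config document referenced in the comments below (every value here is a placeholder, not taken from this bug):

```
{
  "cloud": "AzurePublicCloud",
  "tenantId": "<tenant-id>",
  "subscriptionId": "<subscription-id>",
  "aadClientId": "<service-principal-id>",
  "aadClientSecret": "<service-principal-secret>",
  "resourceGroup": "<resource-group>",
  "location": "<region>",
  "vnetName": "<vnet-name>",
  "securityGroupName": "<node-security-group>",
  "primaryAvailabilitySetName": "<availability-set-for-lb-nodes>"
}
```

The cloud, securityGroupName, and primaryAvailabilitySetName keys are the parameters debated in the discussion below: without them the cloud provider cannot attach a requested load balancer to the node NICs.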
Description
Ryan Cook
2018-08-07 20:02:09 UTC
The PR does the needful.

Should be in openshift-ansible-3.11.0-0.15.0.

Without those added parameters, OCP on Azure also works well. QE doesn't think those parameters are a "must have". Could you please make them "optional" parameters, as the doc [1] says?

[1]. https://docs.openshift.com/container-platform/3.10/install_config/configuring_azure.html#azure-configuration-file

Another doc that explains those parameters: https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#cluster-config

Could you please help me understand primaryAvailabilitySetName? I can't understand it from here [1]; would you mind giving some examples?

[1]. https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#primaryavailabilitysetname

Also, the parameters are optional for Azure itself. When I create VMs in Azure, there might be no security groups and no availability sets, and the machines start and work just fine. As for cloud name, I didn't know such a thing even existed. I assume we can default to whatever Azure uses as its default cloud; then only users hooked to other clouds will need to know what to put there.

@Wenkai, with primaryAvailabilitySetName you are defining which set of nodes the load balancer should be assigned to. With the current limitations of the Azure cloud provider, if a primaryAvailabilitySetName is not defined and any other load balancers are used by instances in the cluster, the load balancer will error out, because there is a limit on the number of internal and external load balancers that can be assigned to a set of instances.

@Aleksandar, I believe they are required in raw Kubernetes. I do agree that you can get away with them being optional with no security groups and availability sets, but the functionality isn't complete. In regards to cloud name, there are a few options, such as govcloud, etc.

Ryan, could you clarify how exactly machines should be put into availability sets? Create one set and use it always? One set for each cluster? Something else? Another question is why a security group is needed. In any case, we need documentation about how exactly to create availability sets and security groups to get complete functionality out of the cluster.

Machines have to be added to the availability set at launch. When talking with Harold from Microsoft, we both agreed, for the architectures we were suggesting, on one availability set per machine type: one for masters, one for infra, one for apps, so three per cluster. The security group, I believe, is just assigned to the load balancer and updated with any rules required for the load balancer to access those nodes. Sadly, machines cannot be added to availability sets once created, but security groups can be added and removed as needed.

Then having only one availability set parameter doesn't make sense, if we create three new availability sets per cluster? If we create three sets, then OpenShift must know about them so that the cluster can later be scaled up. I know this is not an immediate feature that we will have, but given existing plans this appears to be something we will have to support at some point. My suggestion would be to at the very least rename `primaryAvailabilitySetName` to `lbAvailabilitySetName` or `infraAvailabilitySetName`, so that we don't need to rename the setting later on and so it is clearer what `primary` means. Excuse me if I'm missing the original point of this. As for security groups, "required rules for the load balancer to access those nodes" is not clear here. We need more specific instructions on exactly what rules will need to be set.
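A minimal az CLI sketch of the three-sets-per-cluster layout recommended above (the resource group and set names mirror the refarch-azr examples later in this bug and are illustrative only):

```
# One availability set per machine type; the sets must exist before the VMs
# are launched, since a VM can only join a set at creation time.
az vm availability-set create --resource-group refarch-azr --name ocp-master-instances
az vm availability-set create --resource-group refarch-azr --name ocp-infra-instances
az vm availability-set create --resource-group refarch-azr --name ocp-app-instances
```

azure.conf's primaryAvailabilitySetName would then name whichever of these sets should receive service load balancers.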
So this is a Kubernetes feature rather than an OpenShift one. If an availability set isn't specified, everything is in one set, and a load balancer already exists, then a Kubernetes LoadBalancer service cannot be created, due to this error:

```
E0807 19:23:42.070640 1 service_controller.go:219] error processing service test/egress-2 (will retry): failed to ensure load balancer for service test/egress-2: [ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-1VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. Only one internal and one public load balancer are allowed per availability set.\" Details=[]",

ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-3VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. Only one internal and one public load balancer are allowed per availability set.\" Details=[]",

ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-2VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. Only one internal and one public load balancer are allowed per availability set.\" Details=[]"]
```

I am basing these variable names specifically on what Kubernetes expects. I think staying as close to Kubernetes as possible lets other engineers who pick this up understand what we are actually doing. I can verify the security group items tomorrow if you need me to investigate. Like I said, I am just basing this on my experience: if all of these conditions are not met, a LoadBalancer service cannot be created. I agree that, to an extent, these variables could be considered optional, but you pretty much have to define them anyway when you do a deployment, so we might as well make them mandatory. The only reason I am pushing for this is that I found out the hard way, when trying to use an external load balancer for the Summit demo, that these items were required. Here are the Kubernetes variables: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure.go#L98

I've created a PR to make those parameters optional: https://github.com/openshift/openshift-ansible/pull/9789

(In reply to Wenkai Shi from comment #6)
> Could you please help understand primaryAvailabilitySetName? I can't
> understand from here[1], would you mind give some examples?
>
> [1]. https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#primaryavailabilitysetname

```
# VM Create
az vm availability-set create \
  --resource-group refarch-azr \
  --name ocp-master-instances
```

It's just a value that is placed on a set of launched instances. The above is an example of creating an availability set for masters. The master instances are then launched into the availability set like this:

```
az vm create \
  --resource-group refarch-azr \
  --name ocp-master-$i \
  --availability-set ocp-master-instances \
  --size Standard_E2S_v3 \
  --image RedHat:RHEL:7-RAW:latest \
  --admin-user cloud-user \
  --ssh-key /var/lib/jenkins/.ssh/id_rsa.pub \
  --data-disk-sizes-gb 32 32 32 \
  --no-wait \
  --nics ocp-master-${i}VMNic;
```

PR from comment 19 merged. In openshift-ansible-3.11.0-0.25.0.

Verified with version openshift-ansible-3.11.0-0.25.0.git.0.7497e69.el7; it works well.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
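A fix like this can be exercised by requesting a LoadBalancer service and watching whether it leaves the pending state, which is the behavior the Doc Text describes. A hedged sketch, assuming a sample app; all names are placeholders, not commands taken from this bug:

```
# Deploy a sample app and expose it through a cloud load balancer.
oc new-app --name=hello openshift/hello-openshift
oc expose dc/hello --type=LoadBalancer --name=hello-lb --port=8080

# With a fully populated azure.conf, the EXTERNAL-IP column should move
# from <pending> to a real address; before the fix it never did.
oc get svc hello-lb -w
```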