Description of problem:
In Azure, a LoadBalancer service cannot be created because the cloud provider configuration is missing required parameters.

Version-Release number of selected component (if applicable):
atomic-openshift-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
atomic-openshift-hyperkube-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-node-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-clients-3.10.14-1.git.0.ba8ae6d.el7.x86_64
atomic-openshift-docker-excluder-3.10.14-1.git.0.ba8ae6d.el7.noarch
atomic-openshift-3.10.14-1.git.0.ba8ae6d.el7.x86_64

How reproducible:
Perform the installation with the following parameters in the inventory, adjusted to your environment:

#310
openshift_cloudprovider_kind=azure
openshift_cloudprovider_azure_client_id=ID
openshift_cloudprovider_azure_client_secret=SECRET
openshift_cloudprovider_azure_tenant_id=TENANT
openshift_cloudprovider_azure_subscription_id=SUB
openshift_cloudprovider_azure_resource_group=refarch-azr
openshift_cloudprovider_azure_location=eastus
openshift_release=v3.10
#310end

Steps to Reproduce:
1. Deploy 3.10 with the above values.
2. Create a LoadBalancer service:

vi load.yaml

apiVersion: v1
kind: Service
metadata:
  name: egress-21
spec:
  ports:
  - name: web
    port: 8080
  type: LoadBalancer
  selector:
    deploymentconfig: app

oc create -f load.yaml

Actual results:
The load balancer stays in the pending state:

egress-21   LoadBalancer   172.30.245.15   <pending>        8080:32566/TCP   7m

Expected results:

egress-21   LoadBalancer   172.30.245.15   137.117.45.245   8080:32566/TCP   7m

Additional info:
The required values are actually located here:
https://docs.openshift.com/container-platform/3.10/install_config/configuring_azure.html#azure-configuration-file
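For reference, the cloud provider configuration file that the linked documentation describes looks roughly like the sketch below. The key names are taken from the linked docs and the kubernetes cloud-provider config reference; all values are placeholders, and the file path is an assumption based on where openshift-ansible normally writes the cloud provider config. The last four keys (cloud, vnetName, securityGroupName, primaryAvailabilitySetName) are the ones the inventory above never sets.

# /etc/origin/cloudprovider/azure.conf (sketch, placeholder values)
tenantId: TENANT
subscriptionId: SUB
aadClientId: ID
aadClientSecret: SECRET
resourceGroup: refarch-azr
location: eastus
cloud: AzurePublicCloud
vnetName: refarch-azr-vnet
securityGroupName: node-secgrp
primaryAvailabilitySetName: ocp-infra-instances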
TESTING https://github.com/openshift/openshift-ansible/pull/9473
The PR adds the missing configuration parameters.
Should be in openshift-ansible-3.11.0-0.15.0
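With that change in place, the inventory can carry the additional settings. The variable names below are illustrative only, following the existing openshift_cloudprovider_azure_* pattern; confirm the exact names against the PR before using them.

# Hypothetical inventory additions (names assumed, values are placeholders)
openshift_cloudprovider_azure_vnet_name=refarch-azr-vnet
openshift_cloudprovider_azure_security_group_name=node-secgrp
openshift_cloudprovider_azure_availability_set_name=ocp-infra-instances
openshift_cloudprovider_azure_cloud=AzurePublicCloud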
Without those added parameters, OCP on Azure also works well. QE doesn't think those parameters are a "must have". Could you please make them optional parameters, as the doc [1] says? [1]. https://docs.openshift.com/container-platform/3.10/install_config/configuring_azure.html#azure-configuration-file
Another doc that explains those parameters: https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#cluster-config
Could you please help me understand primaryAvailabilitySetName? I can't understand it from here [1]; would you mind giving some examples? [1]. https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#primaryavailabilitysetname
Also, these parameters are optional for Azure itself. When I create VMs in Azure there may be no security groups or availability sets, and the machines start and work just fine. Regarding the cloud name, I didn't know such a thing even existed. I assume we can default to whatever Azure uses as its default cloud; then only users attached to other clouds will need to know what to put there.
@Wenkai, with primaryAvailabilitySetName you define which set of nodes the load balancer should be assigned to. With the current limitations of the Azure cloud provider, if a primary availability set is not defined and other load balancers are already attached to instances in the cluster, the load balancer will error out, because there is a limit on the number of internal and public load balancers that can be assigned to a set of instances.
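If it helps while debugging, something like the following az CLI calls (using the resource group, VM and NIC names from this bug as placeholders) show which availability set a node belongs to and which load balancer backend pools its NIC already references:

# Which availability set the node is in
az vm show --resource-group refarch-azr --name ocp-infra-1 --query "availabilitySet.id"

# Which load balancer backend pools the node's NIC already references
az network nic show --resource-group refarch-azr --name ocp-infra-1VMNic \
  --query "ipConfigurations[].loadBalancerBackendAddressPools[].id"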
@Aleksandar, I believe they are required in raw Kubernetes. I do agree that you can get away with them being optional, with no security groups or availability sets, but then the functionality isn't complete. Regarding the cloud name, there are a few options, such as the government cloud, etc.
Ryan, could you clarify how exactly machines should be put into availability sets? Create one set and always use it? One set per cluster? Something else? Another question: why is a security group needed? In any case, we need documentation on exactly how to create availability sets and security groups to get complete cluster functionality.
Machines have to be added to the availability set at launch. When talking with Harold from Microsoft, we agreed, for the architectures we were suggesting, on one availability set per machine type:

1 for masters
1 for infra
1 for apps

So 3 per cluster. The security group, I believe, is just assigned to the load balancer so that any rules required for the load balancer to reach those nodes can be updated. Sadly, machines cannot be added to availability sets once they are created, but security groups can be added and removed as needed.
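As a sketch, the three sets per cluster could be created like this (set names are illustrative, resource group taken from the inventory above):

az vm availability-set create --resource-group refarch-azr --name ocp-master-instances
az vm availability-set create --resource-group refarch-azr --name ocp-infra-instances
az vm availability-set create --resource-group refarch-azr --name ocp-app-instances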
Then having only one availability set parameter doesn't make sense, does it? I mean, if we create 3 new availability sets per cluster. If we create 3 sets, then OpenShift needs to know about all of them so the cluster can later be scaled up. I know this is not an immediate feature we will have, but given the existing plans this appears to be something we will have to support at some point. My suggestion would be, at the very least, to rename `primaryAvailabilitySetName` to `lbAvailabilitySetName` or `infraAvailabilitySetName`, so that we don't need to rename the setting later and so it is clearer what `primary` means. Excuse me if I'm missing the original point of this. Regarding security groups, "required rules for the load balancer to access those nodes" is not clear here; we need more specific instructions on exactly which rules will need to be set.
So this is a kubernetes feature rather than OpenShift. So if an availability set isn't specified and everything is in one set and a load balancer already exists then a kubernetes svc load balancer cannot be created. Due to this error E0807 19:23:42.070640 1 service_controller.go:219] error processing service test/egress-2 (will retry): failed to ensure load balancer for service test/egress-2: [ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-1VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. Only one internal and one public load balancer are allowed per availability set.\" Details=[]", ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-3VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. Only one internal and one public load balancer are allowed per availability set.\" Details=[]", ensure(test/egress-2): backendPoolID(/subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes) - failed to ensure host in pool: "network.InterfacesClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"NetworkInterfaceUsesMultipleLoadBalancersOfSameType\" Message=\"Network interface /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/networkInterfaces/ocp-infra-2VMNic references more than one load balancer of the same type (internal or public): /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/OcpRouterLB, /subscriptions/8117c1c9-d10d-4366-86cc-e3ccaacaae2d/resourceGroups/refarch-azr/providers/Microsoft.Network/loadBalancers/kubernetes. 
Only one internal and one public load balancer are allowed per availability set.\" Details=[]"]

I am basing these variable names specifically on what kubernetes expects. I think staying as close to kubernetes as possible lets other engineers who pick this up know what we are actually doing. I can verify the security group items tomorrow if you need me to investigate; like I said, I am just basing this off of my experience. If all of these conditions are not met, a LoadBalancer svc cannot be created. I agree that, to an extent, these variables could be considered optional, but you pretty much have to define them anyway when you do a deployment, so we might as well make them mandatory. The only reason I am pushing for this is that I found out the hard way, when trying to use an external load balancer for the summit demo, that these items were required. Here are the kubernetes variables: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure.go#L98
I've created a PR to make those parameters optional: https://github.com/openshift/openshift-ansible/pull/9789
(In reply to Wenkai Shi from comment #6)
> Could you please help understand primaryAvailabilitySetName? I can't
> understand from here[1], would you mind give some examples?
>
> [1]. https://github.com/kubernetes/cloud-provider-azure/blob/master/docs/cloud-provider-config.md#primaryavailabilitysetname

# Availability set create
az vm availability-set create \
  --resource-group refarch-azr \
  --name ocp-master-instances

It's just a value that is placed on a set of launched instances. The above is an example of creating an availability set for masters. The master instances are then launched into the availability set like this:

az vm create \
  --resource-group refarch-azr \
  --name ocp-master-$i \
  --availability-set ocp-master-instances \
  --size Standard_E2S_v3 \
  --image RedHat:RHEL:7-RAW:latest \
  --admin-username cloud-user \
  --ssh-key-value /var/lib/jenkins/.ssh/id_rsa.pub \
  --data-disk-sizes-gb 32 32 32 \
  --no-wait \
  --nics ocp-master-${i}VMNic;
The PR from comment 19 has merged.
In openshift-ansible-3.11.0-0.25.0
Verified with version openshift-ansible-3.11.0-0.25.0.git.0.7497e69.el7; it works well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2652