Description of problem:
When a svc is created with a given (pre-created) FIP, the FIP is ignored: it is not assigned to the LB and the service is not reachable externally. For example:

  $ oc expose deployment test1-dep --name test1-svc --type=LoadBalancer --port 80 --target-port=8080 --load-balancer-ip 10.0.0.254

Setting the --load-balancer-ip parameter should assign the floating IP to the service LB [1], but it is being ignored.

Version-Release number of selected component (if applicable):
OCP 4.9.0-0.nightly-2021-08-25-010624
OSP 16.1.6 GA (RHOS-16.1-RHEL-8-20210604.n.0)

How reproducible: always

Steps to Reproduce:
1. Install OCP 4.9 on OSP.

2. Create a floating IP in OSP for later assignment:

  openstack floating ip create <external network>

3. Create a new project and a deployment in OCP:

  oc new-project test
  oc create deployment test1-dep --image=quay.io/kuryr/demo

4. Create a service setting the desired LB IP address:

  oc expose deployment test1-dep --name test1-svc --type=LoadBalancer --port 80 --target-port=8080 --load-balancer-ip <precreated fip>

5. Check that a LB is created in OSP (wait until it's in ACTIVE status):

  openstack loadbalancer list
  +--------------------------------------+----------------------------------+----------------------------------+--------------+---------------------+----------+
  | id                                   | name                             | project_id                       | vip_address  | provisioning_status | provider |
  +--------------------------------------+----------------------------------+----------------------------------+--------------+---------------------+----------+
  | a2a0f053-7539-4758-bcdf-a33773e8fbd6 | a0f27ca88bdee4028ac6e288c76efec1 | c0316b3530e64b909f9451a857b404d0 | 10.196.2.216 | ACTIVE              | amphora  |
  +--------------------------------------+----------------------------------+----------------------------------+--------------+---------------------+----------+

6. Check the svc in OCP:

  oc get svc
  NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
  test1-svc   LoadBalancer   172.30.172.19   <pending>     80:31624/TCP   39m

The EXTERNAL-IP is pending (the FIP should be assigned instead).

  oc get svc -o yaml
  apiVersion: v1
  items:
  - apiVersion: v1
    kind: Service
    metadata:
      creationTimestamp: "2021-08-25T15:35:15Z"
      finalizers:
      - service.kubernetes.io/load-balancer-cleanup
      labels:
        app: test1-dep
      name: test1-svc
      namespace: test
      resourceVersion: "192336"
      uid: 0f27ca88-bdee-4028-ac6e-288c76efec1c
    spec:
      allocateLoadBalancerNodePorts: true
      clusterIP: 172.30.172.19
      clusterIPs:
      - 172.30.172.19
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      loadBalancerIP: <precreated fip>   <<<<<-------
      ports:
      - nodePort: 31624
        port: 80
        protocol: TCP
        targetPort: 8080
      selector:
        app: test1-dep
      sessionAffinity: None
      type: LoadBalancer
    status:
      loadBalancer: {}   <<<<<-------
  kind: List
  metadata:
    resourceVersion: ""
    selfLink: ""

Actual results:
The loadBalancerIP has been added to the spec section but not to the status section, which is empty:

  ...
  status:
    loadBalancer: {}
  ...

Expected results:
FIP added to the status section:

  ...
  status:
    loadBalancer:
      ingress:
      - ip: <precreated fip>
  ...

FIP shown as the EXTERNAL-IP:

  oc get svc
  NAME        TYPE           CLUSTER-IP      EXTERNAL-IP        PORT(S)        AGE
  test1-svc   LoadBalancer   172.30.172.19   <precreated fip>   80:31624/TCP   39m

Additional info:
It can be tested by creating the following resource directly:

  $ cat svc_resource.yaml
  apiVersion: v1
  kind: Service
  metadata:
    labels:
      run: demo
    name: demo
    namespace: test
  spec:
    ports:
    - port: 80
      protocol: TCP
      targetPort: 8080
    selector:
      run: demo
    type: LoadBalancer
    loadBalancerIP: <precreated fip>

[1] https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
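The difference between the actual and expected results can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, and it assumes the Service has already been loaded into a Python dict (for example from the output of `oc get svc test1-svc -o json`). It only compares spec.loadBalancerIP with the entries under status.loadBalancer.ingress.

```python
from typing import Any, Dict, List


def assigned_ingress_ips(service: Dict[str, Any]) -> List[str]:
    """Return the IPs listed under status.loadBalancer.ingress (may be empty)."""
    load_balancer = (service.get("status") or {}).get("loadBalancer") or {}
    return [entry["ip"] for entry in load_balancer.get("ingress") or [] if "ip" in entry]


def requested_ip_assigned(service: Dict[str, Any]) -> bool:
    """True when spec.loadBalancerIP also appears in the status section."""
    requested = (service.get("spec") or {}).get("loadBalancerIP")
    return requested is not None and requested in assigned_ingress_ips(service)


# Shape of the Service from this report: IP requested in spec, status empty.
actual = {
    "spec": {"type": "LoadBalancer", "loadBalancerIP": "10.0.0.254"},
    "status": {"loadBalancer": {}},
}
# Shape of the expected result: the same IP reported as ingress.
expected = {
    "spec": {"type": "LoadBalancer", "loadBalancerIP": "10.0.0.254"},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.254"}]}},
}

print(requested_ip_assigned(actual))    # False
print(requested_ip_assigned(expected))  # True
```

The check deliberately ignores hostname-only ingress entries, since this report is only about IP assignment.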
Just to clarify, the load balancer is not just not being updated, but also is not being created correctly, right?
Why would you want to pre-create a FIP in OpenStack? What's the use case there?

I'm almost sure the problem is that cloud provider attempts to create the FIP with the given address and fails because the address is already taken (we can confirm that by looking at events in [1]). This might totally be considered an expected behavior - external networks might be shared and you don't want to mess with stuff created by others. Also cloud provider doesn't do FIP tagging and we're supposed to support running multiple clusters on a single OpenStack project. Implementing the behavior you described we could end up with clusters "stealing" FIPs from one another.

[1] `oc -n test describe svc demo`
(In reply to egarcia from comment #2)
> Just to clarify, the load balancer is not just not being updated, but also
> is not being created correctly, right?

The load balancer is being created correctly, but the svc is not being assigned the external IP (floating IP).
(In reply to Michał Dulko from comment #3)
> Why would you want to pre-create a FIP in OpenStack? What's the use case
> there? I'm almost sure the problem is that cloud provider attempts to create
> the FIP with the given address and fails because the address is already
> taken (we can confirm that by looking at events in [1]). This might totally
> be considered an expected behavior - external networks might be shared and
> you don't want to mess with stuff created by others. Also cloud provider
> doesn't do FIP tagging and we're supposed to support running multiple
> clusters on a single OpenStack project. Implementing the behavior you
> described we could end up with clusters "stealing" FIPs from one another.
>
> [1] `oc -n test describe svc demo`

That was a use case supported in Kuryr if I'm not wrong, please see the related BZs [1] and [2]. I don't have a cluster with Kuryr atm, but that's how it used to work. As it depends on the cloud provider implementation [3], I guess it's up to us to decide if we want to support it.
The events in the svc confirm your suspicion about the issue:

Events:
  Type     Reason                  Age                    From                Message
  ----     ------                  ----                   ----                -------
  Warning  SyncLoadBalancerFailed  3h3m (x149 over 15h)   service-controller  Error syncing load balancer: failed to ensure load balancer: error creating LB floatingip {Description: FloatingNetworkID:4634cf2c-056f-4dee-98de-6b4e68b7af5b FloatingIP:x.x.x.x PortID:2ebaae02-43d0-4adc-bca1-9fc32b52df99 FixedIP: SubnetID: TenantID: ProjectID:}: Request forbidden: [POST https://10.0.0.101:13696/v2.0/floatingips], error message: {"NeutronError": {"type": "PolicyNotAuthorized", "message": "(rule:create_floatingip and rule:create_floatingip:floating_ip_address) is disallowed by policy", "detail": ""}}
  Normal   EnsuringLoadBalancer    2m44s (x185 over 15h)  service-controller  Ensuring load balancer

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1875352
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1503963#c21
[3] https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
Hm, I see. I'm not 100% sure this should be considered supported behavior in Kuryr, as it's subject to the risks I mentioned. I'd rather expect Kuryr to create the FIP instead.

But let's focus on the cloud provider here. The error listed in `oc describe svc` does not back my suspicion; it's a policy error. Apparently, by default the OSP policy does not allow non-admin users to create FIPs with a specified IP. Can you try it without pre-creating the FIP? I think we'll see the same error, and I'd consider that a bug we need to document. It's also worth changing the policy to allow it and retrying the test with a pre-created FIP.
I've tried setting a FIP that I hadn't created previously:

  oc expose deployment test1-dep --name test1-svc --type=LoadBalancer --port 80 --target-port=8080 --load-balancer-ip 10.0.0.111

and it shows the same policy error:

Events:
  Type     Reason                  Age                 From                Message
  ----     ------                  ----                ----                -------
  Normal   EnsuringLoadBalancer    60s (x10 over 23m)  service-controller  Ensuring load balancer
  Warning  SyncLoadBalancerFailed  59s (x10 over 21m)  service-controller  Error syncing load balancer: failed to ensure load balancer: error creating LB floatingip {Description: FloatingNetworkID:4634cf2c-056f-4dee-98de-6b4e68b7af5b FloatingIP:10.0.0.111 PortID:868389a8-92a4-4760-83cd-dd5e26171b74 FixedIP: SubnetID: TenantID: ProjectID:}: Request forbidden: [POST https://10.0.0.101:13696/v2.0/floatingips], error message: {"NeutronError": {"type": "PolicyNotAuthorized", "message": "(rule:create_floatingip and rule:create_floatingip:floating_ip_address) is disallowed by policy", "detail": ""}}

So it's failing in both cases (1. with a pre-created FIP and 2. without a pre-created FIP). It seems the neutron policy doesn't allow creating a FIP with a specific IP address (which, on the other hand, wouldn't be necessary for case 1, as the FIP is already created).

I've tried it from the CLI as well (as a non-admin user) and got the same policy error:

  $ openstack floating ip create --port 868389a8-92a4-4760-83cd-dd5e26171b74 --floating-ip-address 10.0.0.111 nova
  Error while executing command: HttpException: 403, (rule:create_floatingip and rule:create_floatingip:floating_ip_address) is disallowed by policy

It works as an admin user, though. I can create FIPs as a non-admin user if I don't specify any IP value:

  $ openstack floating ip create nova

So the policy only restricts creating a FIP with a specific IP address.
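For anyone triaging similar events, the denied policy rules can be pulled out of the event text. This is only a sketch under stated assumptions: the function names are hypothetical, the request struct in the sample message is abbreviated to `{...}`, and it assumes the Neutron response body follows the literal text "error message: " and runs to the end of the message, as it does in the events quoted in this report.

```python
import json
import re
from typing import Dict, List, Optional

# Assumption: the Neutron response body follows this marker and runs to the
# end of the event message, as in the events quoted in this report.
MARKER = "error message: "


def parse_neutron_error(event_message: str) -> Optional[Dict[str, str]]:
    """Return the NeutronError object embedded in an event message, if any."""
    start = event_message.find(MARKER)
    if start == -1:
        return None
    try:
        body = json.loads(event_message[start + len(MARKER):])
    except json.JSONDecodeError:
        return None
    error = body.get("NeutronError") if isinstance(body, dict) else None
    return error if isinstance(error, dict) else None


def denied_rules(policy_message: str) -> List[str]:
    """List the policy rule names mentioned in a PolicyNotAuthorized message."""
    return re.findall(r"rule:([\w:]+)", policy_message)


# Sample built from the event in this report; the request struct is abbreviated.
EVENT_MESSAGE = (
    "Error syncing load balancer: failed to ensure load balancer: "
    "error creating LB floatingip {...}: Request forbidden: "
    "[POST https://10.0.0.101:13696/v2.0/floatingips], error message: "
    '{"NeutronError": {"type": "PolicyNotAuthorized", "message": '
    '"(rule:create_floatingip and rule:create_floatingip:floating_ip_address) '
    'is disallowed by policy", "detail": ""}}'
)

error = parse_neutron_error(EVENT_MESSAGE)
if error is not None:
    print(error["type"])
    print(denied_rules(error["message"]))
```

Seeing `create_floatingip:floating_ip_address` among the rules points at the address-specific policy rather than at FIP creation in general.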
From the OpenStack neutron docs [1]:

create_floatingip
  Default:     (role:admin and system_scope:all) or (role:member and project_id:%(project_id)s)
  Operations:  POST /floatingips
  Scope Types: system, project
  Create a floating IP

create_floatingip:floating_ip_address
  Default:     role:admin and system_scope:all
  Operations:  POST /floatingips
  Scope Types: system, project
  Create a floating IP with a specific IP address

The second rule is only satisfied by the admin user. I haven't been able to find out how to change the policy to allow it for non-admin users.

[1] https://docs.openstack.org/neutron/latest/configuration/policy.html
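The combined effect of the two quoted defaults can be illustrated with a small hand-written model. This is not oslo.policy and not how neutron evaluates rules internally; the class and function names are hypothetical, and the logic only mirrors the two default expressions quoted from the docs. It assumes both rules are checked when an address is supplied, which is what the "(rule:create_floatingip and rule:create_floatingip:floating_ip_address)" error suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credentials:
    """Hypothetical, simplified view of the caller's token."""
    role: str
    project_id: str
    system_scope: Optional[str] = None


def is_system_admin(creds: Credentials) -> bool:
    # Mirrors "role:admin and system_scope:all".
    return creds.role == "admin" and creds.system_scope == "all"


def may_create_floatingip(creds: Credentials, target_project_id: str) -> bool:
    # Mirrors the quoted default for create_floatingip.
    is_project_member = creds.role == "member" and creds.project_id == target_project_id
    return is_system_admin(creds) or is_project_member


def may_set_floating_ip_address(creds: Credentials) -> bool:
    # Mirrors the quoted default for create_floatingip:floating_ip_address.
    return is_system_admin(creds)


def may_create_fip(creds: Credentials, target_project_id: str,
                   floating_ip_address: Optional[str] = None) -> bool:
    """Assumed combination: the address rule applies only when an address is given."""
    allowed = may_create_floatingip(creds, target_project_id)
    if floating_ip_address is not None:
        allowed = allowed and may_set_floating_ip_address(creds)
    return allowed


member = Credentials(role="member", project_id="p1")
admin = Credentials(role="admin", project_id="p1", system_scope="all")

print(may_create_fip(member, "p1"))                # True
print(may_create_fip(member, "p1", "10.0.0.111"))  # False
print(may_create_fip(admin, "p1", "10.0.0.111"))   # True
```

Under this model a project member can allocate a FIP but cannot choose its address, which matches the CLI results reported in the previous comment.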
Alright, so that's clearly a bug that'd require at least docs update.
+1 I think the best way forward is to document it as a known issue.
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing
Verified that the issue described in this BZ has been documented at https://github.com/openshift/installer/blob/master/docs/user/openstack/known-issues.md#limitations-of-creating-external-load-balancers-using-pre-defined-fips
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069