Description of problem:
After service creation in OpenShift/K8s, a floating IP is assigned to the load balancer and it is reachable, but the value is not reflected in the externalIPs field of the OpenShift/K8s service metadata.

Version-Release number of selected component (if applicable):
openstack-kuryr-kubernetes-controller-0.4.3-1.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Make sure the external_svc_net param is set to the external network ID in the kuryr-config configmap (oc -n openshift-infra edit cm kuryr-config).
2. oc new-project test
3. oc run --image kuryr/demo demo
4. oc scale dc/demo --replicas=2
5. oc expose dc/demo --port 80 --target-port 8080 --type LoadBalancer
   service "demo" exposed
6. oc get svc
   NAME   TYPE           CLUSTER-IP      EXTERNAL-IP                     PORT(S)        AGE
   demo   LoadBalancer   172.30.230.80   172.29.117.222,172.29.117.222   80:32440/TCP   19s
7. Check that a floating IP has been assigned for the load balancer:
   (overcloud) [root@undercloud stack]# openstack floating ip list | grep 172.30.230.80
   | 03e9b5db-1792-4e57-9ece-f79cf8b5ad0c | 172.20.0.218 | 172.30.230.80 | 9d604dbe-e7e0-4e57-a85e-9dd0b4c9d49c | dbba197f-d28e-49be-9905-fde1fa67cd52 | 6c07532860e641989bacc5583275080a |
8. Check that the floating IP is reachable:
   (overcloud) [root@undercloud stack]# curl 172.20.0.218
   demo-1-5pn86: HELLO! I AM ALIVE!!!
9. Check the externalIPs field in the service metadata:
   [openshift@master-0 ~]$ oc get svc demo -o yaml
   apiVersion: v1
   kind: Service
   metadata:
     annotations:
       openstack.org/kuryr-lbaas-spec: '{"versioned_object.data": {"ip": "172.30.230.80", "lb_ip": null, "ports": [{"versioned_object.data": {"name": null, "port": 80, "protocol": "TCP"}, "versioned_object.name": "LBaaSPortSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}], "project_id": "6c07532860e641989bacc5583275080a", "security_groups_ids": ["41d0c3ad-ebcf-4c14-939f-651ec2c50fcf"], "subnet_id": "d0755040-7349-4966-8f7f-24989b0e2d56", "type": "LoadBalancer"}, "versioned_object.name": "LBaaSServiceSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}'
     creationTimestamp: 2018-05-07T12:51:45Z
     labels:
       run: demo
     name: demo
     namespace: test
     resourceVersion: "442095"
     selfLink: /api/v1/namespaces/test/services/demo
     uid: 663a45f1-51f5-11e8-bca5-fa163ec71097
   spec:
     clusterIP: 172.30.230.80
     externalIPs:
     - 172.29.117.222
     - 172.29.49.250
     externalTrafficPolicy: Cluster
     ports:
     - nodePort: 32440
       port: 80
       protocol: TCP
       targetPort: 8080
     selector:
       run: demo
     sessionAffinity: None
     type: LoadBalancer
   status:
     loadBalancer:
       ingress:
       - ip: 172.29.49.250

Actual results:
The service FIP (172.20.0.218) is not reflected in externalIPs.
The EXTERNAL-IP column of 'oc get svc' shows the same IP address twice:
   oc get svc
   NAME   TYPE           CLUSTER-IP      EXTERNAL-IP                     PORT(S)        AGE
   demo   LoadBalancer   172.30.230.80   172.29.117.222,172.29.117.222   80:32440/TCP   19s

Expected results:
The service FIP (172.20.0.218) should be listed in externalIPs.
The EXTERNAL-IP column of 'oc get svc' should not show the same IP address twice.

Additional info:
Upstream bug: https://bugs.launchpad.net/kuryr-kubernetes/+bug/1733576
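The expected result in this report can be checked mechanically. The following is a minimal Python sketch, not part of Kuryr or OpenShift: the helper name fip_visibility is hypothetical, and the sample dictionary is a trimmed copy of the Service shown in step 9 (as it would look from 'oc get svc demo -o json').

```python
def fip_visibility(service: dict, fip: str) -> dict:
    """Report where (if anywhere) the floating IP shows up in a Service object."""
    spec = service.get("spec", {})
    status = service.get("status", {})
    external_ips = spec.get("externalIPs") or []
    ingress = status.get("loadBalancer", {}).get("ingress") or []
    ingress_ips = [entry.get("ip") for entry in ingress]
    return {
        "in_external_ips": fip in external_ips,
        "in_ingress": fip in ingress_ips,
    }


# Trimmed copy of the Service from step 9 of the report.
service = {
    "spec": {
        "clusterIP": "172.30.230.80",
        "externalIPs": ["172.29.117.222", "172.29.49.250"],
        "type": "LoadBalancer",
    },
    "status": {"loadBalancer": {"ingress": [{"ip": "172.29.49.250"}]}},
}

result = fip_visibility(service, "172.20.0.218")
print(result)
```

For the data in this report both values are False, which matches the "Actual results" section: the FIP 172.20.0.218 appears neither in spec.externalIPs nor in status.loadBalancer.ingress.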
It seems that the Kuryr controller is doing its job:
1. Creates the LB.
2. Allocates a FIP from the external network.
3. Attaches the FIP to the LB VIP.
4. Updates the service with the FIP under the service object as follows:
   status:
     loadBalancer:
       ingress:
       - ip: 172.24.4.13

So the issue is that OpenShift also allocates an external IP (from the default pool 172.29.xx.xx) and overwrites the Kuryr details under status/loadBalancer/ingress/ip.

As a workaround, we can get the LB FIP from the endpoints annotation as follows:

# Create a LoadBalancer type service
oc run --image kuryr/demo test1
oc scale dc/test1 --replicas=2
oc expose dc/test1 --port 80 --target-port 8080 --type LoadBalancer

# The FIP can be retrieved from the annotation as follows:
oc get ep test1 -o yaml | grep service_pub_ip_info -A1
"kuryr_kubernetes", "versioned_object.version": "1.0"}], "service_pub_ip_info": {"versioned_object.data": {"alloc_method": "pool", "ip_addr": "172.20.0.219",

# In this example the FIP is 172.20.0.219

I don't think this should be a blocker. We'll continue to investigate.
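The grep-based workaround can also be done by parsing the annotation value as JSON. The following is a minimal Python sketch under stated assumptions: the function names find_key and fip_from_annotation are hypothetical, and the sample annotation is a made-up, trimmed structure that only reproduces the service_pub_ip_info fragment shown in the grep output. The real annotation contains more fields, which is why the sketch searches for the key instead of assuming a fixed path.

```python
import json
from typing import Any, Optional


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first occurrence of `key` in nested JSON data."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def fip_from_annotation(annotation_json: str) -> Optional[str]:
    """Return the ip_addr stored under service_pub_ip_info, or None if absent."""
    state = json.loads(annotation_json)
    pub_ip_info = find_key(state, "service_pub_ip_info")
    if not isinstance(pub_ip_info, dict):
        return None
    return pub_ip_info.get("versioned_object.data", {}).get("ip_addr")


# Hypothetical, trimmed annotation value; only the inner fragment comes from the report.
sample = json.dumps({
    "versioned_object.data": {
        "service_pub_ip_info": {
            "versioned_object.data": {"alloc_method": "pool", "ip_addr": "172.20.0.219"}
        }
    }
})

fip = fip_from_annotation(sample)
print(fip)
```

In practice the annotation string would be read from metadata.annotations of the endpoints object (for example from 'oc get ep test1 -o json'); the annotation key itself is not named in this comment, so it is left out of the sketch.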
More details after further investigation:

A. The OpenShift external/ingress IP CIDR for the LoadBalancer service type is defined by the 'ingressIPNetworkCIDR' field in the master-config.yaml file, under the 'networkConfig' section, as follows:

   networkConfig:
     clusterNetworkCIDR: 10.0.0.64/26
     clusterNetworks:
     - cidr: 10.0.0.64/26
       hostSubnetLength: 9
     externalIPNetworkCIDRs: null
     hostSubnetLength: 9
     ingressIPNetworkCIDR: 172.29.0.0/16
     networkPluginName: ""
     serviceNetworkCIDR: 10.0.0.128/26

B. If 'ingressIPNetworkCIDR' is not defined, OpenShift uses 172.29.0.0/16 as the default CIDR.

C. From the OpenShift logs, it seems that OpenShift periodically verifies that the external IP is within the 'ingressIPNetworkCIDR' range and, if it isn't, allocates a new IP. The relevant part of the OpenShift logs appears below.

D. At some point (I assume OpenShift sets this when the maximum number of external IPs is reached), updating 'LoadBalancerStatus' is forbidden. The relevant section from the OpenShift logs:

May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.808014 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.812957 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.818129 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.

E. Bottom line: this doesn't seem to be a Kuryr bug. We need to find a way to configure OpenShift not to allocate external IPs for services of type LoadBalancer, and to set this configuration when SDN=KURYR.

OpenShift logs:
----------------
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.829383 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.830212 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.850656 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.851521 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.891905 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.892907 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.973134 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:26 gtfgfg openshift[18652]: E0514 06:07:26.974343 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:27 gtfgfg openshift[18652]: E0514 06:07:27.134590 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:27 gtfgfg openshift[18652]: E0514 06:07:27.135865 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:27 gtfgfg openshift[18652]: E0514 06:07:27.456056 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:27 gtfgfg openshift[18652]: E0514 06:07:27.457246 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:28 gtfgfg openshift[18652]: E0514 06:07:28.097357 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:28 gtfgfg openshift[18652]: E0514 06:07:28.098611 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:29 gtfgfg openshift[18652]: E0514 06:07:29.378809 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:29 gtfgfg openshift[18652]: E0514 06:07:29.380048 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:31 gtfgfg openshift[18652]: E0514 06:07:31.940191 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:31 gtfgfg openshift[18652]: E0514 06:07:31.941803 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:37 gtfgfg openshift[18652]: E0514 06:07:37.061999 18652 service_ingressip_controller.go:580] The ingress ip 172.24.4.13 for service default/test21 is not in the ingress range. A new ip will be allocated.
May 14 06:07:37 gtfgfg openshift[18652]: E0514 06:07:37.066239 18652 service_ingressip_controller.go:385] error syncing service, it will be retried: Failed to persist updated LoadBalancerStatus to service 'default/test21': Service "test21" is invalid: spec.externalIPs: Forbidden: externalIPs have been disabled
May 14 06:07:37 gtfgfg openshift[18652]: E0514 06:07:37.232991 18652 watcher.go:208] watch chan error: etcdserver: mvcc: required revision has been compacted
May 14 06:07:37 gtfgfg openshift[18652]: W0514 06:07:37.233252 18652 reflector.go:341] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:86: watch of *v1beta1.DaemonSet ended with: The resourceVersion for the provided watch is too old.
May 14 06:07:42 gtfgfg openshift[18652]: I0514 06:07:42.444996 18652 trace.go:76] Trace[1514250673]: "GuaranteedUpdate etcd3: *core.Endpoints" (started: 2018-05-14 06:07:41.348810335 +0000 UTC m=+62733.023374464) (total time: 1.096155925s):
May 14 06:07:42 gtfgfg openshift[18652]: Trace[1514250673]: [1.096082375s] [1.095176139s] Transaction committed
May 14 06:07:42 gtfgfg openshift[18652]: I0514 06:07:42.445072 18652 trace.go:76] Trace[1398341270]: "Get /api/v1/namespaces/kube-system/configmaps/kube-scheduler" (started: 2018-05-14 06:07:41.7873538 +0000 UTC m=+62733.461917921) (total time: 657.69886ms):
May 14 06:07:42 gtfgfg openshift[18652]: Trace[1398341270]: [657.637651ms] [657.631403ms] About to write a response
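The "not in the ingress range" messages in these logs come down to a CIDR membership test. The following is a minimal Python sketch of that test only, using the standard ipaddress module. It is an illustration and not OpenShift's actual controller code; the function name in_ingress_range is hypothetical, and the default CIDR is the ingressIPNetworkCIDR value quoted in this comment.

```python
import ipaddress


def in_ingress_range(ip: str, ingress_cidr: str = "172.29.0.0/16") -> bool:
    """Return True if `ip` falls inside the configured ingress IP CIDR."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(ingress_cidr)


# FIP assigned by Kuryr in the logs: outside the ingress range.
kuryr_fip_ok = in_ingress_range("172.24.4.13")
# IP allocated by OpenShift in the original report: inside the ingress range.
openshift_ip_ok = in_ingress_range("172.29.117.222")

print(kuryr_fip_ok, openshift_ip_ok)
```

This prints "False True", which is consistent with the logs: the Kuryr-assigned FIP 172.24.4.13 fails the range check, so the ingress IP controller keeps trying to replace it.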
When the OpenShift cloud provider is set to "OpenStack", OpenShift shouldn't allocate an external IP for services of type LoadBalancer. There's an open bug for this issue [1].

So, once [1] is resolved, the service's external IP should appear under the service status (status/loadBalancer/ingress), and we should be able to access this service.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1593662
We'll add a doc note to the openshift-ansible documentation that using kuryr also requires the openstack cloud provider to be specified.
Merged upstream and backported in https://github.com/openshift/openshift-ansible/pull/9409
Should be in openshift-ansible-3.10.28-1
Verified in openshift-ansible-3.10.51-1.git.0.44a646c.el7.noarch.

The file /usr/share/ansible/openshift-ansible/playbooks/openstack/configuration.md includes:

    Finally, you *must* set up an OpenStack cloud provider as specified in [OpenStack Cloud Provider Configuration](#openstack-cloud-provider-configuration).