1491202 – [Federation] Failed to create load balancer for service federation-system/apiserver on GCE


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Master
Sub Component:
Version:	3.7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	3.7.0
Assignee:	David Eads
QA Contact:	Qixuan Wang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-09-13 09:59 UTC by Qixuan Wang
Modified:	2017-11-28 22:10 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-11-28 22:10:32 UTC
Target Upstream Version:
Embargoed:

Attachments
atomic-openshift-master-controllers (93.40 KB, text/x-vhdl) 2017-09-15 09:47 UTC, Qixuan Wang
atomic-openshift-master-api (1.62 MB, text/x-vhdl) 2017-09-15 09:49 UTC, Qixuan Wang

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2017:3188	0	normal	SHIPPED_LIVE	Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update	2017-11-29 02:34:54 UTC

Description Qixuan Wang 2017-09-13 09:59:57 UTC

Description of problem:
The apiserver service stays pending and blocks the rest of the federation control plane initialization. The error is "Failed to create load balancer for service federation-system/qwangfed-apiserver: GCECloud.ClusterID is not ready. Call Initialize() before using".
The problem occurs on OCP 3.7.0-0.125.0 with either ose-federation image (v3.7.0-0.125.0 or v3.6.173.0.30). It does not occur on OCP 3.6.173.0.30 with the same images.


Version-Release number of selected component (if applicable):
openshift v3.7.0-0.125.0
kubernetes v1.7.0+695f48a16f
etcd 3.2.1
registry.ops.openshift.com/openshift3/ose-federation:v3.7.0-0.125.0 

How reproducible:
Always

Steps to Reproduce:
1. Initialize the federation control plane.
2. Check whether the federation control plane works.


Actual results:
1. [root@preserve-910-qe-qwang-37-federation-master-etcd-nfs-1 ~]# kubefed init qwangfed --dns-provider=google-clouddns --dns-zone-name=federation.ocpqe.com. --etcd-persistent-storage=true --image=registry.ops.openshift.com/openshift3/ose-federation:v3.7.0-0.125.0
Creating a namespace federation-system for federation system components... done
Creating federation control plane service...........................................


2. [root@preserve-910-qe-qwang-37-federation-master-etcd-nfs-1 ~]# oc get all -n federation-system
NAME                          CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
svc/qwangfed36173-apiserver   172.30.138.23   <pending>     443:32539/TCP   56m


[root@preserve-910-qe-qwang-37-federation-master-etcd-nfs-1 ~]# oc describe svc/qwangfed36173-apiserver -n federation-system
Name:			qwangfed36173-apiserver
Namespace:		federation-system
Labels:			app=federated-cluster
Annotations:		federation.alpha.kubernetes.io/federation-name=qwangfed36173
Selector:		app=federated-cluster,module=federation-apiserver
Type:			LoadBalancer
IP:			172.30.138.23
Port:			https	443/TCP
NodePort:		https	32539/TCP
Endpoints:		<none>
Session Affinity:	None
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason				Message
  ---------	--------	-----	----			-------------	--------	------				-------
  57m		2m		17	service-controller			Normal		CreatingLoadBalancer		Creating load balancer
  57m		2m		17	service-controller			Warning		CreatingLoadBalancerFailed	Error creating load balancer (will retry): Failed to create load balancer for service federation-system/qwangfed36173-apiserver: GCECloud.ClusterID is not ready. Call Initialize() before using.
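
For illustration only, the event message above suggests a guard that rejects load balancer requests until the cloud provider's cluster ID has been initialized. The Go sketch below shows that general pattern under stated assumptions: the `cloud` type, its fields, the error wording, and the `lb-` naming are all hypothetical and are not taken from the Kubernetes GCE provider source.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReady paraphrases the "ClusterID is not ready" condition in the event.
var errNotReady = errors.New("cluster ID is not ready; call Initialize first")

// cloud is a hypothetical stand-in for a cloud provider object.
// It is NOT the real Kubernetes GCE provider type.
type cloud struct {
	clusterID   string
	initialized bool
}

// Initialize stores the cluster ID and marks the provider as ready.
func (c *cloud) Initialize(id string) {
	c.clusterID = id
	c.initialized = true
}

// EnsureLoadBalancer returns an error until Initialize has been called,
// so a caller that retries keeps failing if initialization never happens.
func (c *cloud) EnsureLoadBalancer(service string) (string, error) {
	if !c.initialized {
		return "", fmt.Errorf("failed to create load balancer for service %s: %w", service, errNotReady)
	}
	// Illustrative naming scheme only.
	return "lb-" + c.clusterID + "-" + service, nil
}

func main() {
	c := &cloud{}

	// Before Initialize: every attempt fails, matching the repeated warning events.
	_, err := c.EnsureLoadBalancer("federation-system/apiserver")
	fmt.Println("before Initialize:", err)

	// After Initialize: the same call succeeds.
	c.Initialize("example-cluster")
	name, err := c.EnsureLoadBalancer("federation-system/apiserver")
	fmt.Println("after Initialize:", name, err)
}
```

If something like this guard is in play, retries alone cannot recover when the initialization step is never run, which would be consistent with the event count growing over 57 minutes.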


Expected results:
The load balancer can be created.


Additional info:

Comment 1 Derek Carr 2017-09-13 16:37:47 UTC

This looks like an error creating a Service type LoadBalancer on GCE (not specific to federation).

Comment 2 Michal Fojtik 2017-09-14 11:40:34 UTC

Can you please provide the master logs from when this happens? Also, is this permanently broken, or does it fix itself (there should be a retry)? To me it looks like GCE is lagging in marking the cluster ID as ready.

Comment 3 Qixuan Wang 2017-09-15 09:47:37 UTC

Created attachment 1326376
atomic-openshift-master-controllers

Comment 4 Qixuan Wang 2017-09-15 09:49:41 UTC

Created attachment 1326377
atomic-openshift-master-api

Comment 5 Qixuan Wang 2017-09-15 09:50:32 UTC

It's permanently broken. Attached the master logs.

Comment 8 Zhang Cheng 2017-09-18 03:23:10 UTC

It blocks the federation cluster setup for OpenShift 3.7 on GCP.

Comment 9 Michal Fojtik 2017-09-18 10:11:54 UTC

https://github.com/openshift/origin/pull/16089 was merged; setting this to ON_QA.

Comment 10 Qixuan Wang 2017-09-19 10:30:53 UTC

The fix is not yet in OCP 3.7.0-0.126.4.

Comment 11 Michal Fojtik 2017-09-26 13:22:11 UTC

Moving back to MODIFIED.

Comment 13 Qixuan Wang 2017-10-13 09:59:42 UTC

Tested on openshift v3.7.0-0.143.2 (kubernetes v1.7.0+80709908fd, etcd 3.2.1, registry.ops.openshift.com/openshift3/ose-federation:v3.7.0-0.147.1). The bug has been fixed, thanks.


[root@preserve-qw-master-etcd-nfs-1 ~]# oc get all -n federation-system
NAME                                              READY     STATUS    RESTARTS   AGE
po/qwangfed-apiserver-2147565502-7b76l            2/2       Running   0          36m
po/qwangfed-controller-manager-1909511936-s5kqx   1/1       Running   1          38m

NAME                     CLUSTER-IP     EXTERNAL-IP                     PORT(S)         AGE
svc/qwangfed-apiserver   172.30.0.140   172.29.147.174,172.29.147.174   443:32697/TCP   38m

NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/qwangfed-apiserver            1         1         1            1           38m
deploy/qwangfed-controller-manager   1         1         1            1           38m

NAME                                        DESIRED   CURRENT   READY     AGE
rs/qwangfed-apiserver-2147565502            1         1         1         36m
rs/qwangfed-apiserver-265993101             0         0         0         38m
rs/qwangfed-controller-manager-1909511936   1         1         1         38m
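
The pass/fail check in this report comes down to whether the EXTERNAL-IP column of the service row still reads `<pending>`. As a minimal sketch, the Go program below applies that check to the two service rows quoted in this report. It assumes the default column order of `oc get svc` shown here (NAME, CLUSTER-IP, EXTERNAL-IP, ...), and the helper name `externalIPPending` is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// externalIPPending reports whether a row of `oc get svc` output still shows
// "<pending>" in its EXTERNAL-IP column (the third whitespace-separated field).
// Hypothetical helper; it assumes the column order seen in this report.
func externalIPPending(row string) bool {
	fields := strings.Fields(row)
	return len(fields) >= 3 && fields[2] == "<pending>"
}

func main() {
	// Row from the failing 3.7.0-0.125.0 run in the description.
	failing := "svc/qwangfed36173-apiserver   172.30.138.23   <pending>     443:32539/TCP   56m"
	// Row from the verification run on 3.7.0-0.143.2.
	fixed := "svc/qwangfed-apiserver   172.30.0.140   172.29.147.174,172.29.147.174   443:32697/TCP   38m"

	fmt.Println(externalIPPending(failing)) // true
	fmt.Println(externalIPPending(fixed))   // false
}
```

Parsing the tabular output keeps the example self-contained. A real check would more likely read the service status from the API than split text columns.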

Comment 17 errata-xmlrpc 2017-11-28 22:10:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188
