1389205 – Master is enforced to find neutron LBaaS extension when openstack cloud provider is enabled

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	3.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Seth Jennings
QA Contact:	DeShuai Ma
Docs Contact:
URL:
Whiteboard:	aos-scalability-34
Depends On:
Blocks:	1465722

Reported:	2016-10-27 07:52 UTC by Jianwei Hou
Modified:	2017-06-28 02:48 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	This bug fix addresses an issue with the OpenShift master when the OpenStack cloud provider is used. If the master service controller is unable to connect to the LBaaS API, it prevents the master from starting. With this fix, the failure is treated as non-fatal. Services with type LoadBalancer will not work, because the master is unable to create the load balancer in the cloud provider, but the master otherwise functions normally.
Clone Of:
Clones:	1465722
Environment:
Last Closed:	2017-01-18 12:46:53 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2017:0066	0	normal	SHIPPED_LIVE	Red Hat OpenShift Container Platform 3.4 RPM Release Advisory	2017-01-18 17:23:26 UTC

Description Jianwei Hou 2016-10-27 07:52:18 UTC

Description of problem:
On OpenStack instances, when the master is configured to enable the cloud provider, the master cannot be started because it always tries to find the neutron LBaaS extension. Our OpenStack deployment has no load balancer.

Version-Release number of selected component (if applicable):
openshift v3.4.0.16+cc70b72
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

How reproducible:
Always

Steps to Reproduce:
1. Enable the OpenStack cloud provider following https://docs.openshift.com/container-platform/3.3/install_config/configuring_openstack.html. In /etc/cloud.conf, do not configure a LoadBalancer.
2. Restart atomic-openshift-master

Actual results:
Master can't be restarted.

Error logs:

```
Oct 27 02:43:57 host-8-175-178 atomic-openshift-node: I1027 02:43:57.992963    8761 kubelet.go:2373] SyncLoop (housekeeping)
Oct 27 02:43:58 host-8-175-178 atomic-openshift-master: W1027 02:43:58.030617   18006 openstack.go:407] Failed to find neutron LBaaS extension (v1 or v2)
Oct 27 02:43:58 host-8-175-178 atomic-openshift-master: F1027 02:43:58.030650   18006 master.go:453] Unable to start service controller: the cloud provider does not support external load balancers.
Oct 27 02:43:58 host-8-175-178 systemd: atomic-openshift-master.service: main process exited, code=exited, status=255/n/a
Oct 27 02:43:58 host-8-175-178 systemd: Unit atomic-openshift-master.service entered failed state.
Oct 27 02:43:58 host-8-175-178 systemd: atomic-openshift-master.service failed.
```

Expected results:
The master should not be required to find a load balancer when the cloud provider is enabled.

Additional info:

Comment 1 Seth Jennings 2016-10-27 22:24:09 UTC

The kube upstream service controller is enforcing this in init():

pkg/controller/service/servicecontroller.go:

func (s *ServiceController) init() error {
	if s.cloud == nil {
		return fmt.Errorf("WARNING: no cloud provider provided, services of type LoadBalancer will fail.")
	}

	balancer, ok := s.cloud.LoadBalancer()
	if !ok {
		return fmt.Errorf("the cloud provider does not support external load balancers.")
	}
	s.balancer = balancer

This has been the behavior in upstream kube since 1.2.
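
For anyone following along, here is a minimal self-contained sketch of how a provider without LBaaS ends up at that error. This is not the real Kubernetes code: `cloudProvider`, `fakeOpenStack`, `neutronLB`, and `initBalancer` are hypothetical stand-ins, and the extension aliases `lbaas` and `lbaasv2` are assumptions about what the provider looks for.

```go
package main

import (
	"errors"
	"fmt"
)

// loadBalancer is a hypothetical stand-in for the provider's load balancer API.
type loadBalancer interface {
	Name() string
}

// cloudProvider mirrors only the shape of the call used in init():
// the second return value reports whether load balancers are supported.
type cloudProvider interface {
	LoadBalancer() (loadBalancer, bool)
}

// neutronLB is a hypothetical load balancer backed by a given LBaaS version.
type neutronLB struct{ version string }

func (n neutronLB) Name() string { return "neutron-lbaas-" + n.version }

// fakeOpenStack models a cloud whose network service advertises a set of
// extension aliases (the alias names are assumed for illustration).
type fakeOpenStack struct {
	extensions map[string]bool
}

// LoadBalancer returns false when neither LBaaS alias is advertised, which is
// the case described in this bug.
func (o fakeOpenStack) LoadBalancer() (loadBalancer, bool) {
	switch {
	case o.extensions["lbaasv2"]:
		return neutronLB{version: "v2"}, true
	case o.extensions["lbaas"]:
		return neutronLB{version: "v1"}, true
	default:
		return nil, false
	}
}

// initBalancer follows the two checks quoted from init() above.
func initBalancer(cloud cloudProvider) (loadBalancer, error) {
	if cloud == nil {
		return nil, errors.New("no cloud provider provided, services of type LoadBalancer will fail")
	}
	balancer, ok := cloud.LoadBalancer()
	if !ok {
		return nil, errors.New("the cloud provider does not support external load balancers")
	}
	return balancer, nil
}

func main() {
	// A cloud with no LBaaS extension: init fails.
	withoutLB := fakeOpenStack{extensions: map[string]bool{"router": true}}
	if _, err := initBalancer(withoutLB); err != nil {
		fmt.Println("init failed:", err)
	}

	// A cloud that advertises LBaaS v2: init succeeds.
	withLB := fakeOpenStack{extensions: map[string]bool{"lbaasv2": true}}
	if lb, err := initBalancer(withLB); err == nil {
		fmt.Println("init ok:", lb.Name())
	}
}
```

The point of the sketch is only that the error comes from the provider reporting "no load balancer support", not from any misconfiguration of the master itself.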

This is the Origin commit that introduced the LB requirement by running New() on the service controller rather than Run().

https://github.com/openshift/origin/commit/3de3eec624b410bcf7b6705133919ef98331f3f4#diff-62e1d6e1ea7f763bd41c44733aa2fba2R387

Run() doesn't call init(), whereas New() does.

Comment 2 Seth Jennings 2016-10-28 14:29:04 UTC

Scratch that last comment.  init() was called by Run() before.

The service controller was added in Origin 1.3 here:
https://github.com/openshift/origin/commit/7609d9bbc439df91d170f9447e972c72ef62f558

I imagine this has been an issue ever since we added it.  Confirming.

Comment 3 Seth Jennings 2016-10-28 18:17:22 UTC

Upstream PR:
https://github.com/openshift/origin/pull/11648

Turns out the fix is pretty straightforward.  In kube, the failure of New() is non-fatal.  However, due to the way we are starting it in origin, it is fatal if a cloud provider is configured.

The PR changes the failure from fatal to a warning, so the master can proceed. The only impact is that services with type LoadBalancer won't work.
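
As a rough illustration of the behavior change (not the actual diff in the PR), the sketch below logs the constructor failure as a warning and keeps starting the remaining components. The names `newServiceController` and `startMaster`, and the component names in the returned list, are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errNoLoadBalancer stands in for the error returned when the cloud provider
// has no load balancer support.
var errNoLoadBalancer = errors.New("the cloud provider does not support external load balancers")

// newServiceController is a hypothetical constructor that fails when the
// provider cannot supply a load balancer.
func newServiceController(lbSupported bool) (string, error) {
	if !lbSupported {
		return "", errNoLoadBalancer
	}
	return "service-controller", nil
}

// startMaster returns the names of the components that were started.
// Treating the error as fatal here (for example with log.Fatalf) would stop
// the whole process; logging a warning lets startup continue without the
// service controller.
func startMaster(lbSupported bool) []string {
	started := []string{"api-server"}

	ctrl, err := newServiceController(lbSupported)
	if err != nil {
		log.Printf("warning: unable to start service controller: %v", err)
	} else {
		started = append(started, ctrl)
	}

	// Everything after the service controller still starts either way.
	started = append(started, "other-controllers")
	return started
}

func main() {
	fmt.Println(startMaster(false)) // no LBaaS: service controller is skipped
	fmt.Println(startMaster(true))  // LBaaS available: everything starts
}
```

With this shape, a cloud without LBaaS only loses the service controller, which matches the impact described above.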

Comment 4 Liang Xia 2016-10-31 10:25:24 UTC

The latest build (v3.4.0.17+b8a03bc) does not contain the fix code.

https://github.com/openshift/ose/blob/v3.4.0.17/pkg/cmd/server/kubernetes/master.go

QE will try again once the build contains the fix.

Comment 5 Liang Xia 2016-11-01 07:40:40 UTC

The master service can be restarted and runs normally on the build below. Marking the bug as verified.

# openshift version
openshift v3.4.0.18+ada983f
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Comment 7 errata-xmlrpc 2017-01-18 12:46:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066
