Bug 1468607 - OPENSHIFT_MASTER & KUBERNETES_MASTER env vars in deployer pods wrong
Summary: OPENSHIFT_MASTER & KUBERNETES_MASTER env vars in deployer pods wrong
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 3.7.0
Assignee: Dan Mace
QA Contact: Chuan Yu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-07-07 13:58 UTC by Eduardo Minguez
Modified: 2018-04-16 14:49 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-16 14:49:44 UTC
Target Upstream Version:
Embargoed:



Description Eduardo Minguez 2017-07-07 13:58:58 UTC
Description of problem:

When you deploy a new app in OCP, the deployer pod contains environment variables such as OPENSHIFT_MASTER and KUBERNETES_MASTER, but their value is not the proper API URL (in an HA environment, the load balancer); it is just one of the masters' API URLs:

$ oc new-project foobar
$ oc new-app kubernetes/guestbook
$ oc get pod -o yaml guestbook-1-deploy
apiVersion: v1
kind: Pod
...
spec:
  activeDeadlineSeconds: 21600
  containers:
  - env:
    - name: KUBERNETES_MASTER
      value: https://master1.minwi.local:8443
    - name: OPENSHIFT_MASTER
      value: https://master1.minwi.local:8443

$ oc get nodes
NAME                    STATUS                     AGE
master1.minwi.local     Ready,SchedulingDisabled   192d
master2.minwi.local     Ready,SchedulingDisabled   192d
master3.minwi.local     Ready,SchedulingDisabled   192d
node1.minwi.local       Ready                      192d
node2.minwi.local       Ready                      192d
nodeinfra.minwi.local   Ready                      192d

master-config.yaml seems to be ok:

[master1]# grep -i url /etc/origin/master/master-config.yaml
  loggingPublicURL: https://kibana.apps.minwi.com
  logoutURL: ""
  masterPublicURL: https://ocp.minwi.com:8443
  metricsPublicURL: https://hawkular-metrics.apps.minwi.com/hawkular/metrics
  publicURL: https://ocp.minwi.com:8443/console/
  urls:
masterPublicURL: https://ocp.minwi.com:8443
  assetPublicURL: https://ocp.minwi.com:8443/console/
  masterPublicURL: https://ocp.minwi.com:8443
  masterURL: https://ocp.minwi.local:8443

Version-Release number of selected component (if applicable):
oc v3.5.5.26
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://emingueztst.eastus2.cloudapp.azure.com:8443
openshift v3.5.5.26
kubernetes v1.5.2+43a9be4


How reproducible:
Create a new deployment and check the deployer pod environment variables


Steps to Reproduce:
1. oc new-project foobar
2. oc new-app kubernetes/guestbook
3. oc get pod -o yaml guestbook-1-deploy
apiVersion: v1
kind: Pod
...
spec:
  activeDeadlineSeconds: 21600
  containers:
  - env:
    - name: KUBERNETES_MASTER
      value: https://master1.minwi.local:8443
    - name: OPENSHIFT_MASTER
      value: https://master1.minwi.local:8443
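
A jsonpath query along these lines should also show just the container env, instead of dumping the full pod YAML (pod name as generated above):

$ oc get pod guestbook-1-deploy -o jsonpath='{.spec.containers[0].env}'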

Actual results:
Both variables point to just one of the masters instead of the proper load balancer URL.

Expected results:
Both variables contain the proper API URL (the load balancer URL in an HA environment).

Additional info:

Comment 2 Seth Jennings 2017-08-08 16:52:10 UTC
I am able to reproduce.

From what I can tell, the master that the deploy pods use is determined by which controller-manager on the HA masters wins the leader election.

I think the controller-manager uses /etc/origin/master/openshift-master.kubeconfig as its client config, which points at its own master as the server, not the LB.

This means that API requests from the controller-manager that won the election do not go through the LB, leading to higher load on the master whose controller-manager is the leader.  However, it probably also leads to lower latency on those requests, since the master the controller-manager is talking to is colocated.

I can see it either way, but the side effect on the deploy pods is unfortunate, since the pod spec that the deployer controller generates uses the server from the controller-manager's client config as the value of OPENSHIFT_MASTER and KUBERNETES_MASTER.
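
For reference, the server that client config actually points at can be checked directly on each master (same path as above); on the leader's master it shows that master's own URL rather than the LB:

$ grep 'server:' /etc/origin/master/openshift-master.kubeconfig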

Comment 3 Seth Jennings 2017-08-25 18:57:05 UTC
sending to Master

tl;dr The OPENSHIFT_MASTER and KUBERNETES_MASTER env vars for deployer pods are set by the deployer controller to the master co-resident with the active controller-manager, not to the master LB.

Comment 4 Dan Mace 2017-09-01 18:18:12 UTC
To be clear, the supported deployer code constructs API clients using the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT variables. The OPENSHIFT_MASTER and KUBERNETES_MASTER variables are no longer used. Did somebody observe evidence of the deployer container actually communicating with the master outside the service IP, or was that just assumed based on the values of these defunct variables?

Changing the KUBERNETES_MASTER and OPENSHIFT_MASTER variables should only have an effect on any custom deployer image code which happens to be using them (and any such usages should be ported to use KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT). Are we aware of any such usages?

We should remove OPENSHIFT_MASTER and KUBERNETES_MASTER from the deployer pod container environment. If there are any compatibility concerns regarding custom deployer images, we could provide a transitional period during 3.7 where the values are set to a URL like:

https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT

Then we could remove the variables entirely in 3.8.
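
As a rough sketch, any custom deployer or hook script still reading OPENSHIFT_MASTER could be ported along these lines (the token/CA paths are the standard in-cluster service account mounts; curl is just for illustration):

# Build the API URL from the in-cluster service variables instead of the
# deprecated OPENSHIFT_MASTER / KUBERNETES_MASTER values
API_URL="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"
TOKEN="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"
CA=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Requests then go through the kubernetes service IP rather than a single master
curl --cacert "$CA" -H "Authorization: Bearer ${TOKEN}" "${API_URL}/version"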

Comment 5 Dan Mace 2017-09-01 19:32:15 UTC
It also seems acceptable to leave the variables as-is and mark them deprecated in release notes until we're satisfied they're not in use by custom deployer/hook code.

Comment 6 Eduardo Minguez 2017-09-04 08:20:40 UTC
(In reply to Dan Mace from comment #4)
> To be clear, the supported deployer code constructs API clients using the
> KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT variables. The
> OPENSHIFT_MASTER and KUBERNETES_MASTER variables are no longer used. Did
> somebody observe evidence of the deployer container actually communicating
> with the master outside the service IP, or was that just assumed based on
> the values of these defunct variables?

I've just seen them as env vars in the deployer pod, but to be honest I don't know whether they are used or not.

> 
> Changing the KUBERNETES_MASTER and OPENSHIFT_MASTER variables should only
> have an effect on any custom deployer image code which happens to be using
> them (and any such usages should be ported to use KUBERNETES_SERVICE_HOST
> and KUBERNETES_SERVICE_PORT). Are we aware of any such usages?
> 

I have no idea.

> We should remove OPENSHIFT_MASTER and KUBERNETES_MASTER from the deployer
> pod container environment. If there are any compatibility concerns regarding
> custom deployer images, we could provide a transitional period during 3.7
> where the values are set to a URL like:
> 
> https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT
> 
> Then we could remove the variables entirely in 3.8.

I'd like to have a "transition period", just in case.

Thanks

Comment 7 Dan Mace 2017-09-07 15:37:33 UTC
Release note to drive deprecation of these variables: https://github.com/openshift/openshift-docs/issues/4906

Comment 8 Eduardo Minguez 2017-11-30 09:09:20 UTC
According to the 3.7 release notes, this has been fixed, right?

Comment 9 Dan Mace 2018-04-16 14:49:44 UTC
KUBERNETES_MASTER and OPENSHIFT_MASTER in deployer pods were deprecated, and no further changes to their behavior will be made.

