Bug 1451881

Summary: Headless services cause SDN initialization failure for master-controllers when the network plugin changes.
Product: OpenShift Container Platform Reporter: Ryan Howe <rhowe>
Component: Networking Assignee: Jacob Tanenbaum <jtanenba>
Status: CLOSED ERRATA QA Contact: Meng Bo <bmeng>
Severity: high Docs Contact:
Priority: high    
Version: 3.5.0 CC: aloughla, aos-bugs, yadu
Target Milestone: --- Keywords: Reopened
Target Release: 3.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: During initialization, openshift-sdn did not accept nil as a valid service IP.
Consequence: openshift-sdn failed to initialize, so the master failed to start when headless services were in use.
Fix: Allow nil as a valid value of srv.Spec.ClusterIP.
Result: openshift-sdn starts properly when the master node is restarted.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-11-28 21:55:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ryan Howe 2017-05-17 18:05:58 UTC
Description of problem:

When the SDN plugin is changed, the master controller fails to start if there are headless services in the cluster.

Error Message: 

[run_components.go:384] SDN initialization failed: Error: Existing service with IP: None is not part of service network: 172.30.0.0/16

https://kubernetes.io/docs/concepts/services-networking/service/#headless-services


Code that performs the check and returns the error:

https://github.com/openshift/openshift-sdn/blob/master/plugins/osdn/master.go#L111-L120

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Create headless service 

# oc create -f - << EOF
apiVersion: v1
kind: Service
metadata:
  name: hello-openshift
spec:
  selector:
    name: hello-openshift
  portalIP: None
  clusterIP: None
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
EOF

2. Change sdn plugin 

Master
  # sed -i 's/openshift-ovs-subnet/openshift-ovs-multitenant/g' /etc/origin/master/master-config.yaml
  # sed -i 's/openshift-ovs-subnet/openshift-ovs-multitenant/g' /etc/origin/node/node-config.yaml
 
Node
  # sed -i 's/openshift-ovs-subnet/openshift-ovs-multitenant/g' /etc/origin/node/node-config.yaml


3. Restart masters and nodes 


Actual results:

SDN initialization failed: Error: Existing service with IP: None is not part of service network: 172.30.0.0/16


Expected results:
The SDN should initialize without failing when headless services exist.


Additional info:

The metrics and logging deployers configure headless services that have

  clusterIP: None

Example: 

https://github.com/openshift/origin-metrics/blob/master/deployer/templates/hawkular-cassandra.yaml#L47

Comment 2 Jacob Tanenbaum 2017-05-31 17:54:33 UTC
This was fixed in 3.5 in PR722 (https://github.com/openshift/ose/pull/722) Commit #2

Comment 4 Yan Du 2017-06-05 10:18:46 UTC
Tested on an OCP 3.5 environment:
openshift v3.5.5.23
kubernetes v1.5.2+43a9be4

SDN works correctly after changing the network plugin with a headless service present.

Comment 8 errata-xmlrpc 2017-11-28 21:55:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188