Bug 1684047

Summary: The cluster cannot be set up successfully when vxlanPort is set to a value other than the default 4789
Product: OpenShift Container Platform Reporter: zhaozhanqi <zzhao>
Component: Networking Assignee: Casey Callendrello <cdc>
Status: CLOSED ERRATA QA Contact: Meng Bo <bmeng>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.1.0 CC: aos-bugs, wsun
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:44:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
apiserver log none

Description zhaozhanqi 2019-02-28 10:18:08 UTC
Created attachment 1539418 [details]
apiserver log

Description of problem:
When vxlanPort is set to a non-default port (e.g. 4889) in the NetworkConfig YAML file before the cluster is set up, the installation fails. The apiserver pod cannot run because the AWS Security Group does not open this port.

Version-Release number of selected component (if applicable):
4.0.0-0.nightly-2019-02-27-213933

How reproducible:
always

Steps to Reproduce:
1. Create the install config
  ./openshift-install create install-config

2. Create the manifests

  ./openshift-install create manifests

3. Add the following file as /manifest/cluster-network-03-config.yml:

apiVersion: networkoperator.openshift.io/v1
kind: NetworkConfig
metadata:
  name: cluster
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  defaultNetwork:
    openshiftSDNConfig:
      mode: Multitenant
    type: OpenshiftSDN
  serviceNetwork:
  - 172.30.0.0/16

4. Set up the cluster
   ./openshift-install create cluster --log-level=debug
  
5. Check that the apiserver does not run stably

Actual results:

The cluster cannot be set up successfully.

Expected results:

The cluster can be set up successfully when vxlanPort is set to a non-default port.


Additional info:

The cause is that port 4789 is hard-coded in openshift-install. When I changed the port to 4889 in the Security Group during the installation, the cluster was set up successfully.

Comment 1 zhaozhanqi 2019-02-28 10:21:53 UTC
Sorry, I forgot to update the vxlanPort in /manifest/cluster-network-03-config.yml. The correct one:


apiVersion: networkoperator.openshift.io/v1
kind: NetworkConfig
metadata:
  name: cluster
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  defaultNetwork:
    openshiftSDNConfig:
      mode: Multitenant
      vxlanPort: 4789
    type: OpenshiftSDN
  serviceNetwork:
  - 172.30.0.0/16

Comment 2 zhaozhanqi 2019-02-28 10:23:08 UTC
(In reply to zhaozhanqi from comment #1)
> sorry, I forgot update the vxlanPort in
> /manifest/cluster-network-03-config.yml.  the correct one:
> 
> 
> apiVersion: networkoperator.openshift.io/v1
> kind: NetworkConfig
> metadata:
>   name: cluster
> spec:
>   clusterNetwork:
>   - cidr: 10.128.0.0/14
>     hostPrefix: 23
>   defaultNetwork:
>     openshiftSDNConfig:
>       mode: Multitenant
>       vxlanPort: 4789
>     type: OpenshiftSDN
>   serviceNetwork:
>   - 172.30.0.0/16

Sorry, it should be vxlanPort: 4889
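
Putting comments 1 and 2 together, the manifest intended for step 3 would presumably read as follows. This is only the YAML from comment 1 with the port value corrected to 4889; nothing else is changed.

```yaml
apiVersion: networkoperator.openshift.io/v1
kind: NetworkConfig
metadata:
  name: cluster
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  defaultNetwork:
    openshiftSDNConfig:
      mode: Multitenant
      vxlanPort: 4889  # non-default port that triggers the failure
    type: OpenshiftSDN
  serviceNetwork:
  - 172.30.0.0/16
```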

Comment 3 Casey Callendrello 2019-02-28 13:03:53 UTC
Ah, of course. This makes sense, because we set up a security group that blocks most ports between nodes in the installer.

And we block *all* other UDP connections. So... you won't actually be able to test this without hacking the security group rules.


Filed PR https://github.com/openshift/installer/pull/1334 to allow UDP ports as well. Then this will be testable.

Comment 4 Casey Callendrello 2019-03-01 13:41:59 UTC
Assigning this to 4.1 - we don't need it for AWS.

Comment 5 Casey Callendrello 2019-03-05 18:04:58 UTC
I made a small change to the installer - you can now try ports 9000-9999 for vxlan.
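
As a rough sketch of what a retest could look like after this change, only the defaultNetwork portion of the manifest would need to differ. The value 9789 below is an arbitrary pick from the 9000-9999 range mentioned above, not a value taken from this report.

```yaml
spec:
  defaultNetwork:
    openshiftSDNConfig:
      mode: Multitenant
      vxlanPort: 9789  # assumed example value within 9000-9999
    type: OpenshiftSDN
```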

Comment 7 zhaozhanqi 2019-03-14 07:34:18 UTC
Verified this bug on 4.0.0-0.nightly-2019-03-13-233958.

Comment 9 errata-xmlrpc 2019-06-04 10:44:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758