Bug 1823403 - Port 9091 has to be made configurable in OCS
Summary: Port 9091 has to be made configurable in OCS
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.2
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 4.5.0
Assignee: umanga
QA Contact: Jilju Joy
URL:
Whiteboard:
Depends On:
Blocks: 1859307
 
Reported: 2020-04-13 15:37 UTC by akgunjal@in.ibm.com
Modified: 2020-09-23 09:04 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
.CSI ports are now configurable
With the Red Hat OpenShift Container Storage 4.5 release, CSI ports are configurable in OpenShift Container Storage using the `rook-ceph-operator-config` `ConfigMap`. The CSI ports can be changed to any other valid port number, which gives the administrator more flexibility. This enhancement is necessary because the default ports may be in use by other applications.
Clone Of:
Environment:
Last Closed: 2020-09-15 10:16:49 UTC
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ocs-operator pull 477 | 0 | None | closed | creates default rook-ceph-operator-config ConfigMap | 2021-02-01 23:42:38 UTC
Red Hat Product Errata | RHBA-2020:3754 | 0 | None | None | None | 2020-09-15 10:17:21 UTC

Description akgunjal@in.ibm.com 2020-04-13 15:37:56 UTC
Description of problem (please be as detailed as possible and provide log snippets):
OCS is currently hard-coded to use port 9091, which clashes with Calico in Red Hat OpenShift Kubernetes Service (ROKS). The port needs to be made configurable so that the OCS installation can use ports that are open in ROKS.

Version of all relevant components (if applicable):


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
OCS deployment fails because port 9091 is already in use by Calico.


Is there any workaround available to the best of your knowledge?
I manually change the port of the Calico component so that I can deploy OCS.


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3


Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy OCS on Red Hat OpenShift Kubernetes Service (ROKS).


Actual results:
The Ceph RBD and CephFS DaemonSets fail because port 9091 is in use.


Expected results:


Additional info:
We have a GitHub issue open for this: https://github.com/openshift/ocs-operator/issues/451

Comment 2 Raz Tamir 2020-04-13 16:28:30 UTC
Not a blocker for the release.
Moving accordingly

Comment 3 Jose A. Rivera 2020-04-14 14:29:56 UTC
Similarly not a blocker for OCS 4.4, moving to OCS 4.5.

Travis, do you know if this is configurable in Rook-Ceph today, and if so what we'd need to do in the ocs-operator to make use of it?

Comment 4 Jose A. Rivera 2020-04-14 14:31:24 UTC
Assigning this to Umanga temporarily so he looks into this a bit more.

Comment 6 Travis Nielsen 2020-04-14 15:17:28 UTC
Yes, this is configurable in Rook today with the operator setting found here:
https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/operator.yaml#L90

Madhu can answer any questions around it as well for the csi settings.

Comment 7 Eran Tamir 2020-05-07 13:54:11 UTC
@umanga can we make sure this is going into OCS 4.5?

Comment 9 Travis Nielsen 2020-05-07 23:53:04 UTC
@Umanga, the rook release-4.5 branch is already updated with the latest release-1.3 upstream changes. Are there any other changes you are waiting for?

Comment 10 umanga 2020-05-14 05:00:56 UTC
(In reply to Travis Nielsen from comment #9)
> @Umanga, the rook release-4.5 branch is already updated with the latest
> release-1.3 upstream changes. Are there any other changes you are waiting for?
>
I am waiting for the OCS Operator dependency to be updated to v1.3. It is currently v1.2.4.
Without that update, the PR cannot be merged into OCS Operator.

Comment 15 Jilju Joy 2020-07-09 12:42:44 UTC
Tested in 

OpenShift Container Storage   4.5.0-482.ci
Cluster version is 4.5.0-0.nightly-2020-07-07-210042
AWS platform.

----------------------------------------------------------------------------------------------------------------------------------

Before updating rook-ceph-operator-config configmap.

1. Output from a csi-cephfsplugin pod which shows metricsport=9091 (default) in csi-cephfsplugin container args

csi-cephfsplugin:
    Container ID:  cri-o://4ade59a41e1ec9c71655ce6428661fcec9df3fdc90c4c0699ae24755c02a9e51
    Image:         quay.io/rhceph-dev/cephcsi@sha256:d909420cf801be463e7aaaa95217cd90011d0003c021161d2eae1e640935b8b1
    Image ID:      quay.io/rhceph-dev/cephcsi@sha256:17ed0d09bddaed5f0368a9200960b9531aac1edee3c9ac318dda798279459a5f
    Port:          <none>
    Host Port:     <none>
    Args:
      --nodeid=$(NODE_ID)
      --type=cephfs
      --endpoint=$(CSI_ENDPOINT)
      --v=0
      --nodeserver=true
      --drivername=openshift-storage.cephfs.csi.ceph.com
      --metadatastorage=k8s_configmap
      --mountcachedir=/mount-cache-dir
      --pidlimit=-1
      --metricsport=9091
      --forcecephkernelclient=true
      --metricspath=/metrics
      --enablegrpcmetrics=true

2. From worker node.
   # lsof -i -P -n | grep cephcsi
cephcsi   36475     root    7u  IPv4  219549      0t0  TCP 10.0.137.45:9090 (LISTEN)
cephcsi   36476     root    7u  IPv4  218375      0t0  TCP 10.0.137.45:9091 (LISTEN)
cephcsi   36665     root    3u  IPv4  214909      0t0  TCP 10.0.137.45:9081 (LISTEN)
cephcsi   36668     root    3u  IPv4  223434      0t0  TCP 10.0.137.45:9080 (LISTEN)
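
To compare the listening ports before and after a change, the lsof output above can be reduced to a plain list of port numbers. This is a minimal sketch that assumes the output format shown above; the function name cephcsi_listen_ports is hypothetical and is not part of OCS or Rook.

```shell
# Hypothetical helper: print the TCP ports that cephcsi processes are
# listening on, one per line, sorted numerically and de-duplicated.
# It reads `lsof -i -P -n` output on stdin, so it also works on saved output.
cephcsi_listen_ports() {
  awk '$1 == "cephcsi" && $NF == "(LISTEN)" {
         # The address field looks like 10.0.137.45:9091; keep the last part.
         n = split($(NF - 1), parts, ":")
         print parts[n]
       }' | sort -n -u
}
```

On a worker node this could be run as `lsof -i -P -n | cephcsi_listen_ports`. For the output above it would print 9080, 9081, 9090 and 9091, one per line.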

----------------------------------------------------------------------------------------------------------------------------------
Verification steps performed on an existing cluster:


1. Edit the rook-ceph-operator-config ConfigMap and add the CSI_CEPHFS_GRPC_METRICS_PORT parameter under 'data'. The parameter is not present in rook-ceph-operator-config by default, in which case the port is 9091 (see --metricsport=9091 in the csi-cephfsplugin pod describe output above). Set it to a different port that is not in use.

data:
  CSI_CEPHFS_GRPC_METRICS_PORT: "9062"
  
2. Wait for the csi-cephfsplugin and csi-cephfsplugin-provisioner pods to be recreated.

3. Run oc describe on the csi-cephfsplugin and csi-cephfsplugin-provisioner pods and check the value of metricsport in the csi-cephfsplugin container args. The port should be updated to 9062, the value set in step 1. This shows that the port is configurable.
  
  csi-cephfsplugin:
    Container ID:  cri-o://9da599079de3782ffd5e140e9d2dd0d6a735b2b5f74f1097c07e1b6f32c3e723
    Image:         quay.io/rhceph-dev/cephcsi@sha256:d909420cf801be463e7aaaa95217cd90011d0003c021161d2eae1e640935b8b1
    Image ID:      quay.io/rhceph-dev/cephcsi@sha256:17ed0d09bddaed5f0368a9200960b9531aac1edee3c9ac318dda798279459a5f
    Port:          <none>
    Host Port:     <none>
    Args:
      --nodeid=$(NODE_ID)
      --type=cephfs
      --endpoint=$(CSI_ENDPOINT)
      --v=0
      --controllerserver=true
      --drivername=openshift-storage.cephfs.csi.ceph.com
      --metadatastorage=k8s_configmap
      --pidlimit=-1
      --metricsport=9062
      --forcecephkernelclient=true
      --metricspath=/metrics
      --enablegrpcmetrics=true

4. Check the ports on the worker nodes. Port 9091 has been replaced by 9062.
  # lsof -i -P -n | grep cephcsi
cephcsi    36475     root    7u  IPv4  219549      0t0  TCP 10.0.137.45:9090 (LISTEN)
cephcsi    36668     root    3u  IPv4  223434      0t0  TCP 10.0.137.45:9080 (LISTEN)
cephcsi   791774     root    5u  IPv4 3629877      0t0  TCP 10.0.137.45:9062 (LISTEN)
cephcsi   791863     root    3u  IPv4 3643614      0t0  TCP 10.0.137.45:9081 (LISTEN)




I will test this in a fresh installation and update here.
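
The manual edit in step 1 could also be applied non-interactively with `oc patch`. The following is a sketch only and is not the procedure used in this verification. The function names port_patch and set_csi_port are hypothetical, and the sketch assumes that `oc` is logged in and that the ConfigMap lives in the openshift-storage namespace.

```shell
# Hypothetical helper: build a JSON merge-patch body that sets one key
# under 'data'. The port is emitted as a string, matching the quoted
# value used in step 1.
port_patch() {
  printf '{"data":{"%s":"%s"}}\n' "$1" "$2"
}

# Hypothetical wrapper: apply the patch to rook-ceph-operator-config.
# Assumption: the ConfigMap is in the openshift-storage namespace.
set_csi_port() {
  oc patch configmap rook-ceph-operator-config -n openshift-storage \
    --type merge -p "$(port_patch "$1" "$2")"
}
```

With these definitions, `set_csi_port CSI_CEPHFS_GRPC_METRICS_PORT 9062` would be the equivalent of the edit in step 1. A merge patch only touches the named key, so the existing toleration entries in the ConfigMap are left as they are.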

Comment 16 Jilju Joy 2020-07-09 12:45:51 UTC
Hi Umanga,

Please update the doc text to document the steps to be performed on an existing cluster and in a fresh installation.

Comment 17 umanga 2020-07-14 12:38:35 UTC
GitHub issue comment with context on how to use it: https://github.com/openshift/ocs-operator/issues/451#issuecomment-638685472

Comment 18 Jilju Joy 2020-07-17 08:01:01 UTC
Tested in 

OpenShift Container Storage   4.5.0-493.ci
Cluster version is 4.5.0-0.nightly-2020-07-17-014709
AWS platform.

Test: Change CSI_CEPHFS_GRPC_METRICS_PORT before installing the OCS storage cluster when port 9091 is known to be in use by another application
-----------------------------------------------------------------------------------------------------------------------------------------------------------

Before installing storage cluster

$ oc get csv
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.5.0-493.ci   OpenShift Container Storage   4.5.0-493.ci              Succeeded
 
$ oc get configmap rook-ceph-operator-config -o yaml
apiVersion: v1
data:
  CSI_PLUGIN_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-17T06:55:43Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:CSI_PLUGIN_TOLERATIONS: {}
        f:CSI_PROVISIONER_TOLERATIONS: {}
    manager: ocs-operator
    operation: Update
    time: "2020-07-17T06:55:43Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "37945"
  selfLink: /api/v1/namespaces/openshift-storage/configmaps/rook-ceph-operator-config
  uid: 58be63bc-4e71-49bb-9b70-56c4353bd861
 

Step 1: Edit the rook-ceph-operator-config ConfigMap and add the CSI_CEPHFS_GRPC_METRICS_PORT parameter under 'data'. The parameter is not present in rook-ceph-operator-config by default.
data:
  CSI_CEPHFS_GRPC_METRICS_PORT: "9061"


Step 2: Verify the value is present in rook-ceph-operator-config yaml.

$ oc get configmap rook-ceph-operator-config -o yaml
apiVersion: v1
data:
  CSI_CEPHFS_GRPC_METRICS_PORT: "9061"
  CSI_PLUGIN_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-17T06:55:43Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:CSI_PLUGIN_TOLERATIONS: {}
        f:CSI_PROVISIONER_TOLERATIONS: {}
    manager: ocs-operator
    operation: Update
    time: "2020-07-17T06:55:43Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        f:CSI_CEPHFS_GRPC_METRICS_PORT: {}
    manager: oc
    operation: Update
    time: "2020-07-17T07:04:41Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "41660"
  selfLink: /api/v1/namespaces/openshift-storage/configmaps/rook-ceph-operator-config
  uid: 58be63bc-4e71-49bb-9b70-56c4353bd861
  
  
Step 3: Create the OCS storage cluster.
  
  
Step 4: Run oc describe on the csi-cephfsplugin and csi-cephfsplugin-provisioner pods and check the value of metricsport in the csi-cephfsplugin container args. The port should be 9061, the value set in step 1. This shows that the port is configurable.
  
  
  csi-cephfsplugin:
    Container ID:  cri-o://c5f6fbd04067f400c1475dad850a789861a4e121735cbadf0180587aea65cece
    Image:         quay.io/rhceph-dev/cephcsi@sha256:b4e7caf299762bd78f40f174a166d0d8399eef00593e6afcb9696b241cd3ceb0
    Image ID:      quay.io/rhceph-dev/cephcsi@sha256:241b67c2f2b3fe347a75e745a074d4723f6fead3631ebd560ab85d604a26d321
    Port:          <none>
    Host Port:     <none>
    Args:
      --nodeid=$(NODE_ID)
      --type=cephfs
      --endpoint=$(CSI_ENDPOINT)
      --v=0
      --nodeserver=true
      --drivername=openshift-storage.cephfs.csi.ceph.com
      --metadatastorage=k8s_configmap
      --mountcachedir=/mount-cache-dir
      --pidlimit=-1
      --metricsport=9061
      --forcecephkernelclient=true
      --metricspath=/metrics
      --enablegrpcmetrics=true
    State:          Running

  
  
Step 5: Check the port on the worker nodes. Port 9061 should be listening.
  # lsof -i -P -n | grep cephcsi | grep 9061
  cephcsi   305192     root    5u  IPv4 1341096      0t0  TCP 10.0.131.15:9061 (LISTEN)
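
The check in step 4 can be scripted by pulling the --metricsport argument out of the describe output. This is a rough sketch under the assumption that the output is laid out as shown above, with one argument per line. The function names describe_metricsport and metricsport_matches are hypothetical.

```shell
# Hypothetical helper: print the value of the first --metricsport argument
# found in `oc describe pod` output supplied on stdin.
describe_metricsport() {
  sed -n 's/^[[:space:]]*--metricsport=\([0-9][0-9]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Hypothetical check: succeed only if the port found on stdin equals the
# expected port given as the first argument.
metricsport_matches() {
  [ "$(describe_metricsport)" = "$1" ]
}
```

A call such as `oc describe pod <pod-name> -n openshift-storage | metricsport_matches 9061` would then succeed or fail depending on the configured port. The helper takes the first match only, so if the describe output contains more than one --metricsport argument, the input should first be narrowed to the csi-cephfsplugin container section.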

Comment 19 Jilju Joy 2020-07-17 08:16:26 UTC
Hi Umanga,

Because the fix for this bug makes all of the ports below configurable, I think we can mention that in the doc text. This will help when any of the default metrics ports listed below is in use by another application.

CSI_CEPHFS_GRPC_METRICS_PORT: "9091"
CSI_CEPHFS_LIVENESS_METRICS_PORT: "9081"
CSI_RBD_GRPC_METRICS_PORT: "9090"
CSI_RBD_LIVENESS_METRICS_PORT: "9080"


Tested in 4.5.0-493.

$ oc get configmap rook-ceph-operator-config -o yaml
apiVersion: v1
data:
  CSI_CEPHFS_GRPC_METRICS_PORT: "9061"
  CSI_CEPHFS_LIVENESS_METRICS_PORT: "9050"
  CSI_PLUGIN_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_PROVISIONER_TOLERATIONS: |2-

    - key: node.ocs.openshift.io/storage
      operator: Equal
      value: "true"
      effect: NoSchedule
  CSI_RBD_GRPC_METRICS_PORT: "9041"
  CSI_RBD_LIVENESS_METRICS_PORT: "9030"
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-17T06:55:43Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:CSI_PLUGIN_TOLERATIONS: {}
        f:CSI_PROVISIONER_TOLERATIONS: {}
    manager: ocs-operator
    operation: Update
    time: "2020-07-17T06:55:43Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        f:CSI_CEPHFS_GRPC_METRICS_PORT: {}
        f:CSI_CEPHFS_LIVENESS_METRICS_PORT: {}
        f:CSI_RBD_GRPC_METRICS_PORT: {}
        f:CSI_RBD_LIVENESS_METRICS_PORT: {}
    manager: oc
    operation: Update
    time: "2020-07-17T07:09:31Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "43589"
  selfLink: /api/v1/namespaces/openshift-storage/configmaps/rook-ceph-operator-config
  uid: 58be63bc-4e71-49bb-9b70-56c4353bd861





# lsof -i -P -n | grep cephcsi
cephcsi   305192     root    5u  IPv4 1341096      0t0  TCP 10.0.131.15:9061 (LISTEN)
cephcsi   305201     root    6u  IPv4 1342019      0t0  TCP 10.0.131.15:9041 (LISTEN)
cephcsi   305402     root    3u  IPv4 1352597      0t0  TCP 10.0.131.15:9030 (LISTEN)
cephcsi   305403     root    3u  IPv4 1343862      0t0  TCP 10.0.131.15:9050 (LISTEN)
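
When all four ports are changed together, the values can be validated and combined into a single merge patch. This is a minimal sketch; the function names is_valid_port and csi_ports_patch are hypothetical, and the 1-65535 range is the general TCP port range rather than a limit documented for OCS.

```shell
# Hypothetical helper: succeed if the argument is a decimal TCP port
# in the range 1-65535.
is_valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Hypothetical helper: build one JSON merge patch that sets all four CSI
# metrics ports. Arguments, in order:
#   CephFS gRPC, CephFS liveness, RBD gRPC, RBD liveness.
csi_ports_patch() {
  [ "$#" -eq 4 ] || { echo "expected 4 ports" >&2; return 1; }
  for p in "$@"; do
    is_valid_port "$p" || { echo "invalid port: $p" >&2; return 1; }
  done
  printf '{"data":{'
  printf '"CSI_CEPHFS_GRPC_METRICS_PORT":"%s",' "$1"
  printf '"CSI_CEPHFS_LIVENESS_METRICS_PORT":"%s",' "$2"
  printf '"CSI_RBD_GRPC_METRICS_PORT":"%s",' "$3"
  printf '"CSI_RBD_LIVENESS_METRICS_PORT":"%s"' "$4"
  printf '}}\n'
}
```

The values tested above would correspond to `csi_ports_patch 9061 9050 9041 9030`, and the result could be passed to `oc patch configmap rook-ceph-operator-config -n openshift-storage --type merge -p "$(csi_ports_patch 9061 9050 9041 9030)"`. The sketch does not check whether a port is already in use on the nodes; that still has to be confirmed separately, for example with lsof as above.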

Comment 20 Jilju Joy 2020-07-19 14:58:37 UTC
Based on #comment16 and #comment18 , moving this bug to verified state.

Comment 21 Jilju Joy 2020-07-19 14:59:41 UTC
(In reply to Jilju Joy from comment #20)
> Based on #comment16 and #comment18 , moving this bug to verified state.

Correction : Based on #comment15 and #comment18 , moving this bug to verified state.

Comment 23 errata-xmlrpc 2020-09-15 10:16:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3754

