Bug 2212773
| Summary: | OCS Provider Server service comes up on public subnets | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Rewant <resoni> | |
| Component: | ocs-operator | Assignee: | Rewant <resoni> | |
| Status: | MODIFIED --- | QA Contact: | Elad <ebenahar> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.13 | CC: | ikave, jijoy, muagarwa, nigoyal, odf-bz-bot | |
| Target Milestone: | --- | Keywords: | AutomationBackLog | |
| Target Release: | ODF 4.14.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | 4.14.0-60 | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2213114 2213117 2218863 2218867 (view as bug list) | Environment: | ||
| Last Closed: | Type: | Bug | ||
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2213114, 2213117, 2218863, 2218867 | |||
|
Description
Rewant
2023-06-06 10:12:33 UTC
The ocs-operator should not be responsible for adding annotations based on the cloud provider. Instead, the load balancer should be created externally, based on the cloud provider and the type of network (private/public). Hence, a new field is added to the StorageCluster CR to toggle the service type.
Giving devel ack on Rewant's request.
The PR solves the issue we were facing with managed service by adding a new field to the StorageCluster CR that toggles the ServiceType between NodePort and LoadBalancer. The default is NodePort when the field is not set. We need to create provider clusters and verify that the provider server comes up with a LoadBalancer or NodePort service, depending on the value set in the StorageCluster. For the backports, we also need to test that when the operator is updated from 4.10 to 4.11, the ocs provider service keeps the same type.
I tested the BZ with the following steps:
1. Deploy an AWS 4.14 cluster without ODF.
2. Disable the default Red Hat operators source:
$ oc patch operatorhub.config.openshift.io/cluster -p='{"spec":{"sources":[{"disabled":true,"name":"redhat-operators"}]}}' --type=merge
3. Get and apply the ICSP from the catalog image using the following commands (on my local machine):
$ oc image extract --filter-by-os linux/amd64 --registry-config /home/ikave/IBMProjects/ocs-ci/data/pull-secret quay.io/rhceph-dev/ocs-registry:4.14.0-67 --confirm --path /icsp.yaml:/home/ikave/IBMProjects/ocs-ci/iscp
$ oc apply -f ~/IBMProjects/ocs-ci/iscp/icsp.yaml
5. Wait for the MachineConfigPool to be ready.
$ oc get MachineConfigPool worker
6. Create the Namespace, CatalogSource, and Subscription using the Yaml file above: https://bugzilla.redhat.com/show_bug.cgi?id=2212773#c7.
$ oc apply -f ~/Downloads/deploy-with-olm.yaml
7. Wait until the ocs-operator pod is ready in the openshift-storage namespace.
8. Create the StorageCluster using the Yaml file above: https://bugzilla.redhat.com/show_bug.cgi?id=2212773#c8.
(If there is an issue with the NooBaa CRDs, we may also need to apply this Yaml file: https://raw.githubusercontent.com/red-hat-storage/mcg-osd-deployer/1eec1147b1ae70e938fa42dabc60453b8cd9449b/shim/crds/noobaa.noobaa.io.yaml). You can see that there is a new field "providerAPIServerServiceType: LoadBalancer".
9. Check that we can see the correct service type of LoadBalancer:
$ oc get service
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP                                                              PORT(S)             AGE
ocs-metrics-exporter   ClusterIP      172.30.155.99    <none>                                                                   8080/TCP,8081/TCP   16m
ocs-provider-server    LoadBalancer   172.30.193.179   a3e18905048ed4a9587ec6f3e0975705-962669330.us-east-2.elb.amazonaws.com   50051:31659/TCP     16m
rook-ceph-mgr          ClusterIP      172.30.183.31    <none>                                                                   9283/TCP            4m38s
10. Edit the ocs-storagecluster and change the value of "providerAPIServerServiceType" to NodePort. Check the service type again:
$ oc get service
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
ocs-metrics-exporter   ClusterIP   172.30.155.99    <none>        8080/TCP,8081/TCP   24m
ocs-provider-server    NodePort    172.30.193.179   <none>        50051:31659/TCP     24m
rook-ceph-mgr          ClusterIP   172.30.183.31    <none>        9283/TCP            12m
11. Edit the ocs-storagecluster and change the value of "providerAPIServerServiceType" to a dummy value "foo". Check the service type again:
$ oc get service
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
ocs-metrics-exporter   ClusterIP   172.30.155.99    <none>        8080/TCP,8081/TCP   24m
ocs-provider-server    NodePort    172.30.193.179   <none>        50051:31659/TCP     24m
rook-ceph-mgr          ClusterIP   172.30.183.31    <none>        9283/TCP            12m
12. Check the ocs-operator logs and see the expected error:
{"level":"error","ts":"2023-07-17T13:00:08Z","msg":"Reconciler error","controller":"storagecluster","controllerGroup":"ocs.openshift.io","controllerKind":"StorageCluster","StorageCluster":{"name":"ocs-storagecluster","namespace":"openshift-storage"},"namespace":"openshift-storage","name":"ocs-storagecluster","reconcileID":"682f4750-9382-479f-8ebc-09a30152411d","error":"providerAPIServer only supports service of type NodePort and LoadBalancer"
13. The last check I performed was that the default value is NodePort. I deployed the StorageCluster Yaml file above without the field "providerAPIServerServiceType" and checked the service again:
$ oc get service
NAME                  TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
ocs-provider-server   NodePort   172.30.110.82   <none>        50051:31659/TCP   10s
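The behaviour observed in steps 9 to 13 can be summarised in a small shell sketch. This is not the operator's code: the function names resolve_provider_service_type and build_service_type_patch are hypothetical, and placing the field directly under spec of the StorageCluster is an assumption. Only the field name, the two accepted values, the NodePort default, and the error message come from the steps above.

```shell
# Hypothetical helper (not part of ocs-operator): mirrors the behaviour
# observed in steps 9 to 13 for the providerAPIServerServiceType field.
resolve_provider_service_type() {
  case "${1:-}" in
    "" | NodePort)
      # Field unset or NodePort: NodePort is the default seen in step 13.
      echo "NodePort"
      ;;
    LoadBalancer)
      echo "LoadBalancer"
      ;;
    *)
      # Message copied from the operator log in step 12.
      echo "providerAPIServer only supports service of type NodePort and LoadBalancer" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: builds a JSON merge patch that sets the field.
# Assumption: the field sits directly under .spec of the StorageCluster.
build_service_type_patch() {
  printf '{"spec":{"providerAPIServerServiceType":"%s"}}' "$1"
}

# Possible usage on a live cluster (name and namespace taken from the log
# in step 12); left commented out because it needs cluster access:
# oc patch storagecluster ocs-storagecluster -n openshift-storage \
#   --type merge -p "$(build_service_type_patch NodePort)"
```

A merge patch like this would be an alternative to editing the ocs-storagecluster by hand in steps 10 and 11.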
Additional info:
Versions:
OC version:
Client Version: 4.10.24
Server Version: 4.14.0-0.nightly-2023-07-17-215017
Kubernetes Version: v1.27.3+4aaeaec
OCS version:
ocs-operator.v4.14.0-67.stable OpenShift Container Storage 4.14.0-67.stable Succeeded
Cluster version
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-0.nightly-2023-07-17-215017   True        False         71m     Cluster version is 4.14.0-0.nightly-2023-07-17-215017
Link to the Jenkins slave: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/27017/
According to the comment above, I am moving the BZ to Verified.
We found an issue with the new OCS 4.11.10-1 image. In the case of a NodePort service, the ocs-operator pod keeps requiring. We need to fix this on the ocs-operator side, backport the fix, and test it again. |
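For the upgrade check described at the top of this report (the provider service should keep its type across an operator update), one way to compare the type before and after is to read the TYPE column from the oc get service output. The helper name service_type_of is hypothetical, and the sample rows are copied from the outputs above rather than taken from an upgraded cluster.

```shell
# Hypothetical helper: prints the TYPE column (second field) for one
# service, reading `oc get service` table output from stdin.
service_type_of() {
  awk -v svc="$1" '$1 == svc { print $2 }'
}

# Sample rows copied from the outputs in this report. On a live cluster the
# input would come from `oc get service` before and after the update.
before='ocs-provider-server NodePort 172.30.193.179 <none> 50051:31659/TCP 24m'
after='ocs-provider-server NodePort 172.30.110.82 <none> 50051:31659/TCP 10s'

type_before=$(printf '%s\n' "$before" | service_type_of ocs-provider-server)
type_after=$(printf '%s\n' "$after" | service_type_of ocs-provider-server)

# Report whether the service type survived the (simulated) update.
if [ "$type_before" = "$type_after" ]; then
  echo "service type unchanged: $type_after"
else
  echo "service type changed: $type_before -> $type_after"
fi
```

Comparing only the TYPE column keeps the check independent of the cluster IP and age, which are expected to differ between deployments.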