Bug 2211594

Summary: [ODF 4.11] [GSS] unknown parameter name "FORCE_OSD_REMOVAL"
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: ocs-operator
Reporter: Malay Kumar parida <mparida>
Assignee: Malay Kumar parida <mparida>
QA Contact: Itzhak <ikave>
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Version: 4.11
Target Release: ODF 4.11.9
Fixed In Version: 4.11.9-2
Doc Type: No Doc Update
Keywords: Automation
CC: kramdoss, ocs-bugs, odf-bz-bot, vavuthu
Hardware: Unspecified
OS: Unspecified
Last Closed: 2023-07-20 16:12:43 UTC

Description Malay Kumar parida 2023-06-01 07:50:26 UTC
This bug was initially created as a copy of Bug #2143944

I am copying this bug because: 



Description of problem (please be as detailed as possible and provide log
snippets):

When the customer tries to replace an OSD, the command fails with this error:

$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=true |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create
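
The error indicates that the installed template predates the FORCE_OSD_REMOVAL parameter. As a quick diagnostic (a sketch relying only on the standard oc process --parameters flag), the template's parameter list can be inspected directly; on an affected cluster, FORCE_OSD_REMOVAL is absent from the output:

$ oc process -n openshift-storage ocs-osd-removal --parameters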


Version of all relevant components (if applicable):

ODF 4.9 and ODF 4.10

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

No

Is there any workaround available to the best of your knowledge?

Yes. Deleting the ocs-osd-removal template forces the operator to reconcile it, and the FORCE_OSD_REMOVAL option appears.
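
A minimal sketch of the workaround (assuming ocs-operator recreates the template on its next reconcile; the grep check is only illustrative):

$ oc delete template ocs-osd-removal -n openshift-storage
# after ocs-operator reconciles, confirm the parameter is present
$ oc process -n openshift-storage ocs-osd-removal --parameters | grep FORCE_OSD_REMOVAL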


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

2


Is this issue reproducible?

Yes. Install an ODF version earlier than 4.9.11 and upgrade it; the template is not updated during the upgrade, so the option is never added to it.

Can this issue be reproduced from the UI?

Steps to Reproduce:
1. Install ODF at a version earlier than 4.9.11.
2. Upgrade through the subsequent releases.
3. Try to replace an OSD, or review the template, on a version later than 4.9.11 (see the check below).
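
One way to review the template after the upgrade (a sketch using standard oc JSONPath output; the expected parameter name comes from the error above):

$ oc get template ocs-osd-removal -n openshift-storage -o jsonpath='{.parameters[*].name}{"\n"}'
# on a cluster originally installed before 4.9.11, FORCE_OSD_REMOVAL is missing from this list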


Actual results:

The template does not have the FORCE_OSD_REMOVAL parameter, so the command fails.


Expected results:

The command succeeds and the ocs-osd-removal-job is created.


Additional info:

Comment 9 Itzhak 2023-07-19 15:32:01 UTC
I tested the BZ on a vSphere cluster with OCP 4.10 and ODF 4.9.10 (lower than 4.9.11).

I performed the following steps:

1. Checked the ocs-osd-removal job command, which resulted in the expected error:
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create

2. Upgraded ODF from 4.9 to 4.10.
3. Checked the ocs-osd-removal job command again, which showed the expected output:
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
job.batch/ocs-osd-removal-job created
$ oc get jobs ocs-osd-removal-job 
NAME                  COMPLETIONS   DURATION   AGE
ocs-osd-removal-job   1/1           32s        136m

4. Upgraded OCP from 4.10 to 4.11.
5. Upgraded ODF from 4.10 to 4.11.

6. Checked the ocs-osd-removal job command again, which showed the expected output:
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "operator" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "operator" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "operator" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "operator" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
job.batch/ocs-osd-removal-job created
$ oc get jobs ocs-osd-removal-job 
NAME                  COMPLETIONS   DURATION   AGE
ocs-osd-removal-job   1/1           7s         22s


Additional info: 

Versions:

OC version:
Client Version: 4.10.24
Server Version: 4.11.0-0.nightly-2023-07-17-215640
Kubernetes Version: v1.24.15+990d55b

OCS version:
ocs-operator.v4.11.9              OpenShift Container Storage   4.11.9    ocs-operator.v4.10.14              Succeeded

Cluster version
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2023-07-17-215640   True        False         130m    Cluster version is 4.11.0-0.nightly-2023-07-17-215640

Rook version:
rook: v4.11.9-0.6934e4e22735898ae2286d4b4623b80966c1bd8c
go: go1.17.12

Ceph version:
ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)


Link to the Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/27056/

Comment 13 errata-xmlrpc 2023-07-20 16:12:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.11.9 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:4238