Bug 2143944 - [GSS] unknown parameter name "FORCE_OSD_REMOVAL"
Summary: [GSS] unknown parameter name "FORCE_OSD_REMOVAL"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ODF 4.13.0
Assignee: Malay Kumar parida
QA Contact: Itzhak
URL:
Whiteboard:
Depends On:
Blocks: 2211592 2211595
 
Reported: 2022-11-18 12:58 UTC by amansan
Modified: 2023-12-08 04:31 UTC
CC List: 7 users

Fixed In Version: 4.13.0-214
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2211592 2211595
Environment:
Last Closed: 2023-06-21 15:22:18 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage ocs-operator pull 1959 0 None open Reconcile the job templates so it is in sync with the latest changes 2023-03-14 16:26:07 UTC
Github red-hat-storage ocs-operator pull 1962 0 None open Bug 2143944:[release-4.13] Reconcile the job templates so it is in sync with the latest changes 2023-03-15 13:15:15 UTC
Github red-hat-storage ocs-operator pull 2067 0 None Merged Always update the template parameters & objects to keep it up-to-date 2023-06-01 06:27:47 UTC
Github red-hat-storage ocs-operator pull 2068 0 None open Bug 2143944:[release-4.13] Always update the template parameters & objects to keep it up-to-date 2023-06-01 06:28:05 UTC
Red Hat Product Errata RHBA-2023:3742 0 None None None 2023-06-21 15:22:43 UTC

Comment 5 Malay Kumar parida 2022-11-22 13:09:43 UTC
I think we should keep just this one bug, as I also believe the root cause is the same for them.

Comment 13 Malay Kumar parida 2023-02-14 05:20:12 UTC
Hi Alicia, I am running a little busy with the feature development cycle for 4.13, as just a couple of weeks are left. But I can assure you this bug is on my radar; I have already investigated the root cause somewhat, and I expect to look at it more deeply after the 4.13 feature freeze on Feb 28. If there is a customer dependency or someone waiting on this issue, please let me know and I can move things around to give it prioritized attention.

Comment 21 Malay Kumar parida 2023-04-17 04:56:56 UTC
Hi Alicia, Basically, earlier the template was created once and never updated afterwards, which was causing the problem.
For example, if someone installs ODF 4.10, the template is created at that time with a rook-ceph-image in the template's job spec. Later the customer upgrades ODF from 4.10 to 4.11, 4.11 to 4.12, and so on. But because the template was not reconciled, the rook-ceph-image in the template's job spec remained the old one (the 4.10 image in this case), even though the cluster is now on a newer version of ODF such as 4.12.

With this fix the template gets reconciled, so the rook-ceph image in the template's job spec will always be the correct one.
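For illustration, one way to see which rook-ceph image the template's job spec currently references is a jsonpath query like the one below. This is only a sketch: it assumes the template is named ocs-osd-removal in the openshift-storage namespace (as in the commands elsewhere in this bug) and that the removal job is the first entry in the template's objects list.

$ oc get template ocs-osd-removal -n openshift-storage -o jsonpath='{.objects[0].spec.template.spec.containers[0].image}'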

Comment 24 Itzhak 2023-05-29 11:22:14 UTC
What should the updated steps be? Should we try to upgrade from 4.12 to 4.13, or just deploy a cluster with 4.13 and execute the command:
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=true |oc create -n openshift-storage -f -

Comment 25 Malay Kumar parida 2023-06-01 08:07:29 UTC
I found that the earlier merged patch was incomplete, so I had moved the BZ back to POST to merge another fix for it. That fix is now merged. I will also be backporting the fix all the way back to 4.9.

I have created clone BZs:
4.12- https://bugzilla.redhat.com/show_bug.cgi?id=2211592
4.11- https://bugzilla.redhat.com/show_bug.cgi?id=2211594
4.10- https://bugzilla.redhat.com/show_bug.cgi?id=2211595
4.9- https://bugzilla.redhat.com/show_bug.cgi?id=2211598

Comment 26 Malay Kumar parida 2023-06-01 10:28:41 UTC
Verification steps for the BZ-
Install ODF at a version earlier than 4.9.11, and check the created templates for the rook-ceph-image and the parameters they contain.

Now upgrade through the ODF releases: first to the latest version in ODF 4.9, then to 4.10, 4.11, and 4.12.
At each version, try processing the template:
oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=true |oc create -n openshift-storage -f -
The mentioned error, unknown parameter name "FORCE_OSD_REMOVAL", will still occur.
Each time, also check the template YAML (an example command is shown below); nothing will have changed in the objects section or the parameters section.
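To inspect the parameters at each step, a quick check such as the following should work. The jsonpath expression is an assumption based on the Template API structure; on the affected versions, FORCE_OSD_REMOVAL is expected to be missing from the output.

$ oc get template ocs-osd-removal -n openshift-storage -o jsonpath='{.parameters[*].name}'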

Now upgrade to ODF 4.13.
If you now run
oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} -p FORCE_OSD_REMOVAL=true |oc create -n openshift-storage -f -
It should succeed without any error.
If you also check the templates now, they should have been updated: the latest rook-ceph image will be there, and the new parameter FORCE_OSD_REMOVAL will now appear in the parameters section.
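To confirm this after the upgrade, a quick look at the template YAML could look like the command below. The grep pattern is just an illustration; it pulls out the image references and any mention of the new parameter name.

$ oc get template ocs-osd-removal -n openshift-storage -o yaml | grep -E 'FORCE_OSD_REMOVAL|image:'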

Comment 27 Itzhak 2023-06-19 09:56:41 UTC
I followed the process described in the comment above.

1. I deployed a cluster with OCP 4.9 and ODF 4.9.10 (lower than 4.9.11).

2. I checked the ocs osd removal job command, which resulted in the expected error: 
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create

3. Upgraded OCP and ODF from 4.9 to 4.10.
4. Checked the ocs osd removal job command, which resulted in the expected error: 
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create

5. Upgraded OCP and ODF from 4.10 to 4.11.
6. Checked the ocs osd removal job command, which resulted in the expected error: 
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create

7. Upgraded OCP and ODF from 4.11 to 4.12.
8. Checked the ocs osd removal job command, which resulted in the expected error: 
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
error: unknown parameter name "FORCE_OSD_REMOVAL"
error: no objects passed to create

9. Upgraded OCP and ODF from 4.12 to 4.13.
10. Checked the ocs osd removal job command, which now succeeded: 
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=0 -p FORCE_OSD_REMOVAL=false |oc create -n openshift-storage -f -
job.batch/ocs-osd-removal-job created

$ oc get jobs ocs-osd-removal-job 
NAME                  COMPLETIONS   DURATION   AGE
ocs-osd-removal-job   1/1           8s         21m

Comment 28 Itzhak 2023-06-19 09:58:33 UTC
Additional info: 

Link to the Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/25639/.

Cluster versions after the last upgrade:

OC version:
Client Version: 4.10.24
Server Version: 4.13.0-0.nightly-2023-06-15-222927
Kubernetes Version: v1.26.5+7d22122

OCS version:
ocs-operator.v4.13.0-rhodf              OpenShift Container Storage   4.13.0-rhodf   ocs-operator.v4.12.4-rhodf              Succeeded

Cluster version:
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-06-15-222927   True        False         43m     Cluster version is 4.13.0-0.nightly-2023-06-15-222927

Rook version:
rook: v4.13.0-0.b57f0c7db8116e754fc77b55825d7fd75c6f1aa3
go: go1.19.9

Ceph version:
ceph version 17.2.6-70.el9cp (fe62dcdbb2c6e05782a3e2b67d025b84ff5047cc) quincy (stable)

Comment 30 Itzhak 2023-06-19 10:02:18 UTC
According to the comments above, I am moving the BZ to Verified.

Comment 32 errata-xmlrpc 2023-06-21 15:22:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742

Comment 33 Red Hat Bugzilla 2023-12-08 04:31:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

