Bug 2129763
| Summary: | [cee/sd][ceph-ansible][RFE] Additional pre-check required prior to removing legacy RGW daemons in the cephadm-adopt playbook | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Prasanth M V <pmv> |
| Component: | Ceph-Ansible | Assignee: | Teoman ONAY <tonay> |
| Status: | CLOSED WONTFIX | QA Contact: | Aditya Ramteke <aramteke> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 5.2 | CC: | aschoen, ceph-eng-bugs, gmeno, msaini, nthomas, sostapov |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | 5.3z5 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-08-02 13:33:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Missed the 5.3 z1 window. Moving to 6.1. Please advise if this is a problem. Missed the 5.3 z4 deadline. Moving from z4 to z5.
Description of problem:

While adopting RGW daemons to cephadm by running the "cephadm-adopt" playbook in the RHCS 5 upgrade process:

- The first ansible PLAY executed in the playbook is "[redeploy rgw daemons]".
- In this play, the new RGW daemons are created by the TASK "[update the placement of radosgw hosts]".
- The next ansible PLAY is "[stop and remove legacy ceph rgw daemons]".
- In this PLAY the legacy RGW daemons are removed by the following TASKs in the playbook:
  - TASK [stop and disable ceph-radosgw systemd service] (1st task)
  - TASK [stop and disable ceph-radosgw systemd target] (2nd task)
  - TASK [reset failed ceph-radosgw systemd unit] (3rd task)
  - TASK [remove ceph-radosgw systemd files] (4th task)
  - TASK [remove legacy ceph radosgw data] (5th task)
  - TASK [remove legacy ceph radosgw directory] (6th task)
- This removal procedure is required for the adoption of RGW daemons to cephadm: the ports must be freed for the deployment of the new RGW daemons, otherwise the deployment of the new RGW daemons fails with an error that the port is already occupied.
- However, it is also important to check that the new RGW daemons are deployed and managed by cephadm before the legacy daemons are removed completely (in the 4th/5th/6th tasks).
- In a scenario where the new RGW daemons were not deployed, or are not managed by cephadm for any reason, and the legacy RGW daemons are then removed completely in the next steps (4th/5th/6th tasks), there is a chance of production impact because no RGW daemons (either legacy or new) are left running in the cluster.
- Considering that the legacy daemon must be stopped to release the port for the new RGW daemon, I would suggest adding a step/task to check whether the new RGW daemons are deployed before removing the legacy RGW daemons completely.
- My suggestion is to stop and disable the systemd service/target first (1st/2nd tasks), which frees the ports for the newly created RGW daemons; after those tasks, add a step/task to verify that the new RGW daemons are deployed and managed by cephadm (i.e. that the new RGWs are up and running). Once that is confirmed, the remaining removal tasks for the legacy daemons can proceed (see the task sketch at the end of this description).

Version-Release number of selected component (if applicable):
- Red Hat Ceph Storage 5.2
- ceph-ansible-6.0.27.9-1.el8cp.noarch
- cephadm-16.2.8-85.el8cp.noarch

How reproducible:
- Not applicable
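To illustrate the kind of guard being requested, the task below waits until cephadm reports at least one running RGW daemon before the playbook continues to the destructive removal tasks. This is only a minimal sketch, not the actual cephadm-adopt implementation: the task name, the `mons` group reference, the retry/delay values, and the assumption that the `ceph` CLI and the `daemon_type`/`status_desc` fields of `ceph orch ps --format json` are available on the first monitor host are all illustrative.

```yaml
# Hypothetical pre-check: block until cephadm reports at least one running
# RGW daemon, so that the later "remove legacy ceph radosgw data/directory"
# tasks only run once the new daemons are actually up.
- name: wait for cephadm managed rgw daemons to be running
  ansible.builtin.command: ceph orch ps --format json
  register: orch_ps_rgw
  delegate_to: "{{ groups['mons'][0] }}"  # assumes ceph-ansible's default 'mons' group
  run_once: true
  changed_when: false
  retries: 30
  delay: 10
  until: >
    (orch_ps_rgw.stdout | from_json
     | selectattr('daemon_type', 'equalto', 'rgw')
     | selectattr('status_desc', 'equalto', 'running')
     | list | length) > 0
```

If the new daemons never come up, the task fails once the retries are exhausted and the playbook stops before the 4th/5th/6th removal tasks run, leaving the legacy data and systemd files in place for manual recovery, which is the safety net requested above.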