Description of problem:

During the upgrade of OSP with Ceph from 16.0 to 16.1 in a DCN environment, the Ceph upgrade fails with the error below:

~~~
"fatal: [rhosp-con1 -> 172.10.10.101]: FAILED! => {\"changed\": true, \"cmd\": [\"podman\", \"exec\", \"ceph-mon-rhosp-con1\", \"ceph\", \"osd\", \"require-osd-release\", \"nautilus\"], \"delta\": \"0:00:00.435547\", \"end\": \"2020-09-06 15:51:37.055340\", \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2020-09-06 15:51:36.619793\", \"stderr\": \"Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)\\nError: non zero exit code: 1: OCI runtime error\", \"stderr_lines\": [\"Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)\", \"Error: non zero exit code: 1: OCI runtime error\"], \"stdout\": \"\", \"stdout_lines\": []}",
"NO MORE HOSTS LEFT *************************************************************",
"PLAY RECAP *********************************************************************",
"localhost  : ok=1   changed=0  unreachable=0 failed=0 skipped=1   rescued=0 ignored=0",
"rhosp-con1 : ok=295 changed=30 unreachable=0 failed=1 skipped=491 rescued=0 ignored=1",
"rhosp-con2 : ok=213 changed=20 unreachable=0 failed=0 skipped=395 rescued=0 ignored=1",
"rhosp-con3 : ok=213 changed=20 unreachable=0 failed=0 skipped=395 rescued=0 ignored=1",
"rhosp-hci1 : ok=207 changed=13 unreachable=0 failed=0 skipped=359 rescued=0 ignored=0",
"rhosp-hci2 : ok=202 changed=12 unreachable=0 failed=0 skipped=348 rescued=0 ignored=0",
"rhosp-hci3 : ok=203 changed=12 unreachable=0 failed=0 skipped=347 rescued=0 ignored=0",
"rhosp-nfv1 : ok=105 changed=5  unreachable=0 failed=0 skipped=225 rescued=0 ignored=0",
"rhosp-nfv2 : ok=105 changed=5  unreachable=0 failed=0 skipped=225 rescued=0 ignored=0",
"Sunday 06 September 2020 15:51:37 +0200 (0:00:00.747) 0:18:25.016 ******",
~~~

Ansible is running the wrong command:

~~~
[root@rhosp-con1 ~]# podman exec ceph-mon-rhosp-con1 ceph osd require-osd-release nautilus
Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)
Error: non zero exit code: 1: OCI runtime error
~~~

The correct command is:

~~~
[root@rhosp-con1 ~]# podman exec ceph-mon-rhosp-con1 ceph -c /etc/ceph/central.conf osd require-osd-release nautilus
~~~

because the Ceph cluster name is "central", so the configuration file is /etc/ceph/central.conf rather than the default /etc/ceph/ceph.conf.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform 16.1 (Train)

How reproducible:

Steps to Reproduce:
1. Deploy OSP with Ceph on the central site using a custom cluster name such as "central".
2. Perform an upgrade of the environment.
3. The Ceph upgrade playbook fails.

Actual results:
Ansible does not take the custom Ceph cluster name into account.

Expected results:
Ansible should take the custom Ceph cluster name into account.

Additional info:
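The failure mode can be illustrated with a minimal sketch. This is not the ceph-ansible fix itself, only a hypothetical shell fragment showing how the monitor command could be built from a cluster-name variable (here `cluster`, an assumed name) instead of hard-coding the default "ceph" cluster:

```shell
#!/bin/sh
# Hypothetical sketch: derive the ceph.conf path from the cluster name.
# With a custom cluster name, ceph must be pointed at /etc/ceph/<cluster>.conf;
# without -c (or --cluster), it looks for /etc/ceph/ceph.conf and fails with
# "error calling conf_read_file" as seen in the bug.
cluster="central"                       # custom cluster name (assumption for this example)
conf="/etc/ceph/${cluster}.conf"        # conf file that matches the cluster name

# The command the playbook should run (printed here rather than executed,
# since this sketch has no live cluster):
cmd="podman exec ceph-mon-rhosp-con1 ceph -c ${conf} osd require-osd-release nautilus"
echo "${cmd}"
```

Equivalently, the ceph CLI accepts `--cluster central`, which resolves the same /etc/ceph/central.conf path; either form avoids the hard-coded default.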
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 4.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4144