This bug was initially created as a copy of Bug #2211324

I am copying this bug because:

Description of problem:
-----------------------
- After upgrading the cluster from RHCS 4.3z1 (baremetal) to RHCS 5.3z3 / RHCS 5.3z2, running the cephadm-preflight playbook to install the latest ceph-common and cephadm packages on the Ceph nodes stops the ceph.target service, which in turn stops all the Ceph services running on the host. This happens only when Ceph RPMs from the older version (RHCS 4.3z1), such as ceph-common, ceph-base, ceph-mon, and ceph-osd, still exist on the hosts (because the cluster was migrated from baremetal to containers).

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHCS 5.3

How reproducible:
-----------------
Every time.

Steps to Reproduce:
-------------------
1. Deploy an RHCS 4.3z1 baremetal cluster.
2. Convert the Ceph services to containerized.
3. Upgrade the cluster to RHCS 5.3z2 / RHCS 5.3z3.
4. Run the cephadm-preflight playbook to upgrade the ceph-common and cephadm packages on the hosts.

Actual results:
---------------
The Ceph packages are upgraded, but all Ceph services on the host are stopped.

Expected results:
-----------------
The Ceph packages should be upgraded and no services should be impacted.
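For context, the preflight run referred to in step 4 is normally executed from the admin node with cephadm-ansible. A minimal sketch of the invocation is shown below; the inventory file name, working directory, and extra-vars are assumptions based on the default cephadm-ansible layout, not taken from this report.

    # Sketch: run the preflight playbook from the node where cephadm-ansible is installed
    cd /usr/share/cephadm-ansible
    ansible-playbook -i hosts cephadm-preflight.yml --extra-vars "ceph_origin=rhcs"
    # On a cluster adopted from baremetal, this is the step that upgrades
    # ceph-common/cephadm on each host and, per this bug, also stops ceph.target.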
Observing a similar kind of issue when attempting an upgrade from 4.3z1 --> 5.3 (latest). After running the cephadm-preflight playbook, the Ceph services (mon, mgr, OSDs) failed on all the nodes, but ceph.target was still running. As a result, ceph commands hang.

[root@ceph-msaini-taooh8-node2 ~]# systemctl | grep ceph
  ceph-crash                    loaded active running  Ceph crash dump collector
● ceph-mgr                      loaded failed failed   Ceph Manager
● ceph-mon                      loaded failed failed   Ceph Monitor
  system-ceph\x2dcrash.slice    loaded active active   system-ceph\x2dcrash.slice
  system-ceph\x2dmds.slice      loaded active active   system-ceph\x2dmds.slice
  system-ceph\x2dmgr.slice      loaded active active   system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice      loaded active active   system-ceph\x2dmon.slice
  ceph.target                   loaded active active   ceph target allowing to start/stop all ceph*@.service instances at once

[root@ceph-msaini-taooh8-node5 ~]# systemctl status ceph.target
● ceph.target - ceph target allowing to start/stop all ceph*@.service instances at once
   Loaded: loaded (/etc/systemd/system/ceph.target; enabled; vendor preset: enabled)
   Active: active since Wed 2023-12-20 14:58:47 EST; 4h 5min ago

Dec 20 14:58:47 ceph-msaini-taooh8-node5 systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.

======================== Upgrade logs ========================

[root@ceph-msaini-taooh8-node1-installer ~]# ceph --version
ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)

[root@ceph-msaini-taooh8-node1-installer ~]# ceph versions
{
    "mon": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 12
    },
    "mds": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 3
    },
    "rgw": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 4
    },
    "overall": {
        "ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)": 25
    }
}

[root@ceph-msaini-taooh8-node1-installer ~]# ceph -s
  cluster:
    id:     07cd16a8-f925-4d09-a041-6d725b939582
    health: HEALTH_WARN
            1 pool(s) have non-power-of-two pg_num
            1 pools have too few placement groups
            3 pools have too many placement groups
            mons are allowing insecure global_id reclaim

  services:
    mon: 3 daemons, quorum ceph-msaini-taooh8-node3,ceph-msaini-taooh8-node2,ceph-msaini-taooh8-node1-installer (age 45m)
    mgr: ceph-msaini-taooh8-node1-installer(active, since 43m), standbys: ceph-msaini-taooh8-node2, ceph-msaini-taooh8-node3
    mds: cephfs:1 {0=ceph-msaini-taooh8-node2=up:active} 2 up:standby
    osd: 12 osds: 12 up (since 38m), 12 in (since 57m)
    rgw: 4 daemons active (ceph-msaini-taooh8-node5.rgw0, ceph-msaini-taooh8-node5.rgw1, ceph-msaini-taooh8-node6.rgw0, ceph-msaini-taooh8-node6.rgw1)

  data:
    pools:   13 pools, 676 pgs
    objects: 382 objects, 456 MiB
    usage:   13 GiB used, 227 GiB / 240 GiB avail
    pgs:     676 active+clean

  io:
    client:   2.5 KiB/s rd, 2 op/s rd, 0 op/s wr

[root@ceph-msaini-taooh8-node1-installer ~]# podman ps
CONTAINER ID  IMAGE                                                             COMMAND               CREATED         STATUS         PORTS  NAMES
b4bc2bbf0671  registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.6  --path.procfs=/ro...  54 minutes ago  Up 54 minutes         node-exporter
288dbf3d1416  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        49 minutes ago  Up 49 minutes         ceph-mon-ceph-msaini-taooh8-node1-installer
e02558859efb  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        46 minutes ago  Up 46 minutes         ceph-mgr-ceph-msaini-taooh8-node1-installer
fdc68705313e  registry.redhat.io/rhceph/rhceph-4-rhel8:latest                                        30 minutes ago  Up 30 minutes         ceph-crash-ceph-msaini-taooh8-node1-installer

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# sudo ansible-playbook -i hosts infrastructure-playbooks/rolling_update.yml --extra-vars "health_osd_check_retries=50 health_osd_check_delay=30"

PLAY RECAP ***********************************************************************************************************
ceph-msaini-taooh8-node1-installer : ok=375  changed=59  unreachable=0  failed=0  skipped=633  rescued=0  ignored=0
ceph-msaini-taooh8-node2           : ok=370  changed=39  unreachable=0  failed=0  skipped=685  rescued=0  ignored=0
ceph-msaini-taooh8-node3           : ok=370  changed=39  unreachable=0  failed=0  skipped=690  rescued=0  ignored=0
ceph-msaini-taooh8-node4           : ok=252  changed=28  unreachable=0  failed=0  skipped=460  rescued=0  ignored=0
ceph-msaini-taooh8-node5           : ok=379  changed=38  unreachable=0  failed=0  skipped=625  rescued=0  ignored=0
ceph-msaini-taooh8-node6           : ok=368  changed=37  unreachable=0  failed=0  skipped=645  rescued=0  ignored=0
ceph-msaini-taooh8-node7           : ok=319  changed=38  unreachable=0  failed=0  skipped=495  rescued=0  ignored=0
localhost                          : ok=1    changed=1   unreachable=0  failed=0  skipped=1    rescued=0  ignored=0

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ansible-playbook -vvvv infrastructure-playbooks/rolling_update.yml -i hosts

  stdout: |-
    {
        "mon": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "mgr": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "osd": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 12
        },
        "mds": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
        },
        "rgw": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 4
        },
        "rgw-nfs": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 1
        },
        "overall": {
            "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 26
        }
    }
  stdout_lines: <omitted>
META: ran handlers
META: ran handlers

PLAY RECAP ***********************************************************************************************************
ceph-msaini-taooh8-node1-installer : ok=372  changed=51  unreachable=0  failed=0  skipped=626  rescued=0  ignored=0
ceph-msaini-taooh8-node2           : ok=363  changed=27  unreachable=0  failed=0  skipped=676  rescued=0  ignored=0
ceph-msaini-taooh8-node3           : ok=364  changed=28  unreachable=0  failed=0  skipped=680  rescued=0  ignored=0
ceph-msaini-taooh8-node4           : ok=249  changed=21  unreachable=0  failed=0  skipped=453  rescued=0  ignored=0
ceph-msaini-taooh8-node5           : ok=375  changed=27  unreachable=0  failed=0  skipped=616  rescued=0  ignored=0
ceph-msaini-taooh8-node6           : ok=370  changed=27  unreachable=0  failed=0  skipped=629  rescued=0  ignored=0
ceph-msaini-taooh8-node7           : ok=317  changed=29  unreachable=0  failed=0  skipped=489  rescued=0  ignored=0
localhost                          : ok=1    changed=1   unreachable=0  failed=0  skipped=1    rescued=0  ignored=0

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ceph --version
ceph version 14.2.22-128.el8cp (40a2bf9c4e79e39754d69a95cd51bd60991284be) nautilus (stable)

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ceph versions
{
    "mon": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 3
    },
    "rgw": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 4
    },
    "rgw-nfs": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.10-220.el8cp (380780920862a7326df3e00903e9912b85af7d30) pacific (stable)": 26
    }
}

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# podman ps
CONTAINER ID  IMAGE                                                                                                            COMMAND               CREATED         STATUS         PORTS  NAMES
6ca1e2071341  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                         16 minutes ago  Up 16 minutes         ceph-mon-ceph-msaini-taooh8-node1-installer
f518b6b7588d  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                         13 minutes ago  Up 13 minutes         ceph-mgr-ceph-msaini-taooh8-node1-installer
74a1b25bee9e  registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330                         3 minutes ago   Up 3 minutes          ceph-crash-ceph-msaini-taooh8-node1-installer
38e14828d9ae  registry.redhat.io/openshift4/ose-prometheus-node-exporter:v4.6                                                 --path.procfs=/ro...  2 minutes ago   Up 2 minutes          node-exporter

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]#
# systemctl | grep ceph
  ceph-crash                    loaded active running  Ceph crash dump collector
  ceph-mgr                      loaded active running  Ceph Manager
  ceph-mon                      loaded active running  Ceph Monitor
  system-ceph\x2dcrash.slice    loaded active active   system-ceph\x2dcrash.slice
  system-ceph\x2dmgr.slice      loaded active active   system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice      loaded active active   system-ceph\x2dmon.slice
  ceph-mgr.target               loaded active active   ceph target allowing to start/stop all ceph-mgr@.service instances at once
  ceph-mon.target               loaded active active   ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph.target                   loaded active active   ceph target allowing to start/stop all ceph*@.service instances at once
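Note the mismatch in the output above: ceph --version reports the locally installed client binary (still Nautilus, i.e. the RHCS 4 RPM), while ceph versions reports the daemons actually running in containers (already Pacific). This is consistent with the older RPMs still being present on the hosts. A quick way to see the mismatch side by side, as a sketch using standard commands:

    # Client binary provided by the host's RPMs:
    ceph --version
    rpm -q ceph-common ceph-base
    # Versions reported by the running (containerized) daemons:
    ceph versions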
[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# ansible-playbook infrastructure-playbooks/cephadm-adopt.yml -i hosts

TASK [add ceph label for core component] *****************************************************************************
fatal: [ceph-msaini-taooh8-node2 -> ceph-msaini-taooh8-node1-installer]: FAILED! => changed=false
  cmd:
  - podman
  - run
  - --rm
  - --net=host
  - -v
  - /etc/ceph:/etc/ceph:z
  - -v
  - /var/lib/ceph:/var/lib/ceph:ro
  - -v
  - /var/run/ceph:/var/run/ceph:z
  - --entrypoint=ceph
  - registry-proxy.engineering.redhat.com/rh-osbs/rhceph:ceph-5.3-rhel-8-containers-candidate-88814-20231215195330
  - --cluster
  - ceph
  - orch
  - host
  - label
  - add
  - ceph-msaini-taooh8-node2
  - ceph
  delta: '0:00:01.795436'
  end: '2023-12-20 18:15:08.390207'
  msg: non-zero return code
  rc: 22
  start: '2023-12-20 18:15:06.594771'
  stderr: 'Error EINVAL: host ceph-msaini-taooh8-node2 does not exist'
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>

[root@ceph-msaini-taooh8-node1-installer ceph-ansible]# dnf install cephadm-ansible
Updating Subscription Management repositories.
Last metadata expiration check: 0:01:06 ago on Wed 20 Dec 2023 06:22:31 PM EST.
Dependencies resolved.
======================================================================================================================
 Package                               Architecture  Version            Repository                             Size
======================================================================================================================
Installing:
 cephadm-ansible                       noarch        1.17.0-1.el8cp     rhceph-5-tools-for-rhel-8-x86_64-rpms   32 k
Installing dependencies:
 ansible-collection-ansible-posix      noarch        1.2.0-1.el8cp.1    rhceph-5-tools-for-rhel-8-x86_64-rpms  131 k
 ansible-collection-community-general  noarch        4.0.0-1.1.el8cp.1  rhceph-5-tools-for-rhel-8-x86_64-rpms  1.5 M
 ansible-core                          x86_64        2.15.3-1.el8       rhel-8-for-x86_64-appstream-rpms       3.6 M
 mpdecimal                             x86_64        2.5.1-3.el8        rhel-8-for-x86_64-appstream-rpms        93 k
 python3.11                            x86_64        3.11.5-1.el8_9     rhel-8-for-x86_64-appstream-rpms        30 k
 python3.11-cffi                       x86_64        1.15.1-1.el8       rhel-8-for-x86_64-appstream-rpms       293 k
 python3.11-cryptography               x86_64        37.0.2-5.el8       rhel-8-for-x86_64-appstream-rpms       1.1 M
 python3.11-libs                       x86_64        3.11.5-1.el8_9     rhel-8-for-x86_64-appstream-rpms        10 M
 python3.11-pip-wheel                  noarch        22.3.1-4.el8       rhel-8-for-x86_64-appstream-rpms       1.4 M
 python3.11-ply                        noarch        3.11-1.el8         rhel-8-for-x86_64-appstream-rpms       135 k
 python3.11-pycparser                  noarch        2.20-1.el8         rhel-8-for-x86_64-appstream-rpms       147 k
 python3.11-pyyaml                     x86_64        6.0-1.el8          rhel-8-for-x86_64-appstream-rpms       214 k
 python3.11-setuptools-wheel           noarch        65.5.1-2.el8       rhel-8-for-x86_64-appstream-rpms       720 k
 sshpass                               x86_64        1.09-4.el8ap       labrepo                                 30 k

Transaction Summary
======================================================================================================================
Install  15 Packages

Total download size: 20 M
Installed size: 78 M
Is this ok [y/N]: y

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl | grep ceph
  ceph-crash                    loaded active running  Ceph crash dump collector
● ceph-mgr                      loaded failed failed   Ceph Manager
● ceph-mon                      loaded failed failed   Ceph Monitor
  system-ceph\x2dcrash.slice    loaded active active   system-ceph\x2dcrash.slice
  system-ceph\x2dmgr.slice      loaded active active   system-ceph\x2dmgr.slice
  system-ceph\x2dmon.slice      loaded active active   system-ceph\x2dmon.slice
  ceph.target                   loaded active active   ceph target allowing to start/stop all ceph*@.service instances at once

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl status ceph.target
● ceph.target - ceph target allowing to start/stop all ceph*@.service instances at once
   Loaded: loaded (/etc/systemd/system/ceph.target; enabled; vendor preset: enabled)
   Active: active since Wed 2023-12-20 14:58:46 EST; 3h 54min ago

Dec 20 14:58:46 ceph-msaini-taooh8-node1-installer systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# ceph -s

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl -l status ceph-mgr
● ceph-mgr - Ceph Manager
   Loaded: loaded (/etc/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2023-12-20 18:34:26 EST; 37min ago
 Main PID: 110855 (code=exited, status=143)

Dec 20 18:34:24 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: 2023-12-20T18:34:24.431-0500 7f0a8fddb700  0 log_channel(cluster) log [DBG] : pgmap v677: 701 pgs: 701 active+clean; 456 MiB data, 2.3 G>
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopping Ceph Manager...
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: managing teardown after SIGTERM
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Sending SIGTERM to PID 54
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Waiting PID 54 to terminate .
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mgr-ceph-msaini-taooh8-node1-installer[110855]: teardown: Process 54 is terminated
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer sh[142533]: f518b6b7588de6ed1793a6f58a4fa9ca41df91f58a7543dd90d97508e6f612e5
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mgr: Main process exited, code=exited, status=143/n/a
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mgr: Failed with result 'exit-code'.
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopped Ceph Manager.

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# systemctl -l status ceph-mon
● ceph-mon - Ceph Monitor
   Loaded: loaded (/etc/systemd/system/ceph-mon@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2023-12-20 18:34:26 EST; 38min ago
 Main PID: 106377 (code=exited, status=143)

Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  1 rocksdb: close waiting for compaction thread to stop
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  1 rocksdb: close compaction thread to stopped
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.595-0500 7f4c0a23b880  4 rocksdb: [db_impl/db_impl.cc:397] Shutdown: canceling all background work
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: debug 2023-12-20T18:34:26.599-0500 7f4c0a23b880  4 rocksdb: [db_impl/db_impl.cc:573] Shutdown complete
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: teardown: Waiting PID 86 to terminate .
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer ceph-mon-ceph-msaini-taooh8-node1-installer[106377]: teardown: Process 86 is terminated
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer sh[142608]: 6ca1e2071341cf2fa0140bced76763b36ec0f17f55ddba50794aa25a1245099e
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mon: Main process exited, code=exited, status=143/n/a
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: ceph-mon: Failed with result 'exit-code'.
Dec 20 18:34:26 ceph-msaini-taooh8-node1-installer systemd[1]: Stopped Ceph Monitor.

[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]#
[root@ceph-msaini-taooh8-node1-installer cephadm-ansible]# rpm -qa | grep ceph
ceph-grafana-dashboards-14.2.22-128.el8cp.noarch
libcephfs2-16.2.10-208.el8cp.x86_64
cephadm-ansible-1.17.0-1.el8cp.noarch
python3-ceph-common-16.2.10-208.el8cp.x86_64
ceph-base-16.2.10-208.el8cp.x86_64
cephadm-16.2.10-220.el9cp.noarch
python3-ceph-argparse-16.2.10-208.el8cp.x86_64
python3-cephfs-16.2.10-208.el8cp.x86_64
ceph-common-16.2.10-208.el8cp.x86_64
ceph-selinux-16.2.10-208.el8cp.x86_64
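The rpm -qa output above still shows a mix of builds on the host (a leftover 14.2.22 package next to 16.2.10-208 and 16.2.10-220 packages). A rough way to audit a host for leftover pre-5.x Ceph RPMs before running the preflight playbook could look like the sketch below; the package name patterns are assumptions and may need to be extended.

    # Sketch: list installed Ceph-related RPMs with their versions
    rpm -qa 'ceph*' 'libcephfs*' 'librados*' 'librbd*' 'python3-ceph*' \
        --queryformat '%{NAME}-%{VERSION}-%{RELEASE}\n' | sort
    # Anything still at 14.2.x is a leftover RHCS 4 package that the
    # preflight playbook will try to upgrade in place.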
The cephadm-preflight playbook passed and the ceph.target status remained active, as noted in comment #5.

Per comment #6 and an offline discussion with Teoman Onay, this BZ covers only the first step, which is fixed and working as expected.

QE will attempt to reproduce the issue where the Ceph services (mon, mgr, OSDs) failed on all the nodes. If that issue is reproducible, QE will raise a new BZ for it.

As the BZ fix is working as expected, marking this BZ as verified.
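For reference, the kind of post-preflight check described here can be expressed roughly as the sketch below; the exact commands QE ran are not recorded in this comment.

    # Sketch: confirm ceph.target is still active and no ceph units were left failed
    systemctl is-active ceph.target
    systemctl list-units --state=failed 'ceph*'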
(In reply to Manisha Saini from comment #10)
> Cephadm-preflight.yaml playbook was passing and the ceph.target status was
> active as updated in comment #5.
>
> As per comment #6 as offline discussion with Teoman Onay, this BZ was
> related to the first step only which is fixed and working as expected.
>
> QE will be reproducing the issue of the ceph services (mon,mgr,osd's) which
> got failed on all the nodes.
> If the issue is reproducible, QE will raise a new BZ for same.
>

Unable to reproduce the issue seen in comment #5. Upgrade was successful and all services were up and running post upgrade.

Detailed steps recorded - https://docs.google.com/document/d/1xhRCY-bSRWTrKXzibdI7SlQ9rASXdBquPvItCN_hcb4/edit

> As the BZ fix is working as expected, marking this BZ as verified.
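A minimal form of the post-upgrade verification mentioned above might look like the following sketch; it assumes the cluster has already been adopted by cephadm, which may not match the exact state captured in the linked document.

    # Sketch: all daemons should report the same target build and be running
    cephadm shell -- ceph versions
    cephadm shell -- ceph orch ps
    cephadm shell -- ceph -s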
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 Security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:0745