Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

This project is now read‑only. Starting Monday, February 2, please use https://ibm-ceph.atlassian.net/ for all bug tracking management.

Bug 2222003

Summary:	[cee/sd][ceph-ansible] RHCS 4.3z1 installation fails on RHEL 8.7 at TASK [ceph-mgr : wait for all mgr to be up]
Product:	[Red Hat Storage] Red Hat Ceph Storage	Reporter:	Teoman ONAY <tonay>
Component:	Ceph-Ansible	Assignee:	Teoman ONAY <tonay>
Status:	CLOSED ERRATA	QA Contact:	Aditya Ramteke <aramteke>
Severity:	high	Docs Contact:	Akash Raj <akraj>
Priority:	unspecified
Version:	5.3	CC:	akraj, anrao, atomic-bugs, bbaude, ceph-eng-bugs, cephqe-warriors, dwalsh, gjose, gmeno, jligon, jnovy, kdreyer, lithomas, lsm5, mboddu, mheon, msaini, pthomas, rmandyam, tpetr, tsweeney, umohnani, vereddy, vumrao
Target Milestone:	---
Target Release:	5.3z5
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	ceph-ansible-6.0.28.5-1	Doc Type:	Bug Fix
Doc Text:	.Ceph containers no longer fail during startup Previously, the behaviour of `podman` released with Red Hat Enterprise Linux 8.7 had changed with respect to SELinux relabeling. Due to this, depending on their startup order, some Ceph containers would fail to start as they would not have access to the files they needed. With this fix, the SElinux separation for the container is disabled and all Ceph containers start successfully.	Story Points:	---
Clone Of:	2169767	Environment:
Last Closed:	2023-08-28 09:40:56 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	2162781, 2169767
Bug Blocks:

Description Teoman ONAY 2023-07-11 13:03:35 UTC

+++ This bug was initially created as a clone of Bug #2169767 +++

+++ This bug was initially created as a clone of Bug #2162781 +++

Description of problem:
-----------------------
RHCS 4.3z1 installation fails on RHEL 8.7 at TASK [ceph-mgr : wait for all mgr to be up].

The mgr services fails to start because of below error:
```
Jan 19 03:12:25 ceph50 systemd[1]: Started Ceph Manager.
Jan 19 03:12:25 ceph50 ceph-mgr-ceph50[677680]: find: '/var/lib/ceph/mgr/ceph-ceph50/keyring': Permission denied
Jan 19 03:12:25 ceph50 ceph-mgr-ceph50[677680]: chown: cannot access '/var/lib/ceph/mgr/ceph-ceph50/keyring': Permission denied
Jan 19 03:12:26 ceph50 systemd[1]: ceph-mgr: Main process exited, code=exited, status=1/FAILURE
Jan 19 03:12:26 ceph50 systemd[1]: ceph-mgr: Failed with result 'exit-code'.
Jan 19 03:12:36 ceph50 systemd[1]: ceph-mgr: Service RestartSec=10s expired, scheduling restart.
Jan 19 03:12:36 ceph50 systemd[1]: ceph-mgr: Scheduled restart job, restart counter is at 5198.
Jan 19 03:12:36 ceph50 systemd[1]: Stopped Ceph Manager.
```

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHCS         : 4.3z1 
ceph-ansible : 4.0.70.18-1.el8cp.noarch
RHEL         : 8.7
Podman       : 4.2


How reproducible:
-----------------
Every time


Steps to Reproduce:
Deploy fresh ceph cluster on 4.3z1 on top of RHEL 8.7.


Actual results:
---------------
ceph mgr fails to start service because of selinux label and hence the deployment fails.

Expected results:
-----------------
ceph mgr should start service and the deployment should succeed.

--- Additional comment from Lijo Stephen Thomas on 2023-01-20 21:04:10 UTC ---

Additional info:
----------------

- RHCS 4.3z1 installation does succeed on RHEL version 8.6 but fails on RHEL 8.7.

- Upon checking further I see the selinux label for mgr keyring file is set different on RHEL 8.7:

----> On RHEL 8.7 nodes

```
[root@ceph50 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.7 (Ootpa)

[root@ceph50 ~]# podman version
Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Mon Dec 12 17:11:56 2022
OS/Arch:      linux/amd64

[root@ceph50 ~]# ll -lZ /var/lib/ceph/mgr/ceph-ceph50/
total 4
-rw-------. 1 167 167 system_u:object_r:var_lib_t:s0 137 Jan 20 07:53 keyring
```

----> On RHEL 8.6 nodes:

```
[root@ceph80 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.6 (Ootpa)

[root@ceph80 ~]# podman version
Client:       Podman Engine
Version:      4.1.1
API Version:  4.1.1
Go Version:   go1.17.7
Built:        Wed Oct 12 19:12:59 2022
OS/Arch:      linux/amd64

[root@ceph80 ~]# ll -lZ /var/lib/ceph/mgr/ceph-ceph80/keyring 
-rw-------. 1 167 167 system_u:object_r:container_file_t:s0 137 Jan 21 00:59 /var/lib/ceph/mgr/ceph-ceph80/keyring
``` 

- The correct selinux label should be `container_file_t`.


Discussed this issue with Guillaume++ and we believe this is due to the change in podman version that has resulted in the failure.


Regards,
Lijo

--- Additional comment from Lijo Stephen Thomas on 2023-01-20 21:07:44 UTC ---

Lab details:

[RHCS 4.3z1 + RHEL 8.7]

10.74.250.68    ceph50
10.74.252.92    ceph51
10.74.250.138	ceph52
10.74.250.26	cephadmin1  --> ansible_user=root| pass=redhat123

Logs: /root/ansible/ansible.log

--- Additional comment from Teoman ONAY on 2023-02-02 12:58:13 UTC ---

So far it does not seem related to the podman version. I have a setup running Centos Stream 8 + podman 4.2.0 and the relabeling of the keyring file to container_file_t when mgr starts works as expected.

I still need to figure out why the relabeling -v /var/lib/ceph:/var/lib/ceph:z does not work as expected. I am continuing my investigation and will update the BZ asap

--- Additional comment from Manisha Saini on 2023-02-02 12:59:55 UTC ---

Hi,

we tested this deployment RHCS 4.3z1 with RHEL 8.7, hitting same issue as reported in this BZ


[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# rpm -qa | grep ansible
ceph-ansible-4.0.70.18-1.el8cp.noarch
ansible-2.9.27-1.el8ae.noarch

[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# rpm -qa | grep podman
podman-4.2.0-6.module+el8.7.0+17498+a7f63b89.x86_64
podman-catatonit-4.2.0-6.module+el8.7.0+17498+a7f63b89.x86_64

[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.7 (Ootpa)

[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# podman version
Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Mon Dec 12 06:41:56 2022
OS/Arch:      linux/amd64


Deployment failed with below error

============
Feb 02 07:22:06 ceph-mobisht-4-3z1-aobcda-node1-installer ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer[20023]: find: '/var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyring': Permission >
Feb 02 07:22:06 ceph-mobisht-4-3z1-aobcda-node1-installer ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer[20023]: chown: cannot access '/var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyrin>
Feb 02 07:22:06 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Main process exited, code=exited, status=1/FAILURE
Feb 02 07:22:06 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Failed with result 'exit-code'.
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Service RestartSec=10s expired, scheduling restart.
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Scheduled restart job, restart counter is at 3.
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: Stopped Ceph Manager.
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: Starting Ceph Manager...
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer podman[20266]: Error: no container with name or ID "ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer" found: no such container
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer podman[20277]: Error: no container with name or ID "ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer" found: no such container
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer podman[20286]: 
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer podman[20286]: a4fd9d117ca54e869ecc5cb4b9a42290d4a6d51ba38348bee186dba16edc3c08
Feb 02 07:22:16 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: Started Ceph Manager.
Feb 02 07:22:17 ceph-mobisht-4-3z1-aobcda-node1-installer ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer[20296]: find: '/var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyring': Permission >
Feb 02 07:22:17 ceph-mobisht-4-3z1-aobcda-node1-installer ceph-mgr-ceph-mobisht-4-3z1-aobcda-node1-installer[20296]: chown: cannot access '/var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyrin>
Feb 02 07:22:17 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Main process exited, code=exited, status=1/FAILURE
Feb 02 07:22:17 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Failed with result 'exit-code'.
Feb 02 07:22:27 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Service RestartSec=10s expired, scheduling restart.
Feb 02 07:22:27 ceph-mobisht-4-3z1-aobcda-node1-installer systemd[1]: ceph-mgr: Scheduled restart job, restart counter is at 4.
=========================


[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# ll -lZ /var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyring 
-rw-------. 1 167 167 system_u:object_r:var_lib_t:s0 172 Feb  2 07:21 /var/lib/ceph/mgr/ceph-ceph-mobisht-4-3z1-aobcda-node1-installer/keyring


================================

--- Additional comment from Manisha Saini on 2023-02-02 13:00:38 UTC ---

Logs - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-70LC76/

--- Additional comment from Manisha Saini on 2023-02-02 13:26:01 UTC ---

[root@ceph-mobisht-4-3z1-aobcda-node1-installer ceph]# rpm -qa | grep selinux
container-selinux-2.189.0-1.module+el8.7.0+17498+a7f63b89.noarch
rpm-plugin-selinux-4.14.3-24.el8_7.x86_64
libselinux-2.9-6.el8.x86_64
libselinux-utils-2.9-6.el8.x86_64
selinux-policy-3.14.3-108.el8_7.1.noarch
python3-libselinux-2.9-6.el8.x86_64
selinux-policy-targeted-3.14.3-108.el8_7.1.noarch

--- Additional comment from Teoman ONAY on 2023-02-02 21:28:52 UTC ---

My findings so far are that there was a change in podman between these 2 versions:

podman-3:4.2.0-1.module+el8.7.0+16772+33343656 <--- works
podman-3:4.2.0-4.module+el8.7.0+17064+3b31f55c <--- does not work

which broke the selinux relabeling of the objects on the shared volumes (e.g the "z" in -v /var/lib/ceph:/var/lib/ceph:z). When the container start it should relabel all files from system_u:object_r:var_lib_t:s0 -> system_u:object_r:container_file_t:s0.

I will now look in podman changelog to find out what could have broken this.

--- Additional comment from Jindrich Novy on 2023-02-03 09:29:26 UTC ---

Dan, is there anything obvious in comment #7 how it could broke? Podman only contains updates to the latest upstream branch, nothing else.

--- Additional comment from Daniel Walsh on 2023-02-04 14:08:54 UTC ---

ls -lZd /var/lib/ceph


The only change in this area that I am aware of is, if the top level directory is labeled correctly from an SELinux point of view, podman will no longer relabel the contents of the directory.

Meaning if you mv'd files into the directory we could have an issue, if /var/lib/ceph is labeled container_file_t:s0.

--- Additional comment from Teoman ONAY on 2023-02-14 16:09:13 UTC ---

podman behavior changed after podman-3:4.2.0-1.module+el8.7.0+16772+33343656 regarding folder and files selinux relabeling. podman-3:4.2.0-4.module+el8.7.0+17064+3b31f55c and later are affected.

The problem occurs when the MON container is started before the MGR one (the other way around it works). Both containers bindmount /var/lib/ceph folder.

This is the usual content of that folder:

drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-mds
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-mgr
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-osd
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-rbd
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-rbd-mirror
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 bootstrap-rgw
drwxr-x---. 3 ceph ceph system_u:object_r:container_file_t:s0 20 Feb 10 19:04 crash
drwxr-xr-x. 3 ceph ceph system_u:object_r:container_file_t:s0 23 Feb 10 19:03 mds
drwxr-x---. 3 ceph ceph system_u:object_r:container_file_t:s0 23 Oct 17 20:40 mgr
drwxr-xr-x. 3 ceph ceph system_u:object_r:container_file_t:s0 23 Feb 10 19:03 mon
drwxr-xr-x. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Feb 10 19:03 osd
drwxr-xr-x. 3 ceph ceph system_u:object_r:container_file_t:s0 27 Feb 10 19:03 radosgw
drwxr-x---. 2 ceph ceph system_u:object_r:container_file_t:s0  6 Oct 17 20:40 tmp

the MON process get access to files in /var/lib/ceph/mon while the MGR process get access to /var/lib/ceph/mgr.

Starting the MON first (which is started with --security-opt label=disable) does not relabel the /var/lib/ceph folder and its subfolders which are still labeled "var_lib_t" but it looks like it prevents the MGR process to do the relabeling as it should. All folders are relabeled but the file (keyring) contained in /var/lib/ceph/mgr keeps its var_lib_t label which prevent the MGR container from starting. In podman-3:4.2.0-1.module+el8.7.0+16772+33343656 this workflow works (keyring file is relabeled container_file_t) while with podman versions podman-3:4.2.0-4.module+el8.7.0+17064+3b31f55c and later the keyring file relabeling does not work. It seems to be related to "--security-opt label=disable".

As a temporary fix (until podman is fixed), I will bind the MON container to "-v /var/lib/ceph/mon:/var/lib/ceph/mon:rshared" instead of "/var/lib/ceph/" which seems to work regardless to what container is started first.


Here follows the systemd unit content for both containers:

MON
---
ExecStart=/usr/bin/podman run --rm --name ceph-mon-%i \
  -d --log-driver journald --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid \
  --pids-limit=0 \
  --memory=968m \
  --cpus=1 \
  --security-opt label=disable \
  -v /var/lib/ceph:/var/lib/ceph:rshared \
  -v /etc/ceph:/etc/ceph \
  -v /var/run/ceph:/var/run/ceph \
  -v /etc/localtime:/etc/localtime \
  -v /var/log/ceph:/var/log/ceph \
-v /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted \
--net=host \
-e IP_VERSION=4 \
  -e MON_IP=192.168.1.10 \
  -e CLUSTER=ceph \
  -e FSID=10d0bcd1-1028-49c3-b30e-fc63c6fafb5c \
  -e MON_PORT=3300 \
  -e CEPH_PUBLIC_NETWORK=192.168.1.0/24 \
  -e CEPH_DAEMON=MON \
  -e CONTAINER_IMAGE=quay.io/ceph/daemon:latest-quincy-devel \
  -e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 \
   \
  quay.io/ceph/daemon:latest-quincy-devel



MGR
---
ExecStart=/usr/bin/podman run --rm --net=host \
  -d --log-driver journald --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid \
  --pids-limit=0 \
  --memory=968m \
  --cpus=1 \
  -v /var/lib/ceph:/var/lib/ceph:z,rshared \
  -v /etc/ceph:/etc/ceph:z \
  -v /var/run/ceph:/var/run/ceph:z \
  -v /etc/localtime:/etc/localtime:ro \
  -v /var/log/ceph:/var/log/ceph:z \
  -e CLUSTER=ceph \
  -e CEPH_DAEMON=MGR \
  -e CONTAINER_IMAGE=quay.io/ceph/daemon:latest-quincy-devel \
  -e TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 \
   \
  --name=ceph-mgr-mgr0 \
  quay.io/ceph/daemon:latest-quincy-devel


As a next step I will assign this BZ to the podman team as the product behavior has changed and needs to be fixed. In the meantime I will clone this BZ to apply the temporary workaround to ceph-ansible.

--- Additional comment from RHEL Program Management on 2023-02-14 16:11:38 UTC ---

pm_ack is no longer used for this product. The flag has been reset.

See https://issues.redhat.com/browse/PTT-1821 for additional details.

--- Additional comment from RHEL Program Management on 2023-02-14 16:11:38 UTC ---

This bug was reopened or transitioned from a non-RHEL to RHEL product.  The stale date has been reset to +6 months.

--- Additional comment from Teoman ONAY on 2023-02-20 10:46:28 UTC ---

FYI, the fix has been merged upstream (https://github.com/ceph/ceph-ansible/pull/7380) but I still need to backport it to 4.3 and test it.

--- Additional comment from Anuchaithra on 2023-06-08 07:17:53 UTC ---

Issue of "FAILED - RETRYING: wait for all mgr to be up" seen in upgrade i.e upgrade from 4.3z1 to 5.3 z4 (in both multisite and single site cluster)

This issue is seen with rhel 8.7 and 8.8 (upgrade failed since mgr is not comming up)
upgrade from ceph version 14.2.22-128.el8cp to ceph Version 16.2.10-174.el8cp
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-dll50/ceph_multisite_upgrade_0.log
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1xngk/Upgrade_ceph_cluster_to_5.x_latest_0.log


tested with rhel 8.5 and 8.6 upgrade is working fine
8.5: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-ht8zn/ceph_multisite_upgrade_0.log
8.6: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-i3ib2/Upgrade_ceph_cluster_to_5.x_latest_0.log
     http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-i3g5k/ceph_multisite_upgrade_0.log

Comment 10 errata-xmlrpc 2023-08-28 09:40:56 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.3 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4760