Bug 2039247 - `ceph cephadm osd activate node1.rhcs4to5.com` prints `/var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/config` not found
Summary: `ceph cephadm osd activate node1.rhcs4to5.com` prints `/var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/config` not found
Keywords:
Status: NEW
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 8.0
Assignee: Adam King
QA Contact: Sunil Kumar Nagaraju
Docs Contact: Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-01-11 10:50 UTC by Sebastian Wagner
Modified: 2023-07-05 14:44 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:




Links:
Red Hat Issue Tracker RHCEPH-2944 (last updated 2022-01-11 10:54:20 UTC)

Description Sebastian Wagner 2022-01-11 10:50:01 UTC
This bug was initially created as a copy of Bug 2029695, comment 7: https://bugzilla.redhat.com/show_bug.cgi?id=2029695#c7

----


After purging the OSD with `ceph-volume lvm zap --destroy --osd-id 4`, or in any situation where the directory `/var/lib/ceph/<FS_ID>/osd.<ID>/` is no longer present:

1. I am able to create the OSD manually:

```
[root@node1 ~]# /usr/bin/podman run -it --rm --ipc=host --no-hosts --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=$ceph_image_id -e NODE_NAME=$node_name -e CEPH_USE_RANDOM_NONCE=1 -v /var/run/ceph/$CLUSTERID:/var/run/ceph:z -v /var/log/ceph/$CLUSTERID:/var/log/ceph:z -v /var/lib/ceph/$CLUSTERID/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/$CLUSTERID/osd.$OSDID/config:/etc/ceph/ceph.conf:rw -v "/var/lib/ceph/bootstrap-osd/ceph.keyring:/var/lib/ceph/bootstrap-osd/ceph.keyring" -v /var/lib/ceph/$CLUSTERID/selinux:/sys/fs/selinux:ro $ceph_image_id lvm create --data $data_lv --block.db $db_lv --block.wal $wal_lv --no-systemd
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 29c46248-ac5b-4179-aedb-2ac245da1cc9
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-4
Running command: /usr/bin/chown -h ceph:ceph /dev/node1_hdd_vg2/hdd_lv2
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-3
Running command: /usr/bin/ln -s /dev/node1_hdd_vg2/hdd_lv2 /var/lib/ceph/osd/ceph-4/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-4/activate.monmap
 stderr: got monmap epoch 5
Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-4/keyring --create-keyring --name osd.4 --add-key AQDSrNZhwoZoDhAAes20CAUL95LfRTymqz2BFg==
 stdout: creating /var/lib/ceph/osd/ceph-4/keyring
added entity osd.4 auth(key=AQDSrNZhwoZoDhAAes20CAUL95LfRTymqz2BFg==)
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/keyring
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4/
Running command: /usr/bin/chown -h ceph:ceph /dev/node1_nvme_vg2/nvme_lv2
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-7
Running command: /usr/bin/chown -h ceph:ceph /dev/node1_ssd_vg2/ssd_lv2
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-5
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 4 --monmap /var/lib/ceph/osd/ceph-4/activate.monmap --keyfile - --bluestore-block-wal-path /dev/node1_nvme_vg2/nvme_lv2 --bluestore-block-db-path /dev/node1_ssd_vg2/ssd_lv2 --osd-data /var/lib/ceph/osd/ceph-4/ --osd-uuid 29c46248-ac5b-4179-aedb-2ac245da1cc9 --setuser ceph --setgroup ceph
 stderr: 2022-01-06T08:48:19.669+0000 7fa61ce15080 -1 bluestore(/var/lib/ceph/osd/ceph-4/) _read_fsid unparsable uuid
--> ceph-volume lvm prepare successful for: node1_hdd_vg2/hdd_lv2
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/node1_hdd_vg2/hdd_lv2 --path /var/lib/ceph/osd/ceph-4 --no-mon-config
Running command: /usr/bin/ln -snf /dev/node1_hdd_vg2/hdd_lv2 /var/lib/ceph/osd/ceph-4/block
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-4/block
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-3
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-4
Running command: /usr/bin/ln -snf /dev/node1_ssd_vg2/ssd_lv2 /var/lib/ceph/osd/ceph-4/block.db
Running command: /usr/bin/chown -h ceph:ceph /dev/node1_ssd_vg2/ssd_lv2
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-5
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-4/block.db
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-5
Running command: /usr/bin/ln -snf /dev/node1_nvme_vg2/nvme_lv2 /var/lib/ceph/osd/ceph-4/block.wal
Running command: /usr/bin/chown -h ceph:ceph /dev/node1_nvme_vg2/nvme_lv2
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-7
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-4/block.wal
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-7
--> ceph-volume lvm activate successful for osd ID: 4
--> ceph-volume lvm create successful for: node1_hdd_vg2/hdd_lv2
[root@node1 ~]#
```
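
For reference, the long podman command above assumes several shell variables are already set. The values below are only a hypothetical sketch: the container image reference is a placeholder, while the remaining values are taken from the logs in this report.

```
# Hypothetical example values for the variables used in the podman command above.
CLUSTERID=92e6589a-9717-423a-9fde-9238d1509c66                 # cluster FSID
OSDID=4                                                        # OSD being recreated
node_name=node1.rhcs4to5.com
ceph_image_id=registry.redhat.io/rhceph/rhceph-5-rhel8:latest  # placeholder image reference
data_lv=node1_hdd_vg2/hdd_lv2                                  # data LV (VG/LV)
db_lv=node1_ssd_vg2/ssd_lv2                                    # block.db LV
wal_lv=node1_nvme_vg2/nvme_lv2                                 # block.wal LV
```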

2. While activating the OSD with `ceph cephadm osd activate node1.rhcs4to5.com`, it fails with an error saying that the file `/var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/config` was not found. To work around this, I executed the commands below:

```
# mkdir /var/lib/ceph/$CLUSTERID/osd.$OSDID
# cephadm shell ceph config generate-minimal-conf
# vim /var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/config
# chown -R 167:167 /var/lib/ceph/$CLUSTERID/osd.$OSDID
# chmod 700 /var/lib/ceph/$CLUSTERID/osd.$OSDID && chmod 600 /var/lib/ceph/$CLUSTERID/osd.$OSDID/config
```
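
For what it's worth, a minimal sketch of the same workaround without the manual vim step, assuming the minimal conf can simply be redirected into the per-OSD config file (the `cephadm shell -- <command>` form runs the command inside the shell container and prints its output on the host):

```
# Sketch only: generate the minimal conf and write it directly into the
# osd.$OSDID config file, then fix ownership and permissions as above.
mkdir -p /var/lib/ceph/$CLUSTERID/osd.$OSDID
cephadm shell -- ceph config generate-minimal-conf > /var/lib/ceph/$CLUSTERID/osd.$OSDID/config
chown -R 167:167 /var/lib/ceph/$CLUSTERID/osd.$OSDID
chmod 700 /var/lib/ceph/$CLUSTERID/osd.$OSDID
chmod 600 /var/lib/ceph/$CLUSTERID/osd.$OSDID/config
```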

3. But now I am seeing the following error:

```
Jan  6 14:26:43 node1 bash[38594]: /bin/bash: /var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/unit.run: No such file or directory
Jan  6 14:26:43 node1 systemd[1]: ceph-92e6589a-9717-423a-9fde-9238d1509c66.service: Control process exited, code=exited status=127
Jan  6 14:26:43 node1 bash[38595]: /bin/bash: /var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/unit.poststop: No such file or directory
Jan  6 14:26:43 node1 systemd[1]: ceph-92e6589a-9717-423a-9fde-9238d1509c66.service: Failed with result 'exit-code'.
Jan  6 14:26:43 node1 systemd[1]: Failed to start Ceph osd.4 for 92e6589a-9717-423a-9fde-9238d1509c66.
```
Is there any way to fix this? Or is there a manual method to deploy the OSD (pre-created LVs with separate data/WAL/DB devices)?
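
In case it helps, a hedged sketch of what I would expect to regenerate the missing unit.run/unit.poststop files, assuming `cephadm deploy` accepts these flags on this build (the OSD fsid is the one printed by ceph-volume above; a keyring may also need to be supplied):

```
# Unverified sketch: ask cephadm to (re)write the systemd wrapper files
# (unit.run, unit.poststop) for the already-prepared OSD.
cephadm deploy \
    --fsid 92e6589a-9717-423a-9fde-9238d1509c66 \
    --name osd.4 \
    --config /var/lib/ceph/92e6589a-9717-423a-9fde-9238d1509c66/osd.4/config \
    --osd-fsid 29c46248-ac5b-4179-aedb-2ac245da1cc9

# Or, once the daemon is known to the mgr, redeploy it via the orchestrator:
ceph orch daemon redeploy osd.4
```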

NOTE: These manual steps are required when a customer used advanced LVM (pre-created LVM volumes) in RHCS 4; after upgrading to RHCS 5, this has to be addressed during OSD replacement.

Comment 1 RHEL Program Management 2022-01-11 10:50:07 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

