Bug 1848700

Summary: [Cephadm] Add new OSD is unable to complete in 5.0 ceph cluster with ceph orch command
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Preethi <pnataraj>
Component: Cephadm
Assignee: Juan Miguel Olmo <jolmomar>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: high
Docs Contact: Karen Norteman <knortema>
Priority: unspecified
Version: 5.0
CC: sewagner, tserlin, vereddy
Target Release: 5.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-16.0.0-7209.el8cp
Doc Type: No Doc Update
Last Closed: 2021-08-30 08:25:38 UTC
Type: Bug

Comment 5 Preethi 2020-11-19 17:18:56 UTC
The issue is not seen when the all-available-devices OSD service is set to unmanaged. The output below is included for reference.
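For reference, the unmanaged flag mentioned above is set with the standard cephadm syntax for the all-available-devices OSD service (a sketch of the command; the exact spec used in this cluster is assumed):

    ceph orch apply osd --all-available-devices --unmanaged=true

With the service unmanaged, cephadm stops creating OSDs automatically on devices as they become available, so a zapped disk stays free until an OSD is added explicitly with ceph orch daemon add osd.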



Step 1) Remove OSD ID 3 from host magna067.
Step 2) Remove OSD ID 4 from host magna067 with the replace option, as mentioned in the steps (example commands below).
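The removals in steps 1 and 2 correspond to commands of the following form (a sketch based on the step descriptions; OSD IDs are taken from the steps and --replace comes from step 2):

    ceph orch osd rm 3
    ceph orch osd rm 4 --replace
    ceph orch osd rm status    # monitor drain/removal progress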

ceph status after OSDs 3 and 4 have been removed:

[ceph: root@magna094 /]# ceph -s
  cluster:
    id:     c97c2c8c-0942-11eb-ae18-002590fbecb6
    health: HEALTH_WARN
            1 pool(s) full
 
  services:
    mon: 3 daemons, quorum magna094,magna067,magna073 (age 6h)
    mgr: magna094.hussmr(active, since 6h), standbys: magna067.cudixx
    mds: test:1 {0=test.magna076.xymdrn=up:active} 2 up:standby
    osd: 26 osds: 25 up (since 96s), 25 in (since 81s)
    rgw: 2 daemons active (myorg.us-east-1.magna092.bxiihn, myorg.us-east-1.magna093.nhekwk)
 
  data:
    pools:   19 pools, 553 pgs
    objects: 427 objects, 416 KiB
    usage:   1.4 GiB used, 23 TiB / 23 TiB avail
    pgs:     553 active+clean
 
  io:
    client:   85 B/s rd, 0 op/s rd, 0 op/s wr


[ceph: root@magna094 /]# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME          STATUS  REWEIGHT  PRI-AFF
 -1         22.74246  root default                                
 -5          0.90970      host magna067                           
  5    hdd   0.90970          osd.5          up   1.00000  1.00000
 -7          2.72910      host magna073                           
  6    hdd   0.90970          osd.6          up   1.00000  1.00000
  7    hdd   0.90970          osd.7          up   1.00000  1.00000
  8    hdd   0.90970          osd.8          up   1.00000  1.00000
-17          2.72910      host magna075                           
 11    hdd   0.90970          osd.11         up   1.00000  1.00000
 17    hdd   0.90970          osd.17         up   1.00000  1.00000
 23    hdd   0.90970          osd.23         up   1.00000  1.00000
-15          2.72910      host magna076                           
 13    hdd   0.90970          osd.13         up   1.00000  1.00000
 19    hdd   0.90970          osd.19         up   1.00000  1.00000
 25    hdd   0.90970          osd.25         up   1.00000  1.00000
-19          2.72910      host magna077                           
  9    hdd   0.90970          osd.9          up   1.00000  1.00000
 15    hdd   0.90970          osd.15         up   1.00000  1.00000
 21    hdd   0.90970          osd.21         up   1.00000  1.00000
-13          2.72910      host magna079                           
 10    hdd   0.90970          osd.10         up   1.00000  1.00000
 16    hdd   0.90970          osd.16         up   1.00000  1.00000
 22    hdd   0.90970          osd.22         up   1.00000  1.00000
-11          2.72910      host magna092                           
 12    hdd   0.90970          osd.12         up   1.00000  1.00000
 18    hdd   0.90970          osd.18         up   1.00000  1.00000
 24    hdd   0.90970          osd.24         up   1.00000  1.00000
 -9          2.72910      host magna093                           
 14    hdd   0.90970          osd.14         up   1.00000  1.00000
 20    hdd   0.90970          osd.20         up   1.00000  1.00000
 26    hdd   0.90970          osd.26         up   1.00000  1.00000
 -3          2.72910      host magna094                           
  0    hdd   0.90970          osd.0          up   1.00000  1.00000
  1    hdd   0.90970          osd.1          up   1.00000  1.00000
  2    hdd   0.90970          osd.2          up   1.00000  1.00000
  4                0  osd.4                down         0  1.00000


[ceph: root@magna094 /]# ceph orch device zap magna067 /dev/sdb --force
/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
/bin/podman:stderr --> Zapping: /dev/sdb
/bin/podman:stderr --> Zapping lvm member /dev/sdb. lv_path is /dev/ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e/osd-block-c36d0031-7e3c-4208-862e-ec1354f990f0
/bin/podman:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e/osd-block-c36d0031-7e3c-4208-862e-ec1354f990f0 bs=1M count=10 conv=fsync
/bin/podman:stderr  stderr: 10+0 records in
/bin/podman:stderr 10+0 records out
/bin/podman:stderr  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0980268 s, 107 MB/s
/bin/podman:stderr --> Only 1 LV left in VG, will proceed to destroy volume group ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e
/bin/podman:stderr Running command: /usr/sbin/vgremove -v -f ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e
/bin/podman:stderr  stderr: Removing ceph--bf46a6a3--5949--4f99--b29c--02001e801b8e-osd--block--c36d0031--7e3c--4208--862e--ec1354f990f0 (253:2)
/bin/podman:stderr  stderr: Archiving volume group "ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e" metadata (seqno 5).
/bin/podman:stderr  stderr: Releasing logical volume "osd-block-c36d0031-7e3c-4208-862e-ec1354f990f0"
/bin/podman:stderr  stderr: Creating volume group backup "/etc/lvm/backup/ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e" (seqno 6).
/bin/podman:stderr  stdout: Logical volume "osd-block-c36d0031-7e3c-4208-862e-ec1354f990f0" successfully removed
/bin/podman:stderr  stderr: Removing physical volume "/dev/sdb" from volume group "ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e"
/bin/podman:stderr  stdout: Volume group "ceph-bf46a6a3-5949-4f99-b29c-02001e801b8e" successfully removed
/bin/podman:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/sdb bs=1M count=10 conv=fsync
/bin/podman:stderr  stderr: 10+0 records in
/bin/podman:stderr 10+0 records out
/bin/podman:stderr 10485760 bytes (10 MB, 10 MiB) copied, 0.136872 s, 76.6 MB/s
/bin/podman:stderr --> Zapping successful for: <Raw Device: /dev/sdb>
[ceph: root@magna094 /]# ceph orch daemon add osd magna067:/dev/sdb
Created osd(s) 3 on host 'magna067'
[ceph: root@magna094 /]# 


Adding the OSD completes successfully after the disk is cleaned up, as expected.

Comment 8 errata-xmlrpc 2021-08-30 08:25:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294