Bug 1738379

Summary: ceph-volume zap - output says zapping successful for "None" when zapping using osd-fsid
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Ceph-Volume
Assignee: Christina Meno <gmeno>
Status: CLOSED ERRATA
QA Contact: Ameena Suhani S H <amsyedha>
Severity: medium
Priority: medium
Version: 3.3
CC: amsyedha, assingh, ceph-eng-bugs, ceph-qe-bugs, tchandra, tserlin
Target Milestone: rc
Target Release: 4.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-14.2.4-81.el8cp, ceph-14.2.4-24.el7cp
Last Closed: 2020-01-31 12:46:52 UTC
Type: Bug

Description Vasishta 2019-08-07 03:49:40 UTC
Description of problem:
When an OSD is zapped using its fsid, the OSD itself is zapped correctly, but the output reports that OSD 'None' was zapped successfully.

Version-Release number of selected component (if applicable):
12.2.12-42.el7cp

How reproducible:
Always

Steps to Reproduce:
1. Zap an OSD using its fsid (see the example command below)
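For example, with <osd-fsid> as a placeholder for the fsid of the OSD being zapped:

# ceph-volume lvm zap --destroy --osd-fsid <osd-fsid>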

Comment 11 Alfredo Deza 2019-12-13 15:51:22 UTC
I would need to know how you are calling ceph-volume, not just the error at the very end. If possible, run it with CEPH_VOLUME_DEBUG=1, or include the relevant log portion that has the full traceback.
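For example, with <osd-fsid> as a placeholder, the variable can be set for a single run:

# CEPH_VOLUME_DEBUG=1 ceph-volume lvm zap --destroy --osd-fsid <osd-fsid>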

Comment 13 Alfredo Deza 2019-12-13 17:07:45 UTC
The environment variable you need to export is not "$CEPH_VOLUME_DEBUG" but "CEPH_VOLUME_DEBUG" (without the dollar sign).
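That is, set it with:

# export CEPH_VOLUME_DEBUG=1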

The error indicates that it couldn't find that OSD fsid. Are you sure it exists? You can check with `ceph-volume lvm list` to see whether that FSID is listed.
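For example, with <osd-fsid> as a placeholder, no output from the following means ceph-volume does not know that fsid:

# ceph-volume lvm list | grep '<osd-fsid>'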

Comment 14 Vasishta 2019-12-13 17:24:49 UTC
Thanks for the hint; I was trying with the cluster fsid instead of the OSD fsid, my bad.
I'm still hitting the issue with ceph version 12.2.12-84.el7cp.

# ceph-volume lvm zap --destroy --osd-fsid a6050577-520c-4829-bbde-79aa67cfb03c
--> Zapping: /dev/data_vg/data_lv1
--> Unmounting /var/lib/ceph/osd/ceph-7
Running command: umount -v /var/lib/ceph/osd/ceph-7
 stderr: umount: /var/lib/ceph/osd/ceph-7 (/dev/mapper/data_vg-data_lv1) unmounted
Running command: dd if=/dev/zero of=/dev/data_vg/data_lv1 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied
 stderr: , 0.00518822 s, 2.0 GB/s
--> Only 1 LV left in VG, will proceed to destroy volume group data_vg
Running command: vgremove -v -f data_vg
 stderr: Removing data_vg-data_lv1 (253:0)
 stderr: Archiving volume group "data_vg" metadata (seqno 30).
 stderr: Releasing logical volume "data_lv1"
 stderr: Creating volume group backup "/etc/lvm/backup/data_vg" (seqno 31).
 stdout: Logical volume "data_lv1" successfully removed
 stderr: Removing physical volume "/dev/sdb" from volume group "data_vg"
 stderr: Removing physical volume "/dev/sdd" from volume group "data_vg"
 stdout: Volume group "data_vg" successfully removed
--> Zapping: /dev/sdc1
Running command: dd if=/dev/zero of=/dev/sdc1 bs=1M count=10
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB) copied
 stderr: , 0.049838 s, 210 MB/s
--> Destroying partition since --destroy was used: /dev/sdc1
Running command: parted /dev/sdc --script -- rm 1
--> Zapping successful for OSD: None

Comment 15 Vasishta 2019-12-13 17:33:04 UTC
I've created a separate tracker to track the change for 3.x, as this BZ is being used for 4.x.
Please share your thoughts on comment 14 in bug 1783412.

Regards,
Vasishta

Comment 17 Alfredo Deza 2019-12-18 13:30:36 UTC
(In reply to Ameena Suhani S H from comment #16)
> Hi
> I tried the following,  what exactly are we expecting to be printed, fsid or
> osd id in "Zapping successful for OSD: "? 

Either. The user can provide an ID, an FSID, or both; when both are given, the ID is preferred for the reporting, otherwise the FSID is used (a small sketch of this rule follows at the end of this comment).

So you are basically confirming this bug is fixed :)


> Because currently, it is printing
> fsid on a successful zap of osd. 
> 
> # podman exec ceph-osd-7 ceph-volume lvm zap --destroy --osd-fsid
> f7f4e626-b36c-49e9-959d-bda20506a458
> --> Zapping:
> /dev/ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0/osd-data-20c141d4-7297-4284-
> 88a9-ac20b47e643c
> Running command: /usr/sbin/wipefs --all
> /dev/ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0/osd-data-20c141d4-7297-4284-
> 88a9-ac20b47e643c
> Running command: /bin/dd if=/dev/zero
> of=/dev/ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0/osd-data-20c141d4-7297-
> 4284-88a9-ac20b47e643c bs=1M count=10
>  stderr: 10+0 records in
> 10+0 records out
> 10485760 bytes (10 MB, 10 MiB) copied, 0.048219 s, 217 MB/s
> --> Only 1 LV left in VG, will proceed to destroy volume group
> ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0
> Running command: /usr/sbin/vgremove -v -f
> ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0
>  stderr: Removing
> ceph--d1b80b26--cabe--487e--ace5--16709f4ba2a0-osd--data--20c141d4--7297--
> 4284--88a9--ac20b47e643c (253:1)
>  stderr: Archiving volume group "ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0"
> metadata (seqno 21).
>  stderr: Releasing logical volume
> "osd-data-20c141d4-7297-4284-88a9-ac20b47e643c"
>  stderr: Creating volume group backup
> "/etc/lvm/backup/ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0" (seqno 22).
>  stdout: Logical volume "osd-data-20c141d4-7297-4284-88a9-ac20b47e643c"
> successfully removed
>  stderr: Removing physical volume "/dev/sdc" from volume group
> "ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0"
>  stdout: Volume group "ceph-d1b80b26-cabe-487e-ace5-16709f4ba2a0"
> successfully removed
> --> Zapping successful for OSD: f7f4e626-b36c-49e9-959d-bda20506a458
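
A minimal sketch of that reporting rule, purely for illustration (this is plain shell, not the actual ceph-volume code; the variable names are made up, and the fsid is taken from the run quoted above):

osd_id=""                                        # not supplied on the command line
osd_fsid="f7f4e626-b36c-49e9-959d-bda20506a458"  # supplied via --osd-fsid
echo "--> Zapping successful for OSD: ${osd_id:-$osd_fsid}"

With only the fsid set, the message carries the fsid, matching the corrected output above; the original bug printed "None" in that position instead.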

Comment 18 Ameena Suhani S H 2019-12-19 04:16:43 UTC
Based on Alfredo Deza's comment #17, moving this BZ to the VERIFIED state.

Version-Release number of selected component:

ceph version 14.2.4-85.el8cp

Comment 21 errata-xmlrpc 2020-01-31 12:46:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0312