Bug 1555793

Summary: Ceph-Ansible shrink-osd.yml does not work with naming convention for NVME devices
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: jquinn <jquinn>
Component: Ceph-Ansible Assignee: Sébastien Han <shan>
Status: CLOSED DUPLICATE QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: low Docs Contact:
Priority: unspecified    
Version: 3.0 CC: adeza, aschoen, ceph-eng-bugs, gmeno, mhackett, molasaga, nthomas, sankarshan
Target Milestone: rc   
Target Release: 3.*   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-20 09:44:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1553254    

Description jquinn 2018-03-14 19:41:11 UTC
Description of problem:

The main problems we have detected here are the following:

https://github.com/ceph/ceph-ansible/blob/master/infrastructure-playbooks/shrink-osd.yml#L110
    -In this case the awk pattern is not the best option, because the command matches every OSD whose name starts with the given OSD number. With more than 10 OSDs, shrinking an OSD this way becomes a problem (see the sketch after the output below). Example:
/usr/sbin/ceph-disk list | awk -v pattern=osd.1 '$0 ~ pattern {print $1}' will print all devices matching the OSD names osd.1 and osd.1{0..9}.
[root@ceph2-5 ~]# ceph-disk list
/dev/vdb1 ceph data, active, cluster ceph, osd.1, journal /dev/vdb2
/dev/vdc1 ceph data, active, cluster ceph, osd.11, journal /dev/vdc2
/dev/vdd1 ceph data, active, cluster ceph, osd.12, journal /dev/vdc2
/dev/vde1 ceph data, active, cluster ceph, osd.13, journal /dev/vdc2
/dev/vdf1 ceph data, active, cluster ceph, osd.14, journal /dev/vdc2
/dev/vdg1 ceph data, active, cluster ceph, osd.15, journal /dev/vdc2
/dev/vdh1 ceph data, active, cluster ceph, osd.16, journal /dev/vdc2
/dev/vdi1 ceph data, active, cluster ceph, osd.17, journal /dev/vdc2
/dev/vdj1 ceph data, active, cluster ceph, osd.18, journal /dev/vdc2
/dev/vdk1 ceph data, active, cluster ceph, osd.19, journal /dev/vdc2
[root@ceph2-5 ~]# ceph-disk list | awk -v pattern=osd.1 '$0 ~ pattern {print $1}'
/dev/vdb1
/dev/vdc1
/dev/vdd1
/dev/vde1
/dev/vdf1
/dev/vdg1
/dev/vdh1
/dev/vdi1
/dev/vdj1
/dev/vdk1
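
A minimal sketch of a possible workaround (not necessarily the shipped fix), assuming ceph-disk always prints a comma right after the OSD id as in the output above: terminating the pattern with that comma makes the match exact, so osd.1 no longer also matches osd.10..osd.19:

[root@ceph2-5 ~]# ceph-disk list | awk -v pattern='osd.1,' '$0 ~ pattern {print $1}'
/dev/vdb1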

https://github.com/ceph/ceph-ansible/blob/master/infrastructure-playbooks/shrink-osd.yml#L121
    -In this case, when NVMe devices are used, the partition name is built from the device name plus "p" and the partition number, not just the partition number appended to the device name:
sda --> sda1
nvme0n1 --> nvme0n1p1
So, when NVMe devices are used, this line would have to become name: "ceph-osd@{{ item.0.stdout[:-2] | regex_replace('/dev/', '') }}"
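
A minimal sketch of the naming difference, using sed purely for illustration (the playbook's actual Jinja2 filter may differ): a suffix pattern that optionally consumes the "p" strips the partition number correctly for both naming schemes:

[root@ceph2-5 ~]# echo /dev/vdb1 | sed -r 's/p?[0-9]+$//'
/dev/vdb
[root@ceph2-5 ~]# echo /dev/nvme0n1p1 | sed -r 's/p?[0-9]+$//'
/dev/nvme0n1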


Version-Release number of selected component (if applicable): RHCS 3.x



Additional info: Additional BZs for NVMe devices with ceph-ansible. Should this be resolved in 3.0z1 as part of ceph-ansible 3.0.25 as well?


Same issue in purge-docker-cluster.yml  ---   https://bugzilla.redhat.com/show_bug.cgi?id=1547999    should be fixed in 3.0z1

Install with NVMe fails for the same reason ---   https://bugzilla.redhat.com/show_bug.cgi?id=1541016    should be fixed in 3.0z1

Install of NVMe in a container fails -- https://bugzilla.redhat.com/show_bug.cgi?id=1537980  should be fixed in 3.0z1

Comment 3 Sébastien Han 2018-04-20 09:44:40 UTC

*** This bug has been marked as a duplicate of bug 1561456 ***