Bug 1377867
| Summary: | Prepare OSDs with GPT label to help ensure Ceph deployment will succeed | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> |
| Component: | rhosp-director | Assignee: | John Fulton <johfulto> |
| Status: | CLOSED DUPLICATE | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | high | Docs Contact: | Don Domingo <ddomingo> |
| Priority: | high | | |
| Version: | 11.0 (Ocata) | CC: | alan_bishop, dbecker, dcritch, ddomingo, gfidente, jliberma, johfulto, jomurphy, mburns, mcornea, morazi, racedoro, rhel-osp-director-maint, rybrown, scohen, yrabl |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | 11.0 (Ocata) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Enhancement |
| Doc Text: | A disk can be in a variety of states which may cause director to fail when attempting to make the disk a Ceph OSD. In previous releases, a user could run a first-boot script to erase the disk and set the GPT label required by Ceph. With this release, a new default setting in Ironic erases the disks when a node is set to available, and a change in puppet-ceph gives each disk a GPT label if it does not already have one. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-10 18:55:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1418040, 1432309 | | |
| Bug Blocks: | 1387433, 1399824 | | |
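The Doc Text describes two mechanisms: Ironic erasing disks during cleaning, and puppet-ceph adding a GPT label where one is missing. The label-if-missing logic can be sketched as a small shell helper. This is a hypothetical illustration, not the actual puppet-ceph code; the function names are invented, and it assumes `blkid` and `sgdisk` from util-linux and gdisk.

```shell
#!/bin/bash
# Sketch of "give the disk a GPT label only if it has none".
# Helper names are illustrative; not taken from the bug report.

# Return 0 if the given partition-table type string names a GPT label.
is_gpt() {
    [ "$1" = "gpt" ]
}

# Label a device with GPT only when it does not already carry one.
ensure_gpt_label() {
    local device="$1" pttype
    # blkid low-level probing reports the partition-table type
    # (gpt, dos, or empty for an unlabeled disk).
    pttype=$(blkid -p -o value -s PTTYPE "${device}")
    if ! is_gpt "${pttype}"; then
        echo "No GPT label on ${device}; creating one"
        sgdisk -Z "${device}"    # zap any existing MBR/GPT structures
        sgdisk -og "${device}"   # write a fresh, empty GPT
        partprobe "${device}"    # ask the kernel to re-read the table
    fi
}
```

Running `ensure_gpt_label /dev/sdX` against a blank or MBR-labeled disk is destructive, so it belongs only in provisioning paths like the first-boot script below.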
Description
Alexander Chuzhoy
2016-09-20 20:33:14 UTC
*** Bug 1257307 has been marked as a duplicate of this bug. ***

*** Bug 1312190 has been marked as a duplicate of this bug. ***

*** Bug 1391199 has been marked as a duplicate of this bug. ***

The wipe-disk script provided in the documentation [1] does not fully clean the disk if the disk was a PV for LVM. When such a disk is cleaned, the kernel needs to re-scan it to get the updated information so that it can be added to Ceph.

[1] https://access.redhat.com/documentation/en/red-hat-openstack-platform/10/single/red-hat-ceph-storage-for-the-overcloud/#Formatting_Ceph_Storage_Nodes_Disks_to_GPT

Our team encountered this issue when we repurposed a server that had an LVM partition left over from its previous duty. Here is the wipe_disk code we developed that works for us:
```yaml
wipe_disk:
  type: OS::Heat::SoftwareConfig
  properties:
    config: |
      #!/bin/bash
      if [[ $HOSTNAME =~ "cephstorage" ]]; then
        {
          # LVM partitions are always in use by the kernel. Destroy all of
          # the LVM components here so the disks are not in use and sgdisk
          # and partprobe can do their thing.

          # Destroy all the logical volumes
          lvs --noheadings -o lv_name | awk '{print $1}' | while read lv; do
            cmd="lvremove -f $lv"
            echo $cmd
            $cmd
          done

          # Destroy all the volume groups
          vgs --noheadings -o vg_name | awk '{print $1}' | while read vg; do
            cmd="vgremove -f $vg"
            echo $cmd
            $cmd
          done

          # Destroy all the physical volumes
          pvs --noheadings -o pv_name | awk '{print $1}' | while read pv; do
            cmd="pvremove -ff $pv"
            echo $cmd
            $cmd
          done

          lsblk -dno NAME,TYPE | while read disk type; do
            # Skip if the device type isn't "disk" or if it's mounted
            [ "${type}" == "disk" ] || continue
            device="/dev/${disk}"
            if grep -q ^${device}[1-9] /proc/mounts; then
              echo "Skipping ${device} because it's mounted"
              continue
            fi
            echo "Partitioning disk: ${disk}"
            sgdisk -og ${device}
            echo
          done

          partprobe
          parted -lm
        } > /root/wipe-disk.txt 2>&1
      fi
```
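For the script above to run at first boot, director needs it wrapped in user data and registered for the nodes. A minimal sketch of the usual TripleO pattern, assuming the `wipe_disk` resource is saved in a template named `firstboot/wipe-disks.yaml` (the filename is illustrative, not from this bug):

```yaml
# firstboot/wipe-disks.yaml -- wraps the SoftwareConfig in user data
heat_template_version: 2014-10-16

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
        - config: {get_resource: wipe_disk}

  wipe_disk:
    type: OS::Heat::SoftwareConfig
    # properties/config as in the snippet above

outputs:
  OS::stack_id:
    value: {get_resource: userdata}
```

An environment file passed to `openstack overcloud deploy -e` then maps it in:

```yaml
resource_registry:
  OS::TripleO::NodeUserData: firstboot/wipe-disks.yaml
```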
David has proposed a fix upstream for this: https://review.openstack.org/#/c/420992/

*** Bug 1252158 has been marked as a duplicate of this bug. ***

Update: Changes in Ironic will affect this BZ. See bug 1418040 [1] for details on how we plan to test this.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1418040

Upstream change has merged and will be in OSP11 GA, possibly RC1: https://review.openstack.org/#/c/420992

Summary: A disk can be in a variety of states which may cause director to fail when attempting to make the disk a Ceph OSD. In previous releases, a user could use a first-boot script to erase the disk and set the GPT label required by Ceph. A new default setting in Ironic will erase the disks when a node is set to available, and a change in puppet-ceph will give the disk a GPT label if it does not already have one.

New default Ironic behaviour verified: https://bugzilla.redhat.com/show_bug.cgi?id=1418040

Shall we close this as a duplicate of bug 1418040? Alternatively, we could wait for the change in https://review.openstack.org/#/c/420992 to arrive in our package and set the fixed-in flag.

*** This bug has been marked as a duplicate of bug 1418040 ***

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
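The "new default setting in Ironic" discussed in the comments is automated node cleaning, which wipes disk metadata as a node transitions to available. As a hedged sketch (the option name is the documented `undercloud.conf` knob, but the default value varies by release), it is toggled like this:

```ini
[DEFAULT]
# When true, Ironic erases disk metadata on each node as it is set to
# "available", so stale LVM or partition data cannot break OSD creation.
clean_nodes = true
```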