Description of problem:
When OSD_DEVICE cannot be determined, `ceph-osd-run.sh` attempts to start the OSD container with an invalid image name.
Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 3.3
How reproducible:
This occurs when there are no OSD devices, e.g. after a failed re-deployment of OSDs.
Actual results:
dockerd-current[1234]: time="2020-06-22T20:10:16.462419051+02:00" level=error msg="Handler for POST /v1.26/containers/create returned error: No such image: 2020-06-22:latest"
ceph-osd-run.sh[123456]: Unable to find image '2020-06-22:latest' locally
Expected results:
Cases where OSD_DEVICE is empty should be handled gracefully: the script should not attempt to start the container with an invalid image name, and a meaningful error message should be printed instead.
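A minimal sketch of a guard that would give this behavior (the variable names OSD_DEVICE and OSD_ID follow the function quoted under "Additional info" below; the message wording and placement are only illustrative, not the shipped fix):

# Illustrative guard, placed before the docker invocation in ceph-osd-run.sh.
# Refuse to build a docker command line around an empty OSD_DEVICE.
if [[ -z "${OSD_DEVICE}" ]]; then
    echo "ERROR: no device found for osd.${OSD_ID}, not starting the OSD container" >&2
    exit 1
fi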
Additional info:
The OSD_DEVICE is derived here:
---->8----
function id_to_device () {
    DATA_PART=$(docker run --rm --ulimit nofile=1024:4096 --privileged=true -v /dev/:/dev/ -v /etc/ceph:/etc/ceph:z --entrypoint ceph-disk registry.access.redhat.com/rhceph/rhceph-3-rhel7:3-32 list | grep ", osd\.${1}," | awk '{ print $1 }')
    if [[ "${DATA_PART}" =~ ^/dev/(cciss|nvme|loop) ]]; then
        OSD_DEVICE=${DATA_PART:0:-2}
    else
        OSD_DEVICE=${DATA_PART:0:-1}
    fi
}
----8<-----
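One way the function could fail gracefully is to validate DATA_PART before the substring expansion. This is only a sketch against the code quoted above, not the actual patch:

function id_to_device () {
    DATA_PART=$(docker run --rm --ulimit nofile=1024:4096 --privileged=true -v /dev/:/dev/ -v /etc/ceph:/etc/ceph:z --entrypoint ceph-disk registry.access.redhat.com/rhceph/rhceph-3-rhel7:3-32 list | grep ", osd\.${1}," | awk '{ print $1 }')
    # Stop here if ceph-disk reported no data partition for this OSD id,
    # instead of slicing an empty string and going on to run docker anyway.
    if [[ -z "${DATA_PART}" ]]; then
        echo "ERROR: unable to determine the OSD device for osd.${1}" >&2
        exit 1
    fi
    if [[ "${DATA_PART}" =~ ^/dev/(cciss|nvme|loop) ]]; then
        OSD_DEVICE=${DATA_PART:0:-2}
    else
        OSD_DEVICE=${DATA_PART:0:-1}
    fi
}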
This is the output of ceph-disk list:
---->8-----
/dev/sda :
/dev/sda2 other, iso9660
/dev/sda1 other, vfat
/dev/sda3 other, xfs, mounted on /
/dev/sdb :
/dev/sdb1 ceph data, prepared, cluster ceph, journal /dev/sdg1
/dev/sdc :
/dev/sdc1 ceph data, prepared, cluster ceph, journal /dev/sdg2
/dev/sdd :
/dev/sdd1 ceph data, prepared, cluster ceph, journal /dev/sdg3
/dev/sde :
/dev/sde1 ceph data, prepared, cluster ceph, journal /dev/sdg4
/dev/sdf other, unknown
/dev/sdg :
/dev/sdg1 ceph journal, for /dev/sdb1
/dev/sdg2 ceph journal, for /dev/sdc1
/dev/sdg3 ceph journal, for /dev/sdd1
/dev/sdg4 ceph journal, for /dev/sde1
/dev/sdg5 ceph journal, for /dev/sdh1
/dev/sdh :
/dev/sdh1 ceph data, prepared, cluster ceph, journal /dev/sdg5
----8<-----
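None of the lines above contain the ", osd.<id>," token the function greps for (the OSDs are only "prepared", not activated), so DATA_PART comes back empty. This can be confirmed by replaying the same lookup against the captured listing; the file name and the osd id 0 are just examples:

# Replay the lookup id_to_device performs, against the listing saved to a file.
# The grep matches nothing, so DATA_PART is empty and OSD_DEVICE never gets
# a valid value.
OSD_ID=0
DATA_PART=$(grep ", osd\.${OSD_ID}," /tmp/ceph-disk-list.txt | awk '{ print $1 }')
echo "DATA_PART='${DATA_PART}'"    # prints: DATA_PART=''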
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 3.3 security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3504