Description of problem:

When containerized Ceph OSDs are deployed with ceph-ansible, the resulting container names and service names of the OSDs do not correspond in any way to the OSD number and are thus difficult to find and use (details below). This differs from non-containerized Ceph, where OSD services are identified by OSD number, and OSD processes can easily be found by searching for "-i NNN" where NNN is the OSD number. While logs are forwarded to the hypervisor in RHOSP, it is again hard to read them there, because you need to do "journalctl -u ceph-osd@XXXX" where XXXX is the OSD's block device name, not its number.

The RHOSP Ceph DFG team's comment was that this is a ceph-ansible issue and really nothing to do with OpenStack, which I think is correct. They suggested that I file this bug so that we could start tracking this problem and discussing what to do about it. I am concerned that this will never be fixed if it is part of RHOSP 13, a long-term release, because of the upgrade requirements associated with fixing it.

Version-Release number of selected component (if applicable):

ceph-ansible master branch (sorry, I'm not sure which branches and tags get used for RHCS or RHOSP).

How reproducible:

Every time.

Steps to Reproduce:
1. Deploy RHOSP 12.
2. Try to find the ID of the container that houses OSD N.
3. Try to start/stop OSD N.
4. Try to look at the logs for the container for OSD N.

Actual results:

You can't easily do any of these things. These are common tasks for Ceph admins, because block devices fail for a variety of reasons, including hardware failure, upgrades, etc.

Expected results:

The container name should embed the OSD number so that the container can be directly referenced while it is running. The service name should embed the OSD number so that the OSD can be directly started/stopped. A consequence would be that it becomes easy to examine OSD logs with "journalctl -u ceph-osd@N".

Additional info:

A script to find the container corresponding to OSD N is:

for c in `docker ps | grep ceph-osd | awk '{ print $1 }'` ; do
  echo -n "$c "
  docker exec -it $c bash -c 'ls /var/run/ceph/ceph-osd.*.log'
done

But if OSDs are flapping (going up and down), the container ID may no longer be valid by the time you use it. Container names are more stable, but a reboot could cause a container's block device name to no longer match the container name, since Linux does not guarantee block device name stability across reboots.
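Since container names are more stable than IDs, a variant of the script above that keys off names instead of IDs may be slightly more robust. This is only a sketch, assuming docker's --format option is available and the same /var/run/ceph/ceph-osd.*.log naming used in the loop above:

# list each ceph-osd container by name together with the OSD log file it holds
for name in $(docker ps --format '{{.Names}}' | grep ceph-osd) ; do
  echo -n "$name "
  docker exec $name bash -c 'ls /var/run/ceph/ceph-osd.*.log'
done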
Here's a suggestion for how it could work; I have briefly discussed this with LVM developer Joe Thornber (ejt).

-- preparation:

Am I right that the problem is that we don't know the OSD number at the time when we first want to name the container and the unit? There is a way around that: ask Ceph to assign an OSD number up front with osdnum=$(ceph osd create) and then pass this into the script/container that prepares the OSD. Then, when the OSD is prepared with something like "ceph-volume lvm prepare --osd-id NNN", ceph-volume would tag the volume using LVM's tagging feature, which stores the data in the VG:

# lvchange --addtag "cephosd=NNN" /dev/vg_...

Unfortunately ceph-volume does not currently accept an OSD number on the command line, so there is no way to pass it into "ceph-volume lvm prepare"; we would need to fix ceph-volume to do this, and fix the container that does the OSD preparation to use it. ceph-disk used to have this feature, by the way. From "ceph-disk prepare --help":

  --osd-id ID    unique OSD id to assign this disk to

If we are preparing a non-LVM OSD device, it will be filestore (right, Alfredo?) and "ceph-osd --mkfs" will write the "whoami" file containing the OSD number into the mountpoint directory, as before.

-- activation:

If we encounter an LVM volume with the LVM tag "cephosd=NNN", then we know to activate the OSD of that number on this volume. If we encounter a "simple" OSD device (created by ceph-disk) and it is using filestore, we dig the OSD number out by mounting it and reading the "whoami" file in the mountpoint, as before.
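On the activation side, recovering the number from the tag would then be a simple LVM query. A rough sketch, assuming the "cephosd=NNN" tag naming proposed above (my own proposal, not what ceph-volume actually stores) and placeholder VG/LV names:

# list all LVs that carry the proposed tag
lvs --noheadings -o lv_path,lv_tags | grep cephosd=

# extract the OSD number for one LV (VG/LV names are placeholders)
osdnum=$(lvs --noheadings -o lv_tags /dev/vg_ceph/lv_osd | tr ',' '\n' | sed -n 's/.*cephosd=//p')
echo "activate OSD $osdnum on this volume"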
Indeed, a nice-to-have, but not on our roadmap at the moment. I can't tell you exactly when this will be available. This won't be in RHCS 3.1, and is unlikely in 3.2, so maybe 3.3.

Thanks.
*** Bug 1628713 has been marked as a duplicate of this bug. ***
This is still not fixed in RHOSP 13, AFAICT. It is a major problem if you are trying to troubleshoot or maintain a containerized Ceph cluster. I know we're going to Rook and Kubernetes, but in the meantime there are a lot of sites trying to get by with what we have, and Kubernetes isn't running everywhere yet.

Suppose you have to try to start the OSD, but you don't know which unit file to use? How do you find out?

I would settle for just embedding both the OSD number and the device name in the container name. Then you could just do

docker ps | grep osd | grep _N_ | awk '{ print $1 }'

to get the container ID. As for the unit file, this could be done with symlinks, for example - because when the unit file is created, Ceph knows what OSD number goes with that block device.
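For what it's worth, with that (proposed, not current) naming the lookup could be wrapped in a tiny helper. A sketch, where the _N_ token in the container name is the hypothetical convention suggested above:

# print the container ID for OSD $1, assuming the container name embeds the OSD number as _N_
osd_container() {
  docker ps | grep osd | grep "_$1_" | awk '{ print $1 }'
}

# e.g. restart OSD 12's container
docker restart $(osd_container 12)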
(In reply to leseb from comment #5)
> Indeed, a nice-to-have, but not on our roadmap at the moment. I can't tell
> you exactly when this will be available. This won't be in RHCS 3.1, and is
> unlikely in 3.2, so maybe 3.3.
>
> Thanks.

Why not? This looks like an important supportability item to me, and not very difficult to implement (I reckon - I did not look at this any more than the brief comments above!)
The containerized Ceph documentation says to restart the container using its block device name, but does not say how to determine what that is: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/administering-ceph-clusters-that-run-in-containers#starting-stopping-and-restarting-ceph-daemons-that-run-in-containers
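In the meantime, one way to work out which host and block device back a given OSD is "ceph osd metadata". A sketch (the OSD id, the field names shown, and the device name are examples; the exact metadata fields vary by release):

# which host and device(s) back OSD 12?
ceph osd metadata 12 | grep -E '"hostname"|"devices"'

# then, on the reported host (device name is an example):
systemctl restart ceph-osd@sdb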
Yaniv, comment 5 was a long time ago, and I realized that this could be implemented as part of the ceph-volume containerization happening here: https://github.com/ceph/ceph-ansible/pull/2866. So this is ongoing; my goal is to have this in 3.2. Thanks.
Present in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0beta5
Hi,

We observed that OSD service names include the corresponding OSD IDs only when osd_scenario is set to 'lvm'; when osd_scenario is set to 'collocated' or 'non-collocated', the OSD service names still use device names (illustrated below). We think this will create confusion for users and thus affects usability.

Moving back to ASSIGNED state; please let us know if there are any concerns.

Regards,
Vasishta Shastry
QE, Ceph
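To make the observed difference concrete, the two naming schemes look roughly like this on an OSD node (the OSD id and device name are examples):

# osd_scenario: lvm - unit name carries the OSD id
systemctl status ceph-osd@2

# osd_scenario: collocated / non-collocated - unit name still carries the device
systemctl status ceph-osd@sdb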
You must use lvm for both containerized and non-containerized deployments. collocated and non-collocated are not encouraged anymore; you should do all your testing on lvm. Thanks
What about sites that are being upgraded? These will still have the problem, yes? If so, will the Bluestore migration in RHCS 4 remove this issue?
Yes, existing deployed OSDs will still have the @<disk> naming.
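For OSDs that end up (re)deployed via the lvm scenario, "ceph-volume lvm list" can be used to map block devices back to OSD ids; in a containerized deployment it would have to run inside an OSD container. For example (the container name below is an assumption, not the actual naming):

# map lvm-prepared devices to OSD ids on this node
ceph-volume lvm list

# containerized example (substitute a real OSD container name)
docker exec <osd-container> ceph-volume lvm list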
lgtm, thanks!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0020