Description of problem:
When deploying an overcloud, we see the ceph service installed and enabled on all the nodes (not just on the ceph nodes):

systemctl | grep ceph
ceph.service    loaded active exited    LSB: Start Ceph distributed file system daemons at boot time

Version-Release number of selected component (if applicable):
Using sprint 5 bits

How reproducible:
100%

Steps to Reproduce:
1. Install a default overcloud, even one without any ceph nodes at all
2. Log in to the controller or the compute nodes
3. Run "systemctl | grep ceph"

Actual results:
ceph.service is installed and active

Expected results:
Only ceph-mon is needed, and it should not be enabled if you didn't install a ceph node and you're not using ceph.
In addition, the admin key (/etc/ceph/ceph.client.admin.keyring) is not needed on the compute nodes.
Ceph has different daemons for different purposes. The Overcloud is expected to have ceph-mon running on the controller nodes, ceph-osd on the ceph nodes, and no ceph daemons on the compute nodes. The ceph.service unit, though, is a wrapper around the /etc/init.d/ceph LSB script, not a native systemd service unit, and the init script is enabled for runlevels 2, 3, 4 and 5. systemd will therefore report this unit as "active (exited)" on all nodes, because it has launched the script; you should have only ceph-mon actually running on controllers and only ceph-osd on ceph nodes. Can you please check whether this applies to your env and update the ticket?
1) ceph is "active (exited)" on the controller, and ceph-mon is running.
2) ceph-osd is running on the ceph node.
3) On the computes there is only ceph, which is "active (exited)".

So, it looks as you expected it to. However, why then do we need ceph on the computes if it will never run anything? Shouldn't it be disabled at boot?
We need to install ceph on the compute nodes because they use the python bindings and the cli tools (they are ceph clients).

The init.d script is enabled by a postinstall script of the 'ceph' package via chkconfig; the LSB header in /etc/init.d/ceph says so:

# Provides:      ceph
# Default-Start: 2 3 4 5
# Default-Stop:  0 1 6

and the postinstall script does the following:

postinstall scriptlet (using /bin/sh):
/sbin/ldconfig
/sbin/chkconfig --add ceph

so we don't have much control over this process in tripleo. The init script seems to be doing the right thing though: on computes it does not start any daemon, because computes are not listed as Monitors, OSDs or MDSs, while on controllers and ceph nodes it starts what is expected, as per your comment #3.
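The chkconfig mechanism described above can be sketched as follows. The heredoc is a stand-in for the real /etc/init.d/ceph header, and the sed one-liner is an illustration of how the Default-Start field drives which runlevels the script is enabled for; it is not the actual chkconfig code.

```shell
# Stand-in copy of the LSB header from /etc/init.d/ceph (illustration only).
cat > /tmp/ceph-lsb-header <<'EOF'
# Provides:      ceph
# Default-Start: 2 3 4 5
# Default-Stop:  0 1 6
EOF

# `chkconfig --add` reads Default-Start to decide which runlevels to enable
# the script for; extract that field the way a shell one-liner might.
start_levels=$(sed -n 's/^# Default-Start:[[:space:]]*//p' /tmp/ceph-lsb-header)
echo "ceph would be enabled for runlevels: $start_levels"
```

This is why the unit shows up on every node: enabling for runlevels 2-5 happens at package install time, before any role-specific configuration runs.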
Since we use a single unified image for all roles, the ceph package is installed on the compute nodes even though they are just a client and they don't need it, and with this package this service runs (and exits) at boot time. It might be worth it to disable this service on compute nodes, but further investigation has to be done to make sure that such a change won't break some required ceph logic...
Compute nodes do need the ceph python bindings and the ceph cli tools, just not the mon/osd/mds daemons, so there is no use for the init script there. The init script is installed and also enabled by default by the package though, see comment #4. It looks like the most sensible thing to do would be to not enable the init script by default from the package. If so, I think the bug should be reassigned to ceph.
The right thing to do would be not to install ceph, and only install ceph-common (which contains the client tools). Our problem, though, is that we have one image for all roles... Therefore, the way I see it, we have two options: either disable the service on the compute nodes, or alternatively just forget about this bug (perhaps document it as a known issue). I don't agree that the bug should be reassigned to the ceph team; you simply can't tell them that their enabling of the service by default is a bug.
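The first option could be implemented as a role check in the image's configuration step. A minimal sketch, where the role names and the echo standing in for the real `systemctl disable ceph` (or `chkconfig ceph off`) are assumptions for illustration:

```shell
# Decide per role whether the ceph LSB service should stay enabled.
# Hypothetical helper; a real fix would run `systemctl disable ceph`
# (or `chkconfig ceph off`) instead of echoing.
should_disable_ceph() {
    case "$1" in
        compute)         return 0 ;;  # client only: no mon/osd/mds daemons
        controller|ceph) return 1 ;;  # ceph-mon / ceph-osd must start at boot
        *)               return 1 ;;
    esac
}

for role in controller compute ceph; do
    if should_disable_ceph "$role"; then
        echo "$role: would run 'systemctl disable ceph'"
    else
        echo "$role: leaving ceph init script enabled"
    fi
done
```

Whether disabling the unit on computes breaks any ceph client logic would still need the investigation mentioned in comment #5.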
Udi, if enabling the init script by default is not a bug (which indeed I do not think it is) then this might get fixed by the packages being split; adding kdreyer who might help in this regard. Ken, is there a subset of ceph packages we can/should install on the compute nodes to get the cli tools and the python bindings but not the init script?
This is what we see on the compute node:

rpm -qa | grep ceph
libcephfs1-0.80.7-0.4.el7.x86_64
python-cephfs-0.80.7-0.4.el7.x86_64
ceph-0.80.7-0.4.el7.x86_64
ceph-common-0.80.7-2.el7.x86_64

I think we would not have this issue if we only installed ceph-common, and not the ceph server package. Our problem is that we install everything on all nodes, and then the division into roles is done by enabling/disabling the appropriate services. Splitting the packages (if they're not split already) won't help us, if I understand correctly.
(In reply to Giulio Fidente from comment #8)
> Ken, is there a subset of ceph packages we can/should install on the compute
> nodes to get the cli tools and the python bindings but not the init script?

Yes, the idea of the "ceph-common" package is that it's a "ceph clients" package, and it should have all of the utilities that you'd want on your client. This package has no init script because there are no persistent daemons here.

For servers, the package is named "ceph". That server package will ultimately be split into ceph/ceph-mon/ceph-osd/ceph-mds. (The split is already done in RH Ceph downstream, and I'm still working on getting the split done upstream.) This is the package that has the init script.
Please note that as of Ceph 10.2.0 "Jewel", the server packages are now split, and "ceph" is simply an empty meta-package. The relevant server packages are "ceph-osd", "ceph-mon", or "ceph-mds".
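Given the Jewel split, the per-role package selection discussed in this thread could be sketched like this. The helper function is purely illustrative (not a TripleO interface); the package names are the ones named in the comments above.

```shell
# Map an overcloud role to the ceph packages it actually needs after the
# Jewel package split (illustrative helper, not a real TripleO function):
# clients get only ceph-common; servers add the role-specific daemon package.
packages_for_role() {
    case "$1" in
        compute)    echo "ceph-common" ;;           # cli tools + python bindings, no daemons
        controller) echo "ceph-common ceph-mon" ;;
        ceph)       echo "ceph-common ceph-osd" ;;
    esac
}

packages_for_role compute
```

With per-role images (or per-role package lists), installing only these sets would avoid the stray init script on compute nodes entirely.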
This bug is filed against a Version which has reached End of Life. If it's still present in a supported release (http://releases.openstack.org), please update the Version field and reopen.