Bug 1218168 - ceph.service should only be running on the ceph nodes, not on the controller and compute nodes
Keywords:
Status: CLOSED EOL
Alias: None
Product: RDO
Classification: Community
Component: openstack-tripleo
Version: trunk
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: Kilo
Assignee: Giulio Fidente
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-05-04 10:51 UTC by Udi Kalifon
Modified: 2016-05-19 15:44 UTC
CC: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-19 15:44:40 UTC
Embargoed:



Description Udi Kalifon 2015-05-04 10:51:57 UTC
Description of problem:
When deploying an overcloud, we see the ceph service installed and enabled on all the nodes (not just on the ceph nodes):

systemctl |grep ceph
ceph.service       loaded active exited    LSB: Start Ceph distributed file system daemons at boot time
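
A related check, purely illustrative (it assumes the stock unit name), is whether systemd considers the unit enabled at all:

  systemctl is-enabled ceph.service    # for an LSB script this reflects the chkconfig runlevel state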


Version-Release number of selected component (if applicable):
Using sprint 5 bits


How reproducible:
100% 


Steps to Reproduce:
1. Install a default overcloud, even one without any ceph nodes at all
2. Log in to the controller or the compute nodes
3. Run "systemctl |grep ceph"


Actual results:
ceph.service is installed and active


Expected results:
Only ceph-mon should be needed, and it should not be enabled if no ceph nodes were deployed and ceph is not in use.

Comment 1 Udi Kalifon 2015-05-04 10:53:08 UTC
In addition, the admin key (/etc/ceph/ceph.client.admin.keyring) is not needed on the compute nodes.

Comment 2 Giulio Fidente 2015-05-04 12:06:48 UTC
Ceph has different daemons for different purposes: the overcloud is expected to have ceph-mon running on the controller nodes, ceph-osd on the ceph nodes, and no ceph daemons on the compute nodes.

The ceph.service unit, however, is a wrapper around the /etc/init.d/ceph LSB script rather than a native systemd service unit, and the init script is enabled for runlevels 2, 3, 4 and 5.

systemd will therefore report this unit as "active (exited)" on all nodes because it has launched the script; still, only ceph-mon should actually be running on the controllers and only ceph-osd on the ceph nodes.

Can you please check whether this applies to your environment and update the ticket?
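
For reference, one illustrative way to distinguish the wrapper unit from the actual daemons on a given node (commands assume the default unit and daemon names):

  systemctl status ceph.service              # the LSB wrapper; expected to show "active (exited)"
  ps -ef | grep -E 'ceph-(mon|osd|mds)'      # the real daemons, if any are running on this node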

Comment 3 Udi Kalifon 2015-05-04 14:55:40 UTC
1) ceph is "active (exited)" on the controller, and ceph-mon is running.

2) ceph-osd is running on the ceph node.

3) On the compute nodes there is only ceph, which is "active (exited)".

So it looks as you expected. However, why do we need ceph on the compute nodes if it will never run anything? Shouldn't it be disabled at boot?

Comment 4 Giulio Fidente 2015-05-04 16:08:36 UTC
We need to install ceph on the compute nodes because they use the Python bindings and the CLI tools (they are ceph clients).

The init.d script is enabled by a postinstall scriptlet of the 'ceph' package via chkconfig; the LSB header in /etc/init.d/ceph declares:

  # Provides:          ceph
  # Default-Start:     2 3 4 5
  # Default-Stop:      0 1 6

and the postinstall script does the following:

  postinstall scriptlet (using /bin/sh):
  /sbin/ldconfig
  /sbin/chkconfig --add ceph

So we don't have much control over this process in TripleO.

The init script does seem to be doing the right thing, though: on the compute nodes it does not start any daemon, because they are not listed as monitors, OSDs or MDSs, while on the controllers and the ceph nodes it starts what is expected, as per your comment #3.
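
As an aside, a rough way to inspect both the enablement and what the init script would consider for a given host (illustrative commands, assuming the stock firefly packaging and that the daemons are defined in /etc/ceph/ceph.conf):

  chkconfig --list ceph                               # runlevels the LSB script is enabled for
  grep -E '^\[(mon|osd|mds)\.' /etc/ceph/ceph.conf    # daemon sections the init script may act on for this host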

Comment 5 Udi Kalifon 2015-05-04 17:40:55 UTC
Since we use a single unified image for all roles, the ceph package is installed on the compute nodes even though they are just clients and don't need it, and with this package installed the service runs (and exits) at boot time. It might be worth disabling this service on the compute nodes, but further investigation has to be done to make sure that such a change won't break some required ceph logic...
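
If disabling it on the compute nodes turns out to be safe, a minimal sketch of that mitigation (untested, assuming the stock script/unit names) would be:

  systemctl disable ceph.service    # for a SysV script this is forwarded to chkconfig
  # or, equivalently:
  chkconfig ceph off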

Comment 6 Giulio Fidente 2015-05-04 18:41:18 UTC
Compute nodes do need the ceph Python bindings and the ceph CLI tools, just not the mon/osd/mds daemons, so there is no use for the init script there.

The init script is installed, and also enabled by default, by the package though; see comment #4.

It looks like the most sensible thing to do would be for the package not to enable the init script by default. If so, I think the bug should then be reassigned to ceph.

Comment 7 Udi Kalifon 2015-05-06 12:08:49 UTC
The right thing to do would be not to install ceph, and only install ceph-common (which provides the client tools). Our problem, though, is that we have one image for all roles... Therefore, the way I see it, we have two options: either disable the service on the compute nodes, or just forget about this bug (perhaps document it as a known issue).
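
Purely for illustration (a sketch assuming the firefly-era package names, not a tested change), a client-only node would ideally carry just the client bits:

  yum install ceph-common python-cephfs    # CLI tools and Python bindings, no init script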

I don't agree that the bug should be reassigned to the ceph team; you simply can't tell them that enabling the service by default is a bug.

Comment 8 Giulio Fidente 2015-05-06 12:52:47 UTC
Udi, if enabling the init script by default is not a bug (which indeed I do not think it is) then this might get fixed by the packages being split; adding kdreyer who might help in this regard.

Ken, is there a subset of ceph packages we can/should install on the compute nodes to get the cli tools and the python bindings but not the init script?

Comment 9 Udi Kalifon 2015-05-06 13:36:43 UTC
This is what we see on the compute node:

rpm -qa | grep ceph
libcephfs1-0.80.7-0.4.el7.x86_64
python-cephfs-0.80.7-0.4.el7.x86_64
ceph-0.80.7-0.4.el7.x86_64
ceph-common-0.80.7-2.el7.x86_64

I think we would not have this issue if we installed only ceph-common and not the ceph server package. Our problem is that we install everything on all nodes, and the division into roles is then done by enabling/disabling the appropriate services. Splitting the packages (if they're not split already) won't help us, if I understand correctly.

Comment 10 Ken Dreyer (Red Hat) 2015-05-18 17:54:34 UTC
(In reply to Giulio Fidente from comment #8)
> Ken, is there a subset of ceph packages we can/should install on the compute
> nodes to get the cli tools and the python bindings but not the init script?

Yes, the idea of the "ceph-common" package is that it's a "ceph clients" package, and it should have all of the utilities that you'd want on your client. This package has no init script because there are no persistent daemons here.

For servers, the package is named "ceph". And that server package will ultimately be split into ceph/ceph-mon/ceph-osd/ceph-mds. (The split is already done in RH Ceph downstream, and I'm still working on getting the split done upstream.) This is the package that has the init script.
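
A quick way to confirm which installed package actually ships the init script (illustrative commands, to be run on a node where the packages are installed):

  rpm -qf /etc/init.d/ceph              # expected to name the 'ceph' server package
  rpm -ql ceph-common | grep init.d     # expected to print nothing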

Comment 13 Ken Dreyer (Red Hat) 2016-04-22 14:56:51 UTC
Please note that as of Ceph 10.2.0 "Jewel", the server packages are now split, and "ceph" is simply an empty meta-package.

The relevant server packages are "ceph-osd", "ceph-mon", or "ceph-mds".
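
With the split packaging the daemons also ship native per-instance systemd units, so a check along these lines (illustrative; unit names assume the Jewel defaults) replaces grepping for the old LSB wrapper:

  systemctl list-units 'ceph*'    # lists ceph-mon@<id>, ceph-osd@<id>, ceph.target, etc., only where they actually run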

Comment 14 Chandan Kumar 2016-05-19 15:44:40 UTC
This bug is against a Version which has reached End of Life.
If it is still present in a supported release (http://releases.openstack.org), please update the Version field and reopen.

