Bug 1319335

Summary: Document implication of collocating Ceph Monitors and OSP controllers
Product: Red Hat OpenStack
Reporter: Alexandre Marangone <amarango>
Component: documentation
Assignee: Dan Macpherson <dmacpher>
Status: CLOSED CURRENTRELEASE
QA Contact: RHOS Documentation Team <rhos-docs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: amarango, dmacpher, jliberma, mburns, smustard, srevivo
Target Milestone: ---
Keywords: Documentation
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-03 02:39:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Alexandre Marangone 2016-03-18 21:11:02 UTC
Description of problem:
In the current hardware specification requirements in our documentation, there is no mention of a configuration that runs both OSP Controllers and Ceph Monitors on the same nodes. We have heard of customers running into performance issues when running both on the same node.
The implications of running both sets of services on the same nodes should be explained.

Comment 2 Dan Macpherson 2016-04-26 02:05:44 UTC
(In reply to Alexandre Marangone from comment #0)
> Description of problem:
> In the current hardware specification requirement in our documentation,
> there's no mention of a config that runs both OSP Controllers and Ceph
> Monitors on the same nodes.  We have heard of customers running into
> performance issues when running both on the same node.
> The implication of running both set of services on the same nodes should be
> explained.

Hi Alexandre,

Can you provide some further details? Specifically:

* What performance issues are customers running into?
* What specific implications should be documented?

- Dan

Comment 3 Alexandre Marangone 2016-04-26 15:15:38 UTC
Hi Dan,

I'm adding Sheldon Mustard; he can comment better on the performance implications, since one of his customers ran into issues in a high-performance environment.

There's also https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/director-installation-and-usage/24-overcloud-requirement where the recommendations are made exclusively for the OSP Controller and do not take into account that OSP-d will colocate the Ceph Mons and OSP Controllers.
Usually for a Ceph Mon, we recommend SSDs for the mon store, as well as more memory (at least 16 GB).

Comment 4 Sheldon Mustard 2016-04-26 17:19:50 UTC
In the situations I have seen, the performance issues generally came down to local root disk performance and RAM utilization. Both can obviously be overcome, but as Alex said, I think either a disclaimer or a bump in the recommended specs would make sense.

The other issue is availability: with the mons running on the OpenStack controllers, they become a critical piece of the availability of the Ceph cluster. AFAIK the controllers could have issues and the cloud overall would be fine, but with the mons on them, the Ceph cluster would have issues if you lost more than 50% of them. Not a case that would happen often, but again, I think we should warn customers about this risk somewhere.
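The quorum arithmetic behind that ">50%" figure can be sketched as follows (my illustration, not from this BZ): a Ceph monitor cluster keeps quorum only while a strict majority of monitors is alive.

```shell
# Quorum requires a strict majority of monitors: floor(total/2) + 1.
mons_quorum() {
    local total=$1 alive=$2
    [ "${alive}" -ge $(( total / 2 + 1 )) ]
}

# With three colocated mons (one per controller), losing one
# controller keeps quorum; losing two does not.
mons_quorum 3 2 && echo "2 of 3 alive: quorum held"
mons_quorum 3 1 || echo "1 of 3 alive: quorum lost"
```

So in the typical three-controller deployment, losing two controllers takes the whole Ceph cluster read/write path down with them.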

Comment 5 jliberma@redhat.com 2016-05-23 20:05:46 UTC
Dan -- the main issue here is that we must encourage people not to under-spec the controller nodes if they will also be used as Ceph monitors.

So in the hardware requirements section, we should recommend that the controller nodes meet the minimum recommended requirements for a Ceph monitor node if they will be used as such.

I don't know the exact requirements, but to minimize the risk of performance problems, we should recommend:

1. At least 16 GB of RAM
2. SSD drives for the monitor store

There might be some additional doc work required to specify the location of the monitor store in OSPd to ensure it uses the SSD drives, or to mount the SSD drives in the appropriate place to ensure director uses them.

Thanks again, Jacob

Comment 6 Dan Macpherson 2016-05-25 03:53:38 UTC
(In reply to jliberma from comment #5)
> There might be some additional doc work required to specify the location of
> the monitor store in OSPd to ensure it uses the SSD drives, or to mount the
> SSD drives in the appropriate place to ensure director uses them.

I might need some help with this.

It looks like we'll need a script that does the following:

1. Checks /etc/fstab for an entry for /var/lib/ceph/mon. If one exists, stop; if not, continue.
2. Identifies the disk to use for the mon data
3. Formats it and adds a partition
4. Adds a mount in /etc/fstab at /var/lib/ceph/mon

The tricky part for me is step 2... What would be the best way to identify the disk to use?

jliberman, smustard, amarango -- any suggestions?

Comment 8 Dan Macpherson 2016-08-19 05:00:29 UTC
Does anyone have any further updates on comment #6?

Comment 9 jliberma@redhat.com 2016-08-22 13:21:23 UTC
I don't have a suggestion for identifying the SSD.
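One possible heuristic for step 2 (my suggestion, not something proposed in this BZ): lsblk reports a ROTA flag per device, which is 0 for non-rotational media such as SSDs.

```shell
# Heuristic sketch: list whole disks whose ROTA flag is 0
# (non-rotational, usually SSDs). Device names are examples only,
# and this does not distinguish the root disk from spare SSDs.
lsblk -d -n -o NAME,ROTA | awk '$2 == 0 { print "/dev/" $1 }'
```

This would still need a policy for excluding disks that are already in use, so it is at best a starting point.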

Comment 10 Dan Macpherson 2017-07-03 02:39:37 UTC
I think it's safe to close this bug, since composable services now allow you to split the Ceph Mon from the Controller if need be. This largely mitigates the issues with keeping the Ceph Mon on the Controller nodes. Beyond this, I don't think I can provide any further documentation than the commit implemented in comment #7 (which is now published [1]).

If further documentation is required for this issue, please feel free to reopen this BZ.

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/html/red_hat_ceph_storage_for_the_overcloud/introduction#setting_requirements