Bug 1423001
Summary: | OpenStack Director updates to an even number of dedicated Ceph monitor nodes | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Yogev Rabl <yrabl> |
Component: | openstack-tripleo-validations | Assignee: | Giulio Fidente <gfidente> |
Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
Severity: | urgent | Docs Contact: | Derek <dcadzow> |
Priority: | urgent | ||
Version: | 11.0 (Ocata) | CC: | aschultz, gfidente, jjoyce, jomurphy, jschluet, kbader, m.andre, mburns, mcornea, rhel-osp-director-maint, slinaber, tvignaud |
Target Milestone: | beta | ||
Target Release: | 11.0 (Ocata) | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-validations-5.4.0-6.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-05-17 20:00:07 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Description
Yogev Rabl
2017-02-16 19:59:34 UTC
We don't support this kind of validation for roles or node counts, so this would need to be an RFE.

Yogev, from the 'ceph status' output you sent me, the update appears to have completed successfully, producing a 4-node cluster in a healthy state. I agree it would be better to use an odd number of monitors, but Ceph itself doesn't prevent you from using an even number, so I am not sure director should.

In addition to the description: a fresh deployment of two dedicated Ceph monitor nodes ended successfully with both of them in quorum. The templates were set to use the block-storage node as a dedicated Ceph monitor node. The Overcloud topology was:
- 3 Control nodes
- 2 Block storage nodes
- 3 Ceph storage nodes (each with 10 OSDs)
- 2 Compute nodes

The deployment started without any warning or sign that there would be an even number of Ceph monitors in the cluster.

I am adding a warning message in the post-deployment validations, printed if the cluster is in HEALTH_WARN state. If Ceph returns HEALTH_OK with two or any other even number of ceph-mon instances, I don't think we should stop deployment of an even number of nodes in tripleo.

The problem isn't so much even or odd; it's that three monitors are required for HA. A transitional state of 4 monitors is not problematic. The problem arises during leader election (Paxos). There are situations where you would want to have an even number:
* Scaling from 3 monitors to an eventual 5
* Provisioning a 4th monitor with the intention of retiring an old monitor

When an operator goes to deploy an HA OSP control plane, is this something that is enforced programmatically? For example, if HA OSP requires 3 controller nodes and the templates contain 2, do we block installation? If we do block installation, then we should have a way of enforcing similar requirements for other components (e.g. Ceph).

Hi Kyle, thanks for commenting.
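The point above about three monitors being required for HA follows from majority-based quorum: Ceph monitors need a strict majority of the monitor set to form a quorum. A minimal sketch of that arithmetic is below; the function names are illustrative and not part of Ceph or director.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of a monitor set of the given size."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority still remains."""
    return monitors - quorum_size(monitors)


# Compare monitor counts: 2 monitors tolerate no failure,
# and 4 monitors tolerate the same single failure as 3.
for n in range(1, 6):
    print(f"{n} monitors: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this arithmetic a 4-monitor transitional state is no less available than 3 monitors, while a 2-monitor cluster loses quorum as soon as either monitor fails.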
Currently OSPd does not enforce (nor block) the deployment of a specific number of MONs, OSDs or even MDSs. Instead, this BZ added a post-install validation task which prints a warning message if the Ceph cluster is not in HEALTH_OK at the end of the deployment. My goal is to make OSPd more verbose if Ceph is in a warning state; given that deploying an even number of nodes isn't even worth a warning in Ceph, I don't think OSPd should prevent that.

Verified on openstack-tripleo-validations-5.4.0-7.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245
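The post-install check described in this report (warn when the cluster is not in HEALTH_OK) could look roughly like the sketch below. This is not the actual openstack-tripleo-validations code, which is not shown in this report. It assumes the caller has already captured the text printed by `ceph health`, whose first token is the status (HEALTH_OK, HEALTH_WARN or HEALTH_ERR). The function name and the message wording are hypothetical.

```python
from typing import Optional


def health_warning(ceph_health_output: str) -> Optional[str]:
    """Return a warning message unless the output reports HEALTH_OK.

    `ceph_health_output` is assumed to be the raw text of `ceph health`.
    """
    text = ceph_health_output.strip()
    # The first whitespace-separated token carries the overall status.
    status = text.split()[0] if text else "UNKNOWN"
    if status == "HEALTH_OK":
        return None
    return ("WARNING: Ceph cluster is not in HEALTH_OK "
            f"(reported: {text or 'no output'})")


# The detail text below is a made-up input, not real cluster output.
for sample in ("HEALTH_OK", "HEALTH_WARN example detail text"):
    message = health_warning(sample)
    if message:
        print(message)
```

The check only warns and never fails the deployment, matching the intent stated above of making OSPd more verbose rather than blocking a given monitor count.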