Description of problem:
The deployment of an overcloud with dedicated Ceph failed with an error:
ceph auth import -i /var/lib/ceph/mds/ceph-overcloud-blockstorage-2/keyring returned 1 instead of one of [0]
The reason is that the keyring is being set from each node that runs the MDS service, but the admin keyring is not present on a node that runs only the MDS service (unlike OSD and monitor nodes, which have the admin keyring).
Version-Release number of selected component (if applicable):
puppet-ceph-2.3.0-0.20170220143056.9f52c30.el7ost.noarch
How reproducible:
100%
Steps to Reproduce:
1. In roles_data.yaml, define a node type dedicated to Ceph MDS (without any other Ceph service).
2. Deploy the Overcloud with at least 1 dedicated MDS node.
Actual results:
Deployment failed with an error
Error: /bin/true # comment to satisfy puppet syntax requirements
set -ex
ceph auth import -i /var/lib/ceph/mds/ceph-overcloud-blockstorage-2/keyring returned 1 instead of one of [0]
Error: /Stage[main]/Ceph::Profile::Mds/Ceph::Key[mds.overcloud-blockstorage-2]/Exec[ceph-injectkey-mds.overcloud-blockstorage-2]/returns: change from notrun to 0 failed: /bin/true # comment to satisfy puppet syntax requirements
set -ex
ceph auth import -i /var/lib/ceph/mds/ceph-overcloud-blockstorage-2/keyring returned 1 instead of one of [0]
overcloud.AllNodesDeploySteps.ControllerDeployment_Step3:
resource_type: OS::Heat::StructuredDeploymentGroup
physical_resource_id: 21abeda6-9468-4b5e-9f45-d7e3919b13de
status: CREATE_FAILED
status_reason: |
CREATE aborted
overcloud.AllNodesDeploySteps.CephStorageDeployment_Step3:
resource_type: OS::Heat::StructuredDeploymentGroup
physical_resource_id: a53adb9a-9b36-437a-9a70-41084fbd2fb6
status: CREATE_FAILED
status_reason: |
CREATE aborted
Expected results:
The deployment is successful
Additional info:
(In reply to Yogev Rabl from comment #2)
> There is a workaround for this bug, in role_data.yaml, add the role
> 'CephClient' to the node type with 'CephMDS'.
hi Yogev, thanks.
While we work on a general fix, the general rule for the CephClient service is as follows:
1) operators should always include the CephClient service on any role that has a Ceph service, *except* when the CephMon service is also deployed on that same role
2) if there is CephMon on a role, CephClient should *not* be enabled on that same role
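The two rules above can be checked mechanically before deploying. The following is only a rough sketch, not part of any TripleO tooling: it assumes the roles have already been loaded from roles_data.yaml into a list of dicts with `name` and `ServicesDefault` keys, that services are named like `OS::TripleO::Services::CephMds`, and that the Ceph services of interest are CephMon, CephOSD and CephMds. The function names and the role names in the sample data are made up for illustration.

```python
# Sketch only: the service set and the role layout below are assumptions,
# not taken from the TripleO source.
CEPH_SERVICES = {"CephMon", "CephOSD", "CephMds"}


def short_name(service):
    """Return the last component of a service name, e.g. 'CephMds'."""
    return service.rsplit("::", 1)[-1]


def check_role(role):
    """Return a problem description for one role, or None if it follows the rule."""
    services = {short_name(s) for s in role.get("ServicesDefault", [])}
    has_mon = "CephMon" in services
    has_client = "CephClient" in services
    has_ceph = bool(services & CEPH_SERVICES)

    # Rule 2: CephMon and CephClient must not share a role.
    if has_mon and has_client:
        return "CephClient must not be enabled on a role that has CephMon"
    # Rule 1: any other Ceph service needs CephClient on the same role.
    if has_ceph and not has_mon and not has_client:
        return "CephClient is missing on a role with a Ceph service"
    return None


def check_roles(roles):
    """Map role name -> problem description for every role that breaks the rule."""
    problems = {}
    for role in roles:
        problem = check_role(role)
        if problem is not None:
            problems[role["name"]] = problem
    return problems


# Hypothetical roles, written as the parsed form of roles_data.yaml entries.
sample_roles = [
    {"name": "DedicatedMds",
     "ServicesDefault": ["OS::TripleO::Services::CephMds"]},
    {"name": "DedicatedMdsFixed",
     "ServicesDefault": ["OS::TripleO::Services::CephMds",
                         "OS::TripleO::Services::CephClient"]},
    {"name": "Monitor",
     "ServicesDefault": ["OS::TripleO::Services::CephMon"]},
]

if __name__ == "__main__":
    for name, problem in check_roles(sample_roles).items():
        print(f"{name}: {problem}")
```

With the sample data, only the role that carries CephMds without CephClient is reported, which is the configuration from the steps to reproduce; adding CephClient to that role, as in the workaround, clears it.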