Bug 1832405

Summary: ceph-ansible admin_secret seems to be different from the one found in client.ceph.admin.keyring
Product: Red Hat OpenStack
Reporter: Giulio Fidente <gfidente>
Component: openstack-tripleo-heat-templates
Assignee: Giulio Fidente <gfidente>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: high
Priority: high
Version: 16.0 (Train)
CC: akupczyk, amcleod, aschoen, bhubbard, ccamacho, ceph-eng-bugs, dzafman, gcharot, gmeno, gsitlani, johfulto, jpretori, kchai, lbezdick, mburns, morazi, nojha, nthomas, rzarzyns, slinaber, sseshasa, ykaul
Target Milestone: rc
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200603183438.7b2c249.el8ost
Doc Type: Removed functionality
Doc Text:
In this release of Red Hat OpenStack Platform, you can no longer customize the Red Hat Ceph Storage cluster admin keyring secret. Instead, the admin keyring secret is generated randomly during initial deployment.
Last Closed: 2020-07-29 07:52:21 UTC
Type: Bug
Attachments (description, flags):
ceph.conf (flags: none)
quorum_status.out (flags: none)
strace_ceph_s.out (flags: none)
the inventory file, including all vars set for the run (flags: none)
ceph.client.admin.keyring (flags: none)
playbook run logs (flags: none)

Description Giulio Fidente 2020-05-06 16:43:53 UTC
Description of problem:
Upgrading the first MON member of a Luminous cluster to the latest Nautilus container image makes the Nautilus client "ceph -s" command fail and return "[errno 1] error connecting to the cluster", even though the Ceph cluster itself appears quorate and in good health.

Attaching ceph.conf, the quorum_status output gathered via the Ceph admin socket, and the strace output of "ceph -s".

Version-Release number of selected component (if applicable):
ceph-common-14.2.8-53.el8cp.x86_64

How reproducible:
always

Steps to Reproduce:
Upgrade the first MON instance to the RHCS 4 container image (using the ceph-ansible rolling_update.yml playbook).

Actual results:
"ceph -s" will exit with "[errno 1] error connecting to the cluster"

Comment 1 RHEL Program Management 2020-05-06 16:44:01 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 2 Giulio Fidente 2020-05-06 16:45:53 UTC
Created attachment 1685806 [details]
ceph.conf

Comment 3 Giulio Fidente 2020-05-06 16:46:42 UTC
Created attachment 1685807 [details]
quorum_status.out

Comment 4 Giulio Fidente 2020-05-06 16:47:56 UTC
Created attachment 1685808 [details]
strace_ceph_s.out

Comment 9 Giulio Fidente 2020-05-07 17:12:20 UTC
It looks like we pass the correct admin_secret to ceph-ansible, but it is ignored. Attaching the inventory with full vars, the playbook log and ceph.client.admin.keyring.

Version of ceph-ansible is:
ceph-ansible-3.2.40-1.el7cp.noarch
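
A quick way to confirm the mismatch described in this comment is to compare the admin_secret value from the inventory with the key stored in the keyring file. The following is a minimal sketch, assuming the keyring uses the usual INI-like layout (a "[client.admin]" section with a "key = ..." line). The function names and the sample key are hypothetical placeholders, not values from the attachments.

```python
def read_keyring_key(keyring_text, entity="client.admin"):
    """Return the 'key' value of `entity` from keyring text, or None if absent.

    Assumes an INI-like layout: a "[entity]" header followed by "name = value" lines.
    """
    section = None
    for raw_line in keyring_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == entity and "=" in line:
            # Split on the first "=" only: base64 keys may end with "=" padding.
            name, value = line.split("=", 1)
            if name.strip() == "key":
                return value.strip()
    return None


def secrets_match(admin_secret, keyring_text):
    """True when the supplied admin_secret equals the client.admin key in the keyring."""
    return read_keyring_key(keyring_text) == admin_secret


# Placeholder keyring content; the key below is NOT a real Ceph key.
SAMPLE_KEYRING = """[client.admin]
\tkey = PLACEHOLDERKEY000000000000000000000000==
\tcaps mon = "allow *"
"""

if __name__ == "__main__":
    print(secrets_match("PLACEHOLDERKEY000000000000000000000000==", SAMPLE_KEYRING))
    print(secrets_match("SOMEOTHERKEY==", SAMPLE_KEYRING))
```

If the comparison is false for the deployed cluster, the client is presenting a secret that the MONs do not hold for client.admin, which would be consistent with the "[errno 1] error connecting to the cluster" failure reported above.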

Comment 10 Giulio Fidente 2020-05-07 17:17:33 UTC
Created attachment 1686245 [details]
the inventory file, including all vars set for the run

Comment 11 Giulio Fidente 2020-05-07 17:18:59 UTC
Created attachment 1686246 [details]
ceph.client.admin.keyring

Comment 12 Giulio Fidente 2020-05-07 17:23:53 UTC
Created attachment 1686247 [details]
playbook run logs

Comment 22 Giulio Fidente 2020-05-11 14:19:51 UTC
ceph-ansible generates a random secret for client.admin when none is given, but it does not support replacing the client.admin secret on an existing cluster.

Considering that we have OSP13 deployments in the field whose client.admin secret was created by ceph-ansible, and that we cannot replace that secret on upgrade, the least impactful solution to this issue seems to be to drop support for CephAdminKey.
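
The behaviour described in this comment can be illustrated with a toy model. This is only a sketch of the decision logic as summarized here, not ceph-ansible code: the function name is hypothetical, the random value is a stand-in rather than a real Ceph key format, and the assumption that a supplied secret is honoured on initial deployment is inferred from the discussion.

```python
import base64
import secrets


def resolve_admin_secret(existing_secret, requested_secret):
    """Toy model of which client.admin secret a cluster ends up with.

    existing_secret:  secret already stored in the cluster, or None on first deploy
    requested_secret: secret supplied by the deployer (admin_secret), or None
    """
    if existing_secret is not None:
        # The cluster already has a client.admin secret; it is never replaced,
        # so any different requested_secret is effectively ignored.
        return existing_secret
    if requested_secret is not None:
        # Assumption: on initial deployment a supplied secret is used as given.
        return requested_secret
    # Nothing supplied: a random secret is generated (placeholder encoding only).
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


if __name__ == "__main__":
    # Upgrade case: the deployer passes "NEW", but the cluster keeps "OLD".
    print(resolve_admin_secret("OLD", "NEW"))
```

In the upgrade case the effective secret stays the original one, so a client configured with the newly supplied value would fail to authenticate.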

Comment 30 Yogev Rabl 2020-06-15 14:15:33 UTC
verified

Comment 31 Alex McLeod 2020-06-16 12:32:24 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Comment 35 errata-xmlrpc 2020-07-29 07:52:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148