Bug 1802815 - Using a wrongly formatted CephX secret causes the deployment to fail
Summary: Using a wrongly formatted CephX secret causes the deployment to fail
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-13 23:04 UTC by Darin Sorrentino
Modified: 2020-05-14 12:16 UTC
CC List: 11 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200405044622.ec9970c.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-14 12:15:31 UTC
Target Upstream Version:
Embargoed:


Attachments
ansible.log from the deployment attempt (111.01 KB, application/gzip)
2020-02-13 23:04 UTC, Darin Sorrentino
Templates and deployment script (8.34 KB, application/gzip)
2020-02-14 22:30 UTC, Darin Sorrentino
ceph_ansible_command.log (1.89 MB, application/gzip)
2020-02-14 22:31 UTC, Darin Sorrentino
Logs requested by Giulio (1.90 MB, application/gzip)
2020-02-17 13:53 UTC, Darin Sorrentino
Including the file I missed because I apparently didn't read the whole e-mail first. Sorry! (8.43 KB, text/plain)
2020-02-17 13:59 UTC, Darin Sorrentino


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1864185 | 0 | None | None | None | 2020-02-21 10:37:19 UTC
OpenStack gerrit | 709094 | 0 | None | MERGED | Check Ceph*Key value format and halt on error | 2020-06-17 05:26:39 UTC
OpenStack gerrit | 710799 | 0 | None | MERGED | Check Ceph*Key value format and halt on error | 2020-06-17 05:26:38 UTC
Red Hat Product Errata | RHBA-2020:2114 | 0 | None | None | None | 2020-05-14 12:16:01 UTC

Description Darin Sorrentino 2020-02-13 23:04:26 UTC
Created attachment 1663019 [details]
ansible.log from the deployment attempt

Description of problem:

Following instructions:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/integrating_an_overcloud_with_an_existing_red_hat_ceph_cluster/index

Deployment fails with:

        "failed: [overcloud-novacompute-0 -> 192.168.200.121] (item={'caps': {'mgr': 'allow *', 'mon': 'profile rbd', 'osd': 'profile rbd pool=volumes, profile rbd pool=backups, profile rbd pool=vms, profile rbd pool=images, profile rbd pool=metrics'}, 'key': 'AQBqbcRdoQJwEhAA1dKnpQK6sT53EV0F12TFZA', 'mode': '0600', 'name': 'client.openstack'}) => changed=false ",
        "  msg: path /etc/ceph/ceph.client.openstack.keyring does not exist",
        "  path: /etc/ceph/ceph.client.openstack.keyring",


Attaching ansible.log.gz

Comment 1 John Fulton 2020-02-14 15:41:00 UTC
Please supply your complete heat templates, ceph-ansible version, and ceph container version. I'll set a needinfo in the bug. I don't want to close it for lack of information. I want to find what went wrong and help you get this working.

Comment 2 John Fulton 2020-02-14 15:59:38 UTC
I was able to determine which ceph container you're using based on the ansible log: 

        "ceph-container-common : pulling director16.ctlplane.homelab.net:8787/rhceph/rhceph-4-rhel8:4-14 image -- 24.74s",

If you can, please also supply the isolated ceph-ansible log; it would make things easier. You provided the config-download ansible log, but this might be a ceph-ansible bug, so I'd like to see the clean ceph-ansible log. Its path is indicated in the ansible log:

[fultonj@runcible bz1802815]$ grep immediate\ log ansible.log
2020-02-13 17:47:43,423 p=256064 u=mistral |  TASK [tripleo-ceph-run-ansible : run ceph-ansible (immediate log at /var/lib/mistral/overcloud/ceph-ansible/ceph_ansible_command.log)] ***
[fultonj@runcible bz1802815]$ 

One thing I definitely need, though, is the version of ceph-ansible you're using.

In summary I'm requesting the following from the undercloud:

- /var/lib/mistral/overcloud/ceph-ansible/ceph_ansible_command.log
- the output of 'rpm -q ceph-ansible' 

Thanks.
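
For anyone collecting the same data, here is a minimal Python sketch of the grep above: it pulls the "immediate log at" path out of the config-download ansible.log text. The function name find_ceph_ansible_log_paths and the regular expression are assumptions made for illustration, based only on the task line quoted in this comment; they are not part of TripleO or ceph-ansible.

```python
import re

# Assumption: the task name embeds the path as "(immediate log at <path>)",
# as seen in the ansible.log line quoted in this comment.
IMMEDIATE_LOG_RE = re.compile(r"\(immediate log at ([^)\s]+)\)")


def find_ceph_ansible_log_paths(ansible_log_text):
    """Return the distinct ceph-ansible log paths, in order of first appearance."""
    paths = []
    for match in IMMEDIATE_LOG_RE.finditer(ansible_log_text):
        path = match.group(1)
        if path not in paths:
            paths.append(path)
    return paths


# The task line from the attached ansible.log, used here as sample input.
sample = (
    "2020-02-13 17:47:43,423 p=256064 u=mistral |  TASK "
    "[tripleo-ceph-run-ansible : run ceph-ansible (immediate log at "
    "/var/lib/mistral/overcloud/ceph-ansible/ceph_ansible_command.log)] ***"
)
print(find_ceph_ansible_log_paths(sample))
```

On the undercloud, the same function could be fed the contents of the real ansible.log instead of the sample string.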

Comment 3 John Fulton 2020-02-14 16:02:33 UTC
(In reply to John Fulton from comment #2)
> In summary I'm requesting the following from the undercloud:
> 
> - /var/lib/mistral/overcloud/ceph-ansible/ceph_ansible_command.log
> - the output of 'rpm -q ceph-ansible' 

and your tripleo heat templates

Comment 4 Darin Sorrentino 2020-02-14 22:30:51 UTC
Created attachment 1663185 [details]
Templates and deployment script

I removed most of the key from the containers yaml file for the BZ.

Comment 5 Darin Sorrentino 2020-02-14 22:31:23 UTC
Created attachment 1663186 [details]
ceph_ansible_command.log

Comment 6 Darin Sorrentino 2020-02-14 22:32:32 UTC
I've updated the requested files.

(undercloud) [stack@director16 ~]$ rpm -q ceph-ansible
ceph-ansible-4.0.14-1.el8cp.noarch
(undercloud) [stack@director16 ~]$ 

Please let me know if you need anything else. I will keep this deployment in its current state until you no longer need it.

Comment 8 Darin Sorrentino 2020-02-17 13:53:29 UTC
Created attachment 1663520 [details]
Logs requested by Giulio

Comment 9 Darin Sorrentino 2020-02-17 13:59:11 UTC
Created attachment 1663523 [details]
Including the file I missed because I apparently didn't read the whole e-mail first. Sorry!

Comment 12 John Fulton 2020-02-17 15:11:46 UTC
Darin,

The THT had an invalid key as input. Here's the environment file which was attached to the bug:

[fultonj@runcible ceph-ansible]$ cat custom-ceph-external.yaml 
parameter_defaults:
  CephClusterFSID: aed38a87-94c5-4794-910a-3b4b9a1a0f51
  CephClientKey: AQBqbcRdoQJwEhAA1dKnpQK6sT53EV0F12TFZA
  CephExternalMonHost: 172.16.210.50,172.16.210.60,172.16.210.70

  GnocchiRbdPoolName: metrics

[fultonj@runcible ceph-ansible]$ 

As Dimitri pointed out, the key doesn't have 40 characters and doesn't end with '=='.

The templates describe this parameter as "The Ceph client key. Can be created with ceph-authtool --gen-print-key.":

 https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ceph-ansible/ceph-base.yaml#L129-L132
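
The format check described above can be sketched in a few lines of Python. This is only an illustration under the assumption stated in this comment (40 characters of base64 text ending in '=='); the helper name looks_like_cephx_key and the pattern are hypothetical and are not taken from the merged fix.

```python
import base64
import re

# Assumption from this comment: a valid key is 40 characters long and ends
# with "==", i.e. 38 base64 characters followed by two padding characters.
CEPHX_KEY_RE = re.compile(r"[A-Za-z0-9+/]{38}==")


def looks_like_cephx_key(value):
    """Return True if value has the expected shape of a CephX secret."""
    return CEPHX_KEY_RE.fullmatch(value) is not None


# The CephClientKey from the attached environment file (already truncated
# by the reporter): 38 characters and no trailing "==".
bad_key = "AQBqbcRdoQJwEhAA1dKnpQK6sT53EV0F12TFZA"

# A placeholder with the right shape: 28 zero bytes encode to 40 characters.
# It is not a real secret, only a stand-in for ceph-authtool output.
well_formed = base64.b64encode(bytes(28)).decode("ascii")

print(len(bad_key), looks_like_cephx_key(bad_key))
print(len(well_formed), looks_like_cephx_key(well_formed))
```

Running a check like this on the environment file before deploying would flag the truncated key immediately, instead of failing later on the missing /etc/ceph/ceph.client.openstack.keyring.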

Comment 23 errata-xmlrpc 2020-05-14 12:15:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2114

