Bug 1613155 - [ceph-ansible] Do not allow ceph cluster creation when mon_use_fqdn and mds_use_fqdn set to true
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 3.1
Assignee: Sébastien Han
QA Contact: Sidhant Agrawal
Docs Contact: Aron Gunn
URL:
Whiteboard:
Depends On:
Blocks: 1584264
 
Reported: 2018-08-07 07:01 UTC by Sidhant Agrawal
Modified: 2018-09-26 18:24 UTC (History)

Fixed In Version: RHEL: ceph-ansible-3.1.0-0.1.rc18 Ubuntu: ceph-ansible_3.1.0~rc18-2redhat1
Doc Type: Bug Fix
Doc Text:
.Setting the `mon_use_fqdn` or the `mds_use_fqdn` options to `true` fails the Ceph Ansible playbook
Starting with {product} 3.1, Red Hat no longer supports deployments with fully qualified domain names. If either the `mon_use_fqdn` or `mds_use_fqdn` options are set to `true`, then the Ceph Ansible playbook will fail. If the storage cluster is already configured with fully qualified domain names, then you must set the `use_fqdn_yes_i_am_sure` option to `true` in the `group_vars/all.yml` file.
Clone Of:
Environment:
Last Closed: 2018-09-26 18:23:45 UTC
Embargoed:


Attachments
ansible-playbook logs (1.99 MB, text/plain)
2018-08-07 07:01 UTC, Sidhant Agrawal


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible pull 2991 0 None closed various fixes 2020-11-04 01:26:57 UTC
Red Hat Product Errata RHBA-2018:2819 0 None None None 2018-09-26 18:24:47 UTC

Description Sidhant Agrawal 2018-08-07 07:01:38 UTC
Created attachment 1473856
ansible-playbook logs

Description of problem:
The Ansible playbook fails during installation when mon_use_fqdn: true is set. The failure occurs in the following task:
TASK [ceph-mon : create ceph mgr keyring(s) when mon is containerized] *********************************************************************************************************
2018-08-05 05:38:25,749 p=30613 u=ubuntu |  task path: /usr/share/ceph-ansible/roles/ceph-mon/tasks/docker/main.yml:97
2018-08-05 05:38:25,749 p=30613 u=ubuntu |  Sunday 05 August 2018  05:38:25 +0000 (0:00:00.031)       0:10:58.337 ********* 
2018-08-05 05:38:25,821 p=30613 u=ubuntu |   [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ groups.get(mgr_group_name, []) | length > 0 }}

2018-08-05 05:38:25,890 p=30613 u=ubuntu |  Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
2018-08-05 05:43:26,360 p=30613 u=ubuntu |  failed: [magna044] (item=magna006) => {
    "changed": false, 
    "cmd": [
        "docker", 
        "exec", 
        "ceph-mon-magna044", 
        "ceph", 
        "--cluster", 
        "ceph", 
        "auth", 
        "get-or-create", 
        "mgr.magna006", 
        "mon", 
        "allow profile mgr", 
        "osd", 
        "allow *", 
        "mds", 
        "allow *", 
        "-o", 
        "/etc/ceph/ceph.mgr.magna006.keyring"
    ], 
    "delta": "0:05:00.249756", 
    "end": "2018-08-05 05:43:26.338793", 
    "invocation": {
        "module_args": {
            "_raw_params": "docker exec ceph-mon-magna044 ceph --cluster ceph auth get-or-create mgr.magna006 mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /etc/ceph/ceph.mgr.magna006.keyring", 
            "_uses_shell": false, 
            "chdir": null, 
            "creates": "/etc/ceph/ceph.mgr.magna006.keyring", 
            "executable": null, 
            "removes": null, 
            "stdin": null, 
            "warn": true
        }
    }, 
    "item": "magna006", 
    "msg": "non-zero return code", 
    "rc": 1, 
    "start": "2018-08-05 05:38:26.089037", 
    "stderr": "2018-08-05 05:43:26.301639 7fa9e165b700  0 monclient(hunting): authenticate timed out after 300\n2018-08-05 05:43:26.301680 7fa9e165b700  0 librados: client.admin authentication error (110) Connection timed out\n[errno 110] error connecting to the cluster", 
    "stderr_lines": [
        "2018-08-05 05:43:26.301639 7fa9e165b700  0 monclient(hunting): authenticate timed out after 300", 
        "2018-08-05 05:43:26.301680 7fa9e165b700  0 librados: client.admin authentication error (110) Connection timed out", 
        "[errno 110] error connecting to the cluster"
    ], 
    "stdout": "", 
    "stdout_lines": []
}

Version-Release number of selected component (if applicable):

ceph-ansible-3.1.0-0.1.rc12.el7cp.noarch

How reproducible:
Always

Steps to Reproduce:
1. Follow the documentation to deploy a containerised Ceph cluster with mon_use_fqdn: true in all.yml
2. Run the playbook

Actual results:
Deployment of a containerised Ceph cluster with mon_use_fqdn: true in all.yml fails.

Expected results:
Deployment of a containerised Ceph cluster with mon_use_fqdn: true in all.yml should succeed.

Additional info:

Comment 3 Sébastien Han 2018-08-07 14:23:56 UTC
This feature is not supported anymore, we only keep it alive for existing clusters. So it is not encouraged to use it on new deployment.

We need to reflect this on the doc, I don't see any bug here.
Thanks.

Comment 4 Harish NV Rao 2018-08-09 13:29:01 UTC
(In reply to leseb from comment #3)
> This feature is not supported anymore, we only keep it alive for existing
> clusters. So it is not encouraged to use it on new deployment.
> 
> We need to reflect this on the doc, I don't see any bug here.
> Thanks.

Based on above comment i am changing the component to documentation

Comment 5 Harish NV Rao 2018-08-09 13:30:46 UTC
(In reply to Harish NV Rao from comment #4)
> (In reply to leseb from comment #3)
> > This feature is not supported anymore, we only keep it alive for existing
> > clusters. So it is not encouraged to use it on new deployment.
> > 
> > We need to reflect this on the doc, I don't see any bug here.
> > Thanks.
> 
> Based on above comment i am changing the component to documentation

Sebastien, I saw your previous update about attaching a PR to this BZ. Will this be fixed in 3.1 as part of ceph-ansible?

Comment 6 Sébastien Han 2018-08-09 13:37:37 UTC
It'll be fixed in the sense that we don't allow this kind of deployment anymore, so doc is still the right component.

Comment 7 Harish NV Rao 2018-08-09 13:48:37 UTC
In 3.1, we were able to deploy a bare-metal RHEL-based Ceph cluster with 'mon_use_fqdn: true'. Is this option now going to be blocked for both bare-metal and container deployments?

Comment 8 Sébastien Han 2018-08-09 13:56:33 UTC
Yes Harish, this option is going to be blocked as of 3.1 for both container and non-container deployments.

Comment 9 Anjana Suparna Sriram 2018-08-13 11:31:39 UTC
(In reply to leseb from comment #8)
> Yes Harish, this option is going to be blocked as of 3.1 for both container
> and non-container deployments.

Wouldn't this be part of the known issues for the 3.1 release? If yes, kindly change the doc type and provide the relevant doc text for this bug.

Comment 10 Sébastien Han 2018-08-14 14:24:19 UTC
Sure, just did.

Comment 15 Harish NV Rao 2018-08-29 08:03:15 UTC
(In reply to leseb from comment #8)
> Yes Harish, this option is going to be blocked as of 3.1 for both container
> and non-container deployments.

Based on the above, the following needs to be done with this BZ:
1) Change the summary to "[ceph-ansible] Do not allow ceph cluster creation when mon_use_fqdn and mds_use_fqdn set to true"
2) QE to verify the BZ by making sure that the cluster creation fails when `mon_use_fqdn` and `mds_use_fqdn` are set to true.
3) Doc team to move this bug in the release notes from the Known Issues section to the section that lists fixed issues.

Comment 16 Sidhant Agrawal 2018-08-29 12:04:00 UTC
Verified with ceph-ansible-3.1.0-0.1.rc21.el7cp

The Ceph Ansible playbook fails if either the 'mon_use_fqdn' or 'mds_use_fqdn' options are set to 'true' in all.yml.

Moving the BZ to Verified.
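
The behaviour verified above can be summarised as a small pre-flight check. The following is a minimal Python sketch of that logic only, not the actual ceph-ansible implementation: the function name `check_fqdn_options` and the error text are hypothetical, and treating `use_fqdn_yes_i_am_sure: true` as an override that lets the playbook continue is an assumption drawn from the Doc Text of this bug.

```python
def check_fqdn_options(group_vars):
    """Return an error message if the FQDN options should fail the run, else None.

    `group_vars` is a dict of the values loaded from group_vars/all.yml.
    Hypothetical helper: it mirrors the behaviour described in this bug,
    not the real ceph-ansible task.
    """
    # Either option set to true requests an FQDN-based deployment.
    fqdn_requested = bool(group_vars.get("mon_use_fqdn", False)) or bool(
        group_vars.get("mds_use_fqdn", False)
    )
    # Assumption: existing FQDN clusters opt in explicitly with this flag.
    confirmed = bool(group_vars.get("use_fqdn_yes_i_am_sure", False))

    if fqdn_requested and not confirmed:
        return (
            "mon_use_fqdn/mds_use_fqdn are not supported for new deployments; "
            "set use_fqdn_yes_i_am_sure to true only for an existing FQDN cluster"
        )
    return None


if __name__ == "__main__":
    # A new deployment with mon_use_fqdn enabled is rejected.
    print(check_fqdn_options({"mon_use_fqdn": True}))
    # An existing cluster that sets the override is allowed through.
    print(check_fqdn_options({"mon_use_fqdn": True, "use_fqdn_yes_i_am_sure": True}))
```

Under these assumptions, a new deployment that sets either option is stopped before any cluster task runs, instead of timing out later during mgr keyring creation as in the original report.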

Comment 19 errata-xmlrpc 2018-09-26 18:23:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2819

