Created attachment 1473856 [details]
ansible-playbook logs

Description of problem:
The Ansible playbook fails during installation when mon_use_fqdn: true is set, in the following task:

TASK [ceph-mon : create ceph mgr keyring(s) when mon is containerized] *********************************************************************************************************
2018-08-05 05:38:25,749 p=30613 u=ubuntu | task path: /usr/share/ceph-ansible/roles/ceph-mon/tasks/docker/main.yml:97
2018-08-05 05:38:25,749 p=30613 u=ubuntu | Sunday 05 August 2018 05:38:25 +0000 (0:00:00.031) 0:10:58.337 *********
2018-08-05 05:38:25,821 p=30613 u=ubuntu | [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: {{ groups.get(mgr_group_name, []) | length > 0 }}
2018-08-05 05:38:25,890 p=30613 u=ubuntu | Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
2018-08-05 05:43:26,360 p=30613 u=ubuntu | failed: [magna044] (item=magna006) => {
    "changed": false,
    "cmd": [
        "docker",
        "exec",
        "ceph-mon-magna044",
        "ceph",
        "--cluster",
        "ceph",
        "auth",
        "get-or-create",
        "mgr.magna006",
        "mon",
        "allow profile mgr",
        "osd",
        "allow *",
        "mds",
        "allow *",
        "-o",
        "/etc/ceph/ceph.mgr.magna006.keyring"
    ],
    "delta": "0:05:00.249756",
    "end": "2018-08-05 05:43:26.338793",
    "invocation": {
        "module_args": {
            "_raw_params": "docker exec ceph-mon-magna044 ceph --cluster ceph auth get-or-create mgr.magna006 mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /etc/ceph/ceph.mgr.magna006.keyring",
            "_uses_shell": false,
            "chdir": null,
            "creates": "/etc/ceph/ceph.mgr.magna006.keyring",
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "item": "magna006",
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2018-08-05 05:38:26.089037",
    "stderr": "2018-08-05 05:43:26.301639 7fa9e165b700 0 monclient(hunting): authenticate timed out after 300\n2018-08-05 05:43:26.301680 7fa9e165b700 0 librados: client.admin authentication error (110) Connection timed out\n[errno 110] error connecting to the cluster",
    "stderr_lines": [
        "2018-08-05 05:43:26.301639 7fa9e165b700 0 monclient(hunting): authenticate timed out after 300",
        "2018-08-05 05:43:26.301680 7fa9e165b700 0 librados: client.admin authentication error (110) Connection timed out",
        "[errno 110] error connecting to the cluster"
    ],
    "stdout": "",
    "stdout_lines": []
}

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.rc12.el7cp.noarch

How reproducible:
Always

Steps to Reproduce:
1. Follow doc to deploy containerised ceph cluster with mon_use_fqdn: true in all.yml
2. Run playbook

Actual results:
Deployment of containerised ceph cluster with mon_use_fqdn: true in all.yml fails.

Expected results:
Deployment of containerised ceph cluster with mon_use_fqdn: true in all.yml should succeed.

Additional info:
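Note on the [WARNING] line in the log: it appears to be separate from the authentication timeout. Ansible evaluates `when` as a Jinja2 expression already, so the condition can be written without the {{ }} delimiters. A minimal sketch follows; the task itself is hypothetical and only carries the condition, while `mgr_group_name` is the variable named in the warning.

```yaml
# Form that triggers the warning (Jinja2 delimiters inside `when`):
#   when: "{{ groups.get(mgr_group_name, []) | length > 0 }}"
#
# Equivalent bare expression. The debug task is a hypothetical placeholder,
# not the real ceph-mon task from roles/ceph-mon/tasks/docker/main.yml.
- name: hypothetical task shown only to carry the condition
  debug:
    msg: "at least one mgr host is defined"
  when: groups.get(mgr_group_name, []) | length > 0
```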
This feature is not supported anymore, we only keep it alive for existing clusters. So it is not encouraged to use it on new deployment. We need to reflect this on the doc, I don't see any bug here. Thanks.
(In reply to leseb from comment #3)
> This feature is not supported anymore, we only keep it alive for existing
> clusters. So it is not encouraged to use it on new deployment.
>
> We need to reflect this on the doc, I don't see any bug here.
> Thanks.

Based on the above comment, I am changing the component to Documentation.
(In reply to Harish NV Rao from comment #4)
> (In reply to leseb from comment #3)
> > This feature is not supported anymore, we only keep it alive for existing
> > clusters. So it is not encouraged to use it on new deployment.
> >
> > We need to reflect this on the doc, I don't see any bug here.
> > Thanks.
>
> Based on above comment i am changing the component to documentation

Sebastien, I saw your previous update about attaching a PR to this BZ. Will this be fixed in 3.1 as part of ceph-ansible?
It'll be fixed in the sense that we don't allow this kind of deployment anymore. So doc is still the right component.
In 3.1, we were able to deploy a bare-metal, RHEL-based Ceph cluster with 'mon_use_fqdn: true'. Is this option now going to be blocked for both bare-metal and container deployments?
Yes Harish, this option is going to be blocked as of 3.1 for both container and non-container deployments.
(In reply to leseb from comment #8)
> Yes Harish, this option is going to be blocked as of 3.1 for both container
> and non-container deployments.

Wouldn't this be part of the known issues for the 3.1 release? If yes, kindly change the doc type and provide the relevant doc text for this bug.
Sure, just did.
In https://github.com/ceph/ceph-ansible/releases/tag/v3.1.0rc18
(In reply to leseb from comment #8)
> Yes Harish, this option is going to be blocked as of 3.1 for both container
> and non-container deployments.

Based on the above, the following needs to be done with this BZ:

1) Change the summary to "[ceph-ansible] Do not allow ceph cluster creation when mon_use_fqdn and mds_use_fqdn set to true"
2) QE to verify the BZ by making sure that cluster creation fails when 'mon_use_fqdn' and 'mds_use_fqdn' are set to true.
3) Doc team to move this bug in the RN from the Known Issues section to the section that lists fixed issues.
Verified with ceph-ansible-3.1.0-0.1.rc21.el7cp.

The Ceph Ansible playbook fails if either the 'mon_use_fqdn' or the 'mds_use_fqdn' option is set to 'true' in all.yml.

Moving the BZ to Verified.
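For readers who want to see what such a block could look like, here is a minimal sketch of a guard task with the same effect as the behaviour verified above. It is an assumption for illustration, not the actual ceph-ansible change: the task name and message text are hypothetical, and only the option names 'mon_use_fqdn' and 'mds_use_fqdn' come from this BZ.

```yaml
# Hypothetical guard task; NOT the actual ceph-ansible implementation.
# It stops the play early if either option is enabled in all.yml.
- name: fail when mon_use_fqdn or mds_use_fqdn is enabled
  fail:
    msg: "mon_use_fqdn and mds_use_fqdn are not supported for new deployments"
  # default(false) covers the case where an option is not defined at all;
  # the bool filter normalises string values such as "true".
  when: (mon_use_fqdn | default(false) | bool) or (mds_use_fqdn | default(false) | bool)
```

With a task like this placed early in the play, a run with either option set to 'true' would stop at the guard instead of timing out later in the mgr keyring step.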
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2819