Bug 1494643

Summary: [RHCeph 3.0/3.0.0-0.1.rc10.el7cp] Don't run include ceph_keys.yml when monitor's not in quorum
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Vasu Kulkarni <vakulkar>
Component: Ceph-Ansible Assignee: Sébastien Han <shan>
Status: CLOSED ERRATA QA Contact: Vasu Kulkarni <vakulkar>
Severity: low Docs Contact:
Priority: low    
Version: 3.0 CC: adeza, anharris, aschoen, ceph-eng-bugs, ceph-qe-bugs, gmeno, hnallurv, kdreyer, nthomas, sankarshan, seb, vakulkar
Target Milestone: rc   
Target Release: 3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.0.0-0.1.rc12.el7cp Ubuntu: ceph-ansible_3.0.0~rc12-2redhat1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-05 23:45:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vasu Kulkarni 2017-09-22 17:49:36 UTC
Description of problem:

When debugging some Ansible test failures, I noticed that ceph_keys fails because the monitor is not in quorum. In such cases it is better to stop the rest of the playbook from running, since the errors that come out afterwards are just confusing and hard to debug.

Full logs:

http://magna002.ceph.redhat.com/vasu-2017-09-21_19:20:48-smoke-luminous---basic-multi/274653/teuthology.log

2017-09-21T23:18:25.920 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:TASK [ceph-mon : include ceph_keys.yml] ****************************************
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:task path: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/main.yml:13
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:included: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml for pluto004.ceph.redhat.com, pluto005.ceph.redhat.com, pluto007.ceph.redhat.com
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:TASK [ceph-mon : collect admin and bootstrap keys] *****************************
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:task path: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:ok: [pluto004.ceph.redhat.com] => {
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:    "changed": false,
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:    "cmd": [
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "ceph-create-keys",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "--cluster",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "ceph",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "-i",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:        "pluto004"
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    ],
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "delta": "0:00:02.422280",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "end": "2017-09-22 03:20:13.828327",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "failed": false,
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "failed_when_result": false,
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "rc": 0,
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:    "start": "2017-09-22 03:20:11.406047"
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:}
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:STDERR:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:Error ENOENT: failed to find client.admin in keyring
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
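When triaging a run like this one, it can help to pull the quorum message out of the log before reading anything that follows it. The snippet below is only a sketch: the function name `find_quorum_failures` is made up for illustration, and the pattern assumes the message keeps the exact wording shown in the excerpt above.

```python
import re

# Matches the ceph-create-keys message seen in the log excerpt, e.g.
#   INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
# Assumption: the wording of the message is stable across runs.
NOT_IN_QUORUM = re.compile(
    r"ceph-create-keys:ceph-mon is not in quorum: u?'(?P<state>[^']+)'"
)


def find_quorum_failures(log_text):
    """Return the monitor states reported by 'not in quorum' lines, in order."""
    return [match.group("state") for match in NOT_IN_QUORUM.finditer(log_text)]


# Two lines copied from the excerpt above.
sample = "\n".join([
    "2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:"
    "INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'",
    "2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:"
    "INFO:ceph-create-keys:Talking to monitor...",
])

print(find_quorum_failures(sample))
```

A non-empty result means the later key-related errors are probably a consequence of the missing quorum rather than separate problems.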

Comment 2 seb 2017-09-25 11:51:24 UTC
Do you know why the mons were not in quorum?
Is there a bug in ceph-ansible or are you suggesting that we should check if mons are not in quorum and then fail?

Thanks!

Comment 3 Vasu Kulkarni 2017-09-25 17:17:54 UTC
Sébastien,

That is what I am suggesting: fail during the ceph_keys playbook (otherwise it will eventually time out after 10 minutes of waiting for quorum, e.g. create-keys:Talking to monitor...)


I am not sure why the mons were not in quorum, but it would be helpful to make this fatal during the earlier checks, since the failures that show up later on are not that useful for debugging.
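To make the suggestion concrete, here is a rough sketch of a fail-fast quorum check. It is not the change that went into ceph-ansible. It assumes the monitor status is available as JSON with a "state" field (the log above reports 'probing' for a monitor outside quorum) and that "leader" and "peon" are the states that count as being in quorum. `require_quorum` and `fetch_status` are hypothetical names; `fetch_status` stands in for whatever actually queries the monitor.

```python
import json
import time

# Assumption: a monitor counts as "in quorum" when its reported state is
# one of these; the log above shows 'probing' for a monitor that is not.
QUORUM_STATES = ("leader", "peon")


def in_quorum(mon_status_json):
    """Return True if the monitor status JSON reports a quorum state."""
    status = json.loads(mon_status_json)
    return status.get("state") in QUORUM_STATES


def require_quorum(fetch_status, attempts=5, delay=1.0):
    """Poll fetch_status() up to `attempts` times; raise if quorum never forms.

    fetch_status is a hypothetical callable that returns the monitor status
    as JSON text.
    """
    last_state = None
    for attempt in range(attempts):
        raw = fetch_status()
        if in_quorum(raw):
            return
        last_state = json.loads(raw).get("state")
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError(
        "monitor not in quorum after %d attempts (state: %r)"
        % (attempts, last_state)
    )


# Canned responses standing in for a monitor that joins quorum on the third poll.
responses = iter(['{"state": "probing"}', '{"state": "electing"}', '{"state": "peon"}'])
require_quorum(lambda: next(responses), attempts=3, delay=0)
print("quorum reached")
```

Raising as soon as the retry budget is used up gives one clear error at the point of failure, instead of the long wait and the unrelated-looking key errors described above.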

Comment 8 Vasu Kulkarni 2017-11-14 18:22:58 UTC
This is hard to recreate, but I haven't seen it in the past couple of sanity runs, so I will close this as sanity-only verified.

Comment 11 errata-xmlrpc 2017-12-05 23:45:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387