Bug 1494643 - [RHCeph 3.0/3.0.0-0.1.rc10.el7cp] Don't run include ceph_keys.yml when monitor is not in quorum
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: 3.0
Assignee: Sébastien Han
QA Contact: Vasu Kulkarni
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-22 17:49 UTC by Vasu Kulkarni
Modified: 2017-12-05 23:45 UTC

Fixed In Version: RHEL: ceph-ansible-3.0.0-0.1.rc12.el7cp Ubuntu: ceph-ansible_3.0.0~rc12-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-05 23:45:31 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph-ansible pull 1641 | 0 | None | None | None | 2017-09-26 15:35:44 UTC
Red Hat Product Errata | RHBA-2017:3387 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 3.0 bug fix and enhancement update | 2017-12-06 03:03:45 UTC

Description Vasu Kulkarni 2017-09-22 17:49:36 UTC
Description of problem:

When debugging some Ansible test failures, I noticed that ceph_keys fails because the monitor is not in quorum. In such cases it's better to stop the rest of the playbook from running, since the errors that follow are just confusing and hard to debug.

Full logs:

http://magna002.ceph.redhat.com/vasu-2017-09-21_19:20:48-smoke-luminous---basic-multi/274653/teuthology.log

2017-09-21T23:18:25.920 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:TASK [ceph-mon : include ceph_keys.yml] ****************************************
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:task path: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/main.yml:13
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:included: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml for pluto004.ceph.redhat.com, pluto005.ceph.redhat.com, pluto007.ceph.redhat.com
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:TASK [ceph-mon : collect admin and bootstrap keys] *****************************
2017-09-21T23:18:25.921 INFO:teuthology.orchestra.run.pluto007.stdout:task path: /home/ubuntu/ceph-ansible/roles/ceph-mon/tasks/ceph_keys.yml:2
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:ok: [pluto004.ceph.redhat.com] => {
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:    "changed": false,
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:    "cmd": [
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "ceph-create-keys",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "--cluster",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "ceph",
2017-09-21T23:18:25.922 INFO:teuthology.orchestra.run.pluto007.stdout:        "-i",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:        "pluto004"
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    ],
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "delta": "0:00:02.422280",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "end": "2017-09-22 03:20:13.828327",
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "failed": false,
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "failed_when_result": false,
2017-09-21T23:18:25.923 INFO:teuthology.orchestra.run.pluto007.stdout:    "rc": 0,
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:    "start": "2017-09-22 03:20:11.406047"
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:}
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:STDERR:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
2017-09-21T23:18:25.924 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:Error ENOENT: failed to find client.admin in keyring
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...
2017-09-21T23:18:25.925 INFO:teuthology.orchestra.run.pluto007.stdout:INFO:ceph-create-keys:Talking to monitor...

Comment 2 seb 2017-09-25 11:51:24 UTC
Do you know why the mons were not in quorum?
Is there a bug in ceph-ansible or are you suggesting that we should check if mons are not in quorum and then fail?

Thanks!

Comment 3 Vasu Kulkarni 2017-09-25 17:17:54 UTC
Sébastien,

That is what I am suggesting: fail during the ceph_keys playbook (otherwise it will eventually time out after 10 minutes waiting for quorum, e.g. create-keys:Talking to monitor...).


I am not sure why the mons were not in quorum, but it would be helpful to fail fatally during the earlier checks, since the failures later on might not be that useful to debug.
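
For illustration only, the kind of early check suggested above could be sketched in Python as follows. This is not the change made in ceph-ansible; it is a minimal sketch that assumes the monitor's mon_status output is available as JSON text (for example from `ceph daemon mon.<id> mon_status`) and that its "state" field reads "leader" or "peon" when the monitor is in quorum. The names QUORUM_STATES, MonitorNotInQuorum, monitor_state and require_quorum are hypothetical.

```python
import json

# Assumption: a monitor that is part of the quorum reports one of these states.
# Anything else (e.g. "probing", as in the log above) counts as not in quorum.
QUORUM_STATES = ("leader", "peon")


class MonitorNotInQuorum(RuntimeError):
    """Raised when the monitor reports a state outside the quorum."""


def monitor_state(mon_status_json: str) -> str:
    """Return the "state" field from mon_status JSON, or "unknown" if absent."""
    status = json.loads(mon_status_json)
    return status.get("state", "unknown")


def require_quorum(mon_status_json: str) -> str:
    """Return the monitor state, or fail immediately if it is not in quorum."""
    state = monitor_state(mon_status_json)
    if state not in QUORUM_STATES:
        # Fail fast instead of letting key collection retry until it times out.
        raise MonitorNotInQuorum(f"ceph-mon is not in quorum: {state!r}")
    return state
```

With the state from the log above, `require_quorum('{"state": "probing"}')` raises MonitorNotInQuorum right away, so later tasks never run against a cluster that has no quorum.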

Comment 8 Vasu Kulkarni 2017-11-14 18:22:58 UTC
This is hard to recreate, but I haven't seen it in the past couple of sanity runs, so I will close this as sanity-only verified.

Comment 11 errata-xmlrpc 2017-12-05 23:45:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387

