Bug 1376283
Summary: | [containers]: configuration of ceph-clusters with ansible fails at task - 'check the partition status of the journal devices' | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Storage Console | Reporter: | krishnaram Karthick <kramdoss>
Component: | ceph-ansible | Assignee: | Sébastien Han <shan>
Status: | CLOSED ERRATA | QA Contact: | Rachana Patel <racpatel>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 2 | CC: | adeza, aschoen, ceph-eng-bugs, dang, flucifre, gmeno, hchen, hnallurv, ifont, jim.curtis, kdreyer, nthomas, pprakash, rcyriac, sankarshan, seb
Target Milestone: | --- | |
Target Release: | 2 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | ceph-ansible-2.2.1-1.el7scon | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-06-19 13:15:26 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1315538, 1371113 | |
Attachments: |
|
Description
krishnaram Karthick
2016-09-15 03:18:40 UTC
Which ansible version? I believe this only happens on some ansible versions.

I had done a git clone and this is the hash for the current commit.
[root@dhcp42-15 ceph-ansible]# git rev-parse HEAD
5298de3ef5e45296278df10a91ff43ced4dbb033

I'm not able to find that hash in either ansible git or ceph-ansible git. Where did you clone it from?

I cloned from here --> git clone https://github.com/ceph/ceph-ansible as mentioned in the deployment guide.

I don't see any attachment on this BZ, is it only me? The playbook shouldn't go through that step, so I suspect this is a configuration issue in ceph-ansible. This task is expected to be skipped because it is part of the non-containerized deployment. Please share your variable file. Thanks.

Okay, sorry, I meant the ansible version, not the ceph-ansible version.

Created attachment 1202483 [details]
site-docker.yml
Created attachment 1202484 [details]
groups_vars/all file used
I've attached 'groups_vars/all' for your reference. Current ansible version:
# ansible --version
ansible 1.9.4
I see that there is an update available now, '1.9.4-1.el7_2'. I'll give this version a try.

Okay, that's the same version of ansible I've seen this issue on (although on Fedora). It didn't happen on the older or newer ansible versions that I've seen.

I see the issue with ansible version '1.9.4-1.el7_2' too. Is there any newer version I should try? Do you see anything amiss with the groups_vars/all file? I've skipped the configuration of 'rgw' and 'nfs', which I think shouldn't matter.

group_vars/all looks good to me. Can you send a full trace of the play? I really want to see what's happening here. Can you bump to 2.0.0.1 and see if you still have the issue? Thanks!

@ken, if I can get a full play trace, I'll be able to get a better understanding of what's happening. @krishnaram, can I get more logs please? Thanks

Created attachment 1203603 [details]
ansible_playbook_console_redirect
No worries, would you mind testing this branch to see how it goes? https://github.com/ceph/ceph-ansible/pull/994 Thanks!

ansible fails with the same error reported earlier with the above patch cherry-picked.

Created attachment 1204103 [details]
logs with the patch in comment#23

Are you running ceph-master?

Sorry, I meant ceph-ansible master.

Yes. From ceph-ansible master, I had cherry-picked the patch with the fix.
<<<<<<<Snippet of git log>>>>>>>
commit 21d217e89098c102f9eb9de232698b1661003f38
Author: Sébastien Han <seb>
Date: Thu Sep 22 16:41:06 2016 +0200

    fix non skipped task for ansible v1.9.x

    please refer to https://bugzilla.redhat.com/show_bug.cgi?id=1376283

    Signed-off-by: Sébastien Han <seb>

commit d7699672a8b3fa078b8a5a681ef04121bc6a681a
Merge: efe4235 e756214
Author: Leseb <seb>
Date: Thu Sep 22 11:54:41 2016 +0200

    Merge pull request #988 from batrick/linode-dockerfile

    docker: add Dockerfile for Linode cluster dev env

I just pushed new changes to https://github.com/ceph/ceph-ansible/pull/994/, can you try again? Thanks

I still see the same issue with your patch.

Thanks for sharing your setup, it's way easier to debug :). I successfully fixed the problem with this PR: https://github.com/ceph/ceph-ansible/pull/994 Please purge the full setup and test again with both 1.9.4 and 2.x ;) PS: I have a branch with the patch, but please start over with the upstream branch I linked above. Thanks again.

Yay! It worked. I'm now able to set up a ceph cluster with ansible with the fix; tried with both 1.9.4 and 2.1.1.0.

Thanks a lot for testing, I'm going to publish a 1.0.7 version soon (upstream ceph-ansible).

This is a blocker for the Ceph container release. Target => 2.2.

This looks to have made it into the stable-2.1 branch of ceph-ansible upstream.

Verified with the versions below:
ceph-ansible-2.2.7-1.el7scon.noarch
ansible-2.2.3.0-1.el7.noarch
Followed the downstream doc for installation and it didn't fail. Able to bring up the cluster, hence moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1496 |