Bug 1558252 - FFU: ceph upgrade fails with The conditional check 'not is_atomic' failed. The error was: error while evaluating conditional (not is_atomic): 'is_atomic' is undefined
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 3.1
Assignee: Sébastien Han
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks: 1548353
 
Reported: 2018-03-19 22:07 UTC by Marius Cornea
Modified: 2018-05-09 16:33 UTC (History)

Fixed In Version: RHEL: ceph-ansible-3.1.0-0.1.beta6.el7cp
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-09 16:33:35 UTC
Embargoed:


Attachments
ceph-install-workflow.log (178.65 KB, text/plain)
2018-03-19 22:07 UTC, Marius Cornea


Links
System: Github
ID: ceph ceph-ansible pull 2455
Status: closed
Summary: ceph-defaults: set is_atomic variable
Last Updated: 2020-08-08 09:04:38 UTC

Description Marius Cornea 2018-03-19 22:07:05 UTC
Created attachment 1410186
ceph-install-workflow.log

Description of problem:
FFU: ceph upgrade fails with The conditional check 'not is_atomic' failed. The error was: error while evaluating conditional (not is_atomic): 'is_atomic' is undefined:


2018-03-19 16:10:15,100 p=17627 u=mistral |  TASK [ceph-docker-common : remove ceph udev rules] *****************************
2018-03-19 16:10:15,100 p=17627 u=mistral |  task path: /usr/share/ceph-ansible/roles/ceph-docker-common/tasks/pre_requisites/remove_ceph_udev_rules.yml:2
2018-03-19 16:10:15,142 p=17627 u=mistral |  fatal: [192.168.24.10]: FAILED! => {"msg": "The conditional check 'not is_atomic' failed. The error was: error while evaluating conditional (not is_atomic): 'is_atomic' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-docker-common/tasks/pre_requisites/remove_ceph_udev_rules.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: remove ceph udev rules\n  ^ here\n"}


Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.beta3.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers + 2 computes + 3 ceph osd nodes
2. Run FFU to OSP13
3. Run overcloud deploy with the environments/updates/update-from-ceph-newton.yaml environment

Actual results:
Upgrade fails while running the ceph-ansible playbook.

Expected results:
Upgrade succeeds.

Additional info:
Attaching /var/log/mistral/ceph-install-workflow.log

Comment 2 Guillaume Abrioux 2018-03-20 15:36:57 UTC
Hi Marius,

Any chance we could get access to this environment?

Thanks!

Comment 3 Marius Cornea 2018-03-20 15:45:34 UTC
(In reply to Guillaume Abrioux from comment #2)
> Hi Marius,
> 
> Any chance we could get access to this environment?
> 
> Thanks!

I'm currently working on a reproducing environment. I'll get back to you with the details once I have it ready. Thanks!

Comment 5 Andrew Schoen 2018-03-20 19:29:05 UTC
The is_atomic variable that is undefined here is required by the ceph-docker-common role, but it is set in the site-docker.yml.sample playbook rather than in the role itself. When the ceph-docker-common role is then used from rolling_update.yml, the is_atomic variable is never set and the playbook fails. This PR moves the creation of the is_atomic variable into the ceph-docker-common role to avoid this: https://github.com/ceph/ceph-ansible/pull/2455
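
To illustrate the idea, here is a minimal sketch of role-level tasks that define is_atomic before any task depends on it. It is not the content of the PR: the task names, the register variable stat_ostree, and the use of /run/ostree-booted to detect an Atomic host are assumptions made for this example.

```yaml
# Hypothetical tasks file inside a role; names and detection path are
# assumptions for illustration, not copied from the linked PR.
- name: check whether the host is an Atomic host
  stat:
    path: /run/ostree-booted
  register: stat_ostree

- name: set_fact is_atomic
  set_fact:
    is_atomic: "{{ stat_ostree.stat.exists }}"

# Any later task in the role can now evaluate the conditional safely,
# no matter which playbook included the role.
- name: example task guarded by the fact
  debug:
    msg: "not an Atomic host"
  when: not (is_atomic | bool)
```

Because the fact is set inside the role, playbooks such as rolling_update.yml no longer need to define it themselves before calling the role.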

Comment 6 Marius Cornea 2018-03-20 20:16:46 UTC
(In reply to Andrew Schoen from comment #5)
> The is_atomic variable that is undefined here is required by the
> ceph-docker-common role, but is set in the site-docker.yml.sample playbook.
> When the ceph-docker-common role is then used in rolling_update.yml the
> is_atomic variable is not set and the playbook fails. This PR moves the
> creation of the is_atomic variable into the ceph-docker-common role to avoid
> this: https://github.com/ceph/ceph-ansible/pull/2455

I applied the patch and retried the upgrade and now it's failing on a different step:

2018-03-20 16:10:29,490 p=11965 u=mistral |  TASK [ceph-osd : make sure an osd scenario was chosen] *************************
2018-03-20 16:10:29,559 p=11965 u=mistral |  fatal: [192.168.24.11]: FAILED! => {"changed": false, "msg": "please choose an osd scenario"}


I'll leave the environment available for debugging. If there's any other info I can provide please let me know.

Comment 7 Andrew Schoen 2018-03-20 20:33:25 UTC
Marius,

What do you have set for the `osd_scenario` variable? Thanks.

Comment 8 Marius Cornea 2018-03-20 21:36:52 UTC
(In reply to Andrew Schoen from comment #7)
> Marius,
> 
> What do you have set for the `osd_scenario` variable? Thanks.

It looks like the osd_scenario variable didn't get set. I filed a separate bug, 1558722, to keep track of it since it is a different issue.
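
For reference, a minimal sketch of how osd_scenario might be set directly in ceph-ansible variables so that the "make sure an osd scenario was chosen" check passes. The file location and device paths are hypothetical, and this does not show how the deployment tooling in this environment passes the value; that is covered in bug 1558722.

```yaml
# Illustrative group_vars for OSD hosts; device paths are hypothetical.
# ceph-ansible 3.x accepts scenarios such as collocated, non-collocated, or lvm.
osd_scenario: collocated
devices:
  - /dev/sdb
  - /dev/sdc
```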

Comment 9 John Fulton 2018-03-23 02:44:07 UTC
Hey Marius,

What's the next step for this bug? 

It looks to me like we have a PR to address the issue reported in #1 and that some experimentation happened with the osd_scenario as per 1558722, but my reading of that bug seems to indicate that it is now resolved.

If the playbook completed fine with the linked PR (provided the osd_scenario was set), then is the PR sufficient? If so, is the next step for the ceph team (maybe ktdreyer) to identify a specific ceph-ansible build which contains the PR?

Comment 10 Marius Cornea 2018-03-23 02:56:34 UTC
(In reply to John Fulton from comment #9)
> Hey Marius,
> 
> What's the next step for this bug? 
> 
> It looks to me like we have a PR to address the issue reported in #1 and
> that some experimentation happened with the osd_scenario as per 1558722, but
> my reading of that bug seems to indicate that it is now resolved.
> 
> If the playbook completed fine with the linked PR (provided the osd_scenario
> was set), then is the PR sufficient? If so, is the next step for the ceph
> team (maybe ktdreyer) to identify a specific ceph-ansible build which
> contains the PR?

Hey John,

Yes, the PR addressed the reported issue; we just need it shipped in a downstream ceph-ansible build.

Comment 14 Sébastien Han 2018-04-05 13:24:28 UTC
Will be in the 3.0 point release.

Comment 15 Sébastien Han 2018-04-05 13:24:48 UTC
More precisely, in v3.0.29.

Comment 16 Ken Dreyer (Red Hat) 2018-04-05 20:31:50 UTC
OSP 13 needs a new v3.1.0beta5 tag on master, since they're cross-shipping prereleases of ceph-ansible v3.1.0.

Would you please update this BZ when you've tagged v3.1.0beta5?

Comment 17 Sébastien Han 2018-04-10 09:58:56 UTC
Here it is: https://github.com/ceph/ceph-ansible/releases/tag/v3.1.0beta5

Comment 24 Ken Dreyer (Red Hat) 2018-05-09 16:33:35 UTC
Will be resolved in RHCEPH 3.1

