Created attachment 1410186
ceph-install-workflow.log

Description of problem:
FFU: ceph upgrade fails with The conditional check 'not is_atomic' failed. The error was: error while evaluating conditional (not is_atomic): 'is_atomic' is undefined:

2018-03-19 16:10:15,100 p=17627 u=mistral | TASK [ceph-docker-common : remove ceph udev rules] *****************************
2018-03-19 16:10:15,100 p=17627 u=mistral | task path: /usr/share/ceph-ansible/roles/ceph-docker-common/tasks/pre_requisites/remove_ceph_udev_rules.yml:2
2018-03-19 16:10:15,142 p=17627 u=mistral | fatal: [192.168.24.10]: FAILED! => {"msg": "The conditional check 'not is_atomic' failed. The error was: error while evaluating conditional (not is_atomic): 'is_atomic' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-docker-common/tasks/pre_requisites/remove_ceph_udev_rules.yml': line 2, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n---\n- name: remove ceph udev rules\n ^ here\n"}

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.beta3.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers + 2 computes + 3 ceph osd nodes
2. Run FFU to OSP13
3. Run overcloud deploy with the environments/updates/update-from-ceph-newton.yaml environment

Actual results:
Upgrade fails while running the ceph-ansible playbook.

Expected results:
Upgrade succeeds.

Additional info:
Attaching /var/log/mistral/ceph-install-workflow.log
Hi Marius,

Any chance we could get access to this environment?

Thanks!
(In reply to Guillaume Abrioux from comment #2)
> Hi Marius,
>
> Any chance we access this env ?
>
> Thanks!

I'm currently working on a reproducing environment. I'll get back to you with the details once I have it ready. Thanks!
The is_atomic variable that is undefined here is required by the ceph-docker-common role, but it is set in the site-docker.yml.sample playbook rather than in the role itself. When the ceph-docker-common role is then used from rolling_update.yml, nothing sets is_atomic, so the conditional cannot be evaluated and the playbook fails.

This PR moves the creation of the is_atomic variable into the ceph-docker-common role to avoid this: https://github.com/ceph/ceph-ansible/pull/2455
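For illustration, here is a minimal sketch of what defining the fact inside the role could look like. It assumes the detection is done by checking for /run/ostree-booted, the usual marker of an Atomic host; the task names, the registered variable stat_ostree, and the placement within the role's tasks are assumptions made for this example and are not taken from the PR.

```yaml
---
# Hypothetical tasks, assumed to run early in the ceph-docker-common role
# so that later tasks can evaluate "when: not is_atomic".

# Assumption: an Atomic host is detected by the presence of /run/ostree-booted.
- name: check whether the host is an atomic host
  stat:
    path: /run/ostree-booted
  register: stat_ostree  # illustrative variable name

# Define is_atomic from within the role, so it no longer depends on
# the calling playbook having set it beforehand.
- name: set the is_atomic fact
  set_fact:
    is_atomic: "{{ stat_ostree.stat.exists }}"
```

With the fact defined by the role itself, any playbook that includes the role, whether site-docker.yml.sample or rolling_update.yml, should be able to evaluate the conditional on the "remove ceph udev rules" task.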
(In reply to Andrew Schoen from comment #5)
> The is_atomic variable that is undefined here is required by the
> ceph-docker-common role, but is set in the site-docker-yml.sample playbook.
> When the ceph-docker-common playbook is then used in rolling_update.yml the
> is_atomic variable is not set and the playbook fails. This PR moves the
> creation of the is_atomic variable into the ceph-docker-common role to avoid
> this: https://github.com/ceph/ceph-ansible/pull/2455

I applied the patch and retried the upgrade and now it's failing on a different step:

2018-03-20 16:10:29,490 p=11965 u=mistral | TASK [ceph-osd : make sure an osd scenario was chosen] *************************
2018-03-20 16:10:29,559 p=11965 u=mistral | fatal: [192.168.24.11]: FAILED! => {"changed": false, "msg": "please choose an osd scenario"}

I'll leave the environment available for debugging. If there's any other info I can provide please let me know.
Marius,

What do you have set for the `osd_scenario` variable? Thanks.
(In reply to Andrew Schoen from comment #7)
> Marius,
>
> What do you have set for the `osd_scenario` variable? Thanks.

It looks like the osd_scenario var didn't get set. I filed a separate bug 1558722 to keep track of it, as it is a different issue.
Hey Marius,

What's the next step for this bug?

It looks to me like we have a PR to address the issue reported in #1 and that some experimentation happened with the osd_scenario as per 1558722, but my reading of that bug seems to indicate that it is now resolved.

If the playbook completed fine with the linked PR (provided the osd_scenario was set), then is the PR sufficient? If so, is the next step for the ceph team (maybe ktdryer) to identify a specific ceph-ansible build which contains the PR?
(In reply to John Fulton from comment #9)
> Hey Marius,
>
> What's the next step for this bug?
>
> It looks to me like we have a PR to address the issue reported in #1 and
> that some experimentation happened with the osd_scenario as per 1558722, but
> my reading of that bug seems to indicate that it is now resolved.
>
> If the playbook completed fine with the linked PR (provided the osd_scenario
> was set), then is the PR sufficient? If so, is the next step for the ceph
> team (maybe ktdryer) to identify a specific ceph-ansible build which
> contains the PR?

Hey John,

Yes, the PR addressed the reported issue; we just need it shipped in a downstream ceph-ansible build.
Will be in the 3.0 point release.
More precisely, in v3.0.29.
OSP 13 needs a new v3.1.0beta5 tag on master, since they're cross-shipping prereleases of ceph-ansible v3.1.0. Would you please update this BZ when you've tagged v3.1.0beta5?
Here it is: https://github.com/ceph/ceph-ansible/releases/tag/v3.1.0beta5
Will be resolved in RHCEPH 3.1