When running rolling_update.yml twice on the same system, the playbook fails on the "assign rbd application to pool" task: https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-mon/tasks/openstack_config.yml#L17-L22 with the error below [1]. Perhaps the when could be expanded to include checking whether the pool already has an enabled application? Using ceph-ansible-3.1.0-0.1.beta4.el7cp.noarch with OSP13 beta and Luminous.

[1]
2018-03-29 14:39:16,353 p=21727 u=mistral | TASK [ceph-mon : assign rbd application to pool(s)] ****************************
2018-03-29 14:39:16,353 p=21727 u=mistral | task path: /usr/share/ceph-ansible/roles/ceph-mon/tasks/openstack_config.yml:17
2018-03-29 14:39:16,353 p=21727 u=mistral | Thursday 29 March 2018 14:39:16 -0400 (0:00:03.643) 0:02:59.193 ********
2018-03-29 14:39:18,114 p=21727 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'images'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "images", "rbd"], "delta": "0:00:01.461224", "end": "2018-03-29 18:39:18.081302", "item": {"name": "images", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-03-29 18:39:16.620078", "stderr": "enabled application 'rbd' on pool 'images'", "stderr_lines": ["enabled application 'rbd' on pool 'images'"], "stdout": "", "stdout_lines": []}
2018-03-29 14:39:18,813 p=21727 u=mistral | failed: [192.168.24.18] (item={u'rule_name': u'', u'pg_num': 32, u'name': u'metrics'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "metrics", "rbd"], "delta": "0:00:00.466702", "end": "2018-03-29 18:39:18.794722", "item": {"name": "metrics", "pg_num": 32, "rule_name": ""}, "msg": "non-zero return code", "rc": 1, "start": "2018-03-29 18:39:18.328020", "stderr": "Error EPERM: Are you SURE? Pool 'metrics' already has an enabled application; pass --yes-i-really-mean-it to proceed anyway", "stderr_lines": ["Error EPERM: Are you SURE? Pool 'metrics' already has an enabled application; pass --yes-i-really-mean-it to proceed anyway"], "stdout": "", "stdout_lines": []}
2018-03-29 14:39:20,115 p=21727 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'backups'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "backups", "rbd"], "delta": "0:00:01.045684", "end": "2018-03-29 18:39:20.086214", "item": {"name": "backups", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-03-29 18:39:19.040530", "stderr": "enabled application 'rbd' on pool 'backups'", "stderr_lines": ["enabled application 'rbd' on pool 'backups'"], "stdout": "", "stdout_lines": []}
2018-03-29 14:39:21,134 p=21727 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'vms'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "vms", "rbd"], "delta": "0:00:00.653458", "end": "2018-03-29 18:39:21.107759", "item": {"name": "vms", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-03-29 18:39:20.454301", "stderr": "enabled application 'rbd' on pool 'vms'", "stderr_lines": ["enabled application 'rbd' on pool 'vms'"], "stdout": "", "stdout_lines": []}
2018-03-29 14:39:22,134 p=21727 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'volumes'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "volumes", "rbd"], "delta": "0:00:00.744918", "end": "2018-03-29 18:39:22.112935", "item": {"name": "volumes", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-03-29 18:39:21.368017", "stderr": "enabled application 'rbd' on pool 'volumes'", "stderr_lines": ["enabled application 'rbd' on pool 'volumes'"], "stdout": "", "stdout_lines": []}
2018-03-29 14:39:22,140 p=21727 u=mistral | PLAY RECAP *********************************************************************
2018-03-29 14:39:22,140 p=21727 u=mistral | 192.168.24.10 : ok=65 changed=6 unreachable=0 failed=0
2018-03-29 14:39:22,140 p=21727 u=mistral | 192.168.24.12 : ok=4 changed=0 unreachable=0 failed=0
2018-03-29 14:39:22,140 p=21727 u=mistral | 192.168.24.15 : ok=4 changed=0 unreachable=0 failed=0
2018-03-29 14:39:22,140 p=21727 u=mistral | 192.168.24.18 : ok=68 changed=5 unreachable=0 failed=1
2018-03-29 14:39:22,140 p=21727 u=mistral | 192.168.24.22 : ok=4 changed=0 unreachable=0 failed=0
2018-03-29 14:39:22,141 p=21727 u=mistral | 192.168.24.9 : ok=64 changed=7 unreachable=0 failed=0
2018-03-29 14:39:22,141 p=21727 u=mistral | localhost : ok=0 changed=0 unreachable=0 failed=0
2018-03-29 14:39:22,141 p=21727 u=mistral | Thursday 29 March 2018 14:39:22 -0400 (0:00:05.787) 0:03:04.981 ********
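As a manual sketch of that check (assuming the "ceph osd pool application get" subcommand is available in the deployed Luminous build; pool name and output shape below are illustrative, not taken from this cluster):

# Ask the cluster which applications a pool already has; an empty
# object ({}) would mean the enable is safe to run.
docker exec ceph-mon-controller-1 ceph --cluster ceph osd pool application get metrics
{
    "rbd": {}
}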
Strange, I have a different result:

root@ceph-nano-b-faa32aebf00b:/# ceph -v
ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
root@ceph-nano-b-faa32aebf00b:/# rados mkpool leseb 8
setting auid:8
successfully created pool leseb
root@ceph-nano-b-faa32aebf00b:/# ceph osd pool application enable leseb rbd
enabled application 'rbd' on pool 'leseb'
root@ceph-nano-b-faa32aebf00b:/# ceph osd pool application enable leseb rbd
enabled application 'rbd' on pool 'leseb'
root@ceph-nano-b-faa32aebf00b:/# ceph osd pool application enable leseb rbd
enabled application 'rbd' on pool 'leseb'
Oh I got it, the command **is** idempotent; you just tried to assign a different application type, hence the warning. Feel free to re-open if you have more concerns.
(In reply to leseb from comment #5)
> Oh I got it, the command **is** idempotent; you just tried to assign a
> different application type, hence the warning.
>
> Feel free to re-open if you have more concerns.

I'm not sure we assigned a different application to it. The tag was already set to rbd, as the logs from the earlier run show [1]. Maybe something else changed it in between. More testing could monitor the tag during the upgrade to see whether something else is setting it, and if so, attribute the bug to that. The alternative is to just code ceph-ansible in a paranoid way and only try to set the tag if it is not already set.

[1] https://bugzilla.redhat.com/attachment.cgi?id=1414867

2018-03-28 23:43:15,191 p=9685 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'metrics'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "metrics", "rbd"], "delta": "0:00:00.738051", "end": "2018-03-29 03:43:15.171144", "item": {"name": "metrics", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-03-29 03:43:14.433093", "stderr": "enabled application 'rbd' on pool 'metrics'", "stderr_lines": ["enabled application 'rbd' on pool 'metrics'"], "stdout": "", "stdout_lines": []}
I was able to reproduce this issue when trying to re-run the upgrade after a failed attempt:

2018-04-18 21:03:56,248 p=16352 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'images'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "images", "rbd"], "delta": "0:00:00.885337", "end": "2018-04-19 01:03:56.433874", "item": {"name": "images", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-04-19 01:03:55.548537", "stderr": "enabled application 'rbd' on pool 'images'", "stderr_lines": ["enabled application 'rbd' on pool 'images'"], "stdout": "", "stdout_lines": []}
2018-04-18 21:03:56,935 p=16352 u=mistral | failed: [192.168.24.18] (item={u'rule_name': u'', u'pg_num': 32, u'name': u'metrics'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "metrics", "rbd"], "delta": "0:00:00.433200", "end": "2018-04-19 01:03:57.119135", "item": {"name": "metrics", "pg_num": 32, "rule_name": ""}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-19 01:03:56.685935", "stderr": "Error EPERM: Are you SURE? Pool 'metrics' already has an enabled application; pass --yes-i-really-mean-it to proceed anyway", "stderr_lines": ["Error EPERM: Are you SURE? Pool 'metrics' already has an enabled application; pass --yes-i-really-mean-it to proceed anyway"], "stdout": "", "stdout_lines": []}
2018-04-18 21:03:58,245 p=16352 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'backups'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "backups", "rbd"], "delta": "0:00:01.068002", "end": "2018-04-19 01:03:58.430988", "item": {"name": "backups", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-04-19 01:03:57.362986", "stderr": "enabled application 'rbd' on pool 'backups'", "stderr_lines": ["enabled application 'rbd' on pool 'backups'"], "stdout": "", "stdout_lines": []}
2018-04-18 21:03:59,232 p=16352 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'vms'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "vms", "rbd"], "delta": "0:00:00.595775", "end": "2018-04-19 01:03:59.416364", "item": {"name": "vms", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-04-19 01:03:58.820589", "stderr": "enabled application 'rbd' on pool 'vms'", "stderr_lines": ["enabled application 'rbd' on pool 'vms'"], "stdout": "", "stdout_lines": []}
2018-04-18 21:04:00,255 p=16352 u=mistral | ok: [192.168.24.18] => (item={u'rule_name': u'', u'pg_num': 32, u'name': u'volumes'}) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-1", "ceph", "--cluster", "ceph", "osd", "pool", "application", "enable", "volumes", "rbd"], "delta": "0:00:00.793744", "end": "2018-04-19 01:04:00.443578", "item": {"name": "volumes", "pg_num": 32, "rule_name": ""}, "rc": 0, "start": "2018-04-19 01:03:59.649834", "stderr": "enabled application 'rbd' on pool 'volumes'", "stderr_lines": ["enabled application 'rbd' on pool 'volumes'"], "stdout": "", "stdout_lines": []}
The error is pretty self-explanatory: if you get it, it's because the application assigned to the pool somehow changed. If the command has the same output every time, as shown in https://bugzilla.redhat.com/show_bug.cgi?id=1562220#c4, there is no problem. The error occurs only when you try to assign an application type that is different from the current one.

We need to either close this BZ and open a new one with the right description/title, or update this one's title/description. As is, both the description and title are misleading and wrong.

So please let's do one of the two. Thanks.
Created attachment 1425777 [details]
ceph-ansible logs
The error is reproducible when running the ceph upgrade deploy command (with environments/updates/update-from-ceph-newton.yaml) twice. I attached the logs from /var/log/mistral and /tmp/ansible-mistral*.
Created attachment 1426133 [details]
Inventory from run ansible-mistral-actionMI1OZq
Eureka! The following task from openstack_config.yml has a bug: https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-mon/tasks/openstack_config.yml#L17-L22 It assigns rbd to every pool, but the metrics pool is not of application type rbd; it should be of application type openstack_gnocchi:

[root@controller-0 ceph]# ceph osd dump | grep "^pool"
pool 0 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 86 flags hashpspool stripe_width 0 application rbd
pool 1 'metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 86 flags hashpspool stripe_width 0 application openstack_gnocchi
pool 2 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 91 flags hashpspool stripe_width 0 application rbd
pool 3 'backups' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 92 flags hashpspool stripe_width 0 application rbd
pool 4 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 94 flags hashpspool stripe_width 0 application rbd
pool 5 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 last_change 103 flags hashpspool stripe_width 0 application rbd
[root@controller-0 ceph]#
(In reply to leseb from comment #8)
> The error is pretty self-explanatory: if you get it, it's because the
> application assigned to the pool somehow changed. If the command has the
> same output every time, as shown in
> https://bugzilla.redhat.com/show_bug.cgi?id=1562220#c4, there is no problem.
> The error occurs only when you try to assign an application type that is
> different from the current one.

I agree. The pool which failed is metrics, and it was correctly assigned the type openstack_gnocchi, but when the task "assign rbd application to pool(s)" ran it tried to assign it rbd, and we hit this error.

https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-mon/tasks/openstack_config.yml#L17-L22

> We need to either close this BZ and open a new one with the right
> description/title, or update this one's title/description. As is, both the
> description and title are misleading and wrong.
>
> So please let's do one of the two.

Sure. I have updated the title. One way I can think of to address this is to change the "when:" to check not only that the version is luminous but also that the pool doesn't already have an application type. The output of 'ceph osd dump | grep "^pool"' could be used to do this. Does that make sense?
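A minimal sketch of that idea (untested; the task layout is illustrative and not the actual ceph-ansible code, and docker_exec_cmd/cluster are assumed to be the role's existing variables):

# Capture the osd dump output once, then only enable rbd on pools
# whose "pool N 'name' ..." line does not already carry an application.
- name: get pool details
  command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} osd dump"
  register: ceph_osd_dump
  changed_when: false

- name: assign rbd application to pool(s)
  command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} osd pool application enable {{ item.name }} rbd"
  with_items: "{{ openstack_pools | unique }}"
  changed_when: false
  when:
    - ceph_release_num[ceph_release] >= ceph_release_num.luminous
    - ceph_osd_dump.stdout_lines | select('search', "'" ~ item.name ~ "' .*application") | list | length == 0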
More info on this: Ceph itself used to set the pool application. This was introduced in https://github.com/ceph/ceph/commit/6d7ac66ae2e0cfd326cfbecb6c2c9ff953b74600 but removed in https://github.com/ceph/ceph/commit/afda7157aefc1d90368523e27efcbdd53d879a7e So basically Mimic won't have this anymore, which means newer clusters won't have it. To solve this nicely, we need to add a new k/v to the openstack_pools dict so we can set an application per pool. Then this code can run nicely without any issue: https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-mon/tasks/openstack_config.yml#L17-L22 I'll work on a patch for this.
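For reference, a sketch of what that could look like (the "application" key name and the task below are assumptions about the eventual patch, not the merged code):

# Each pool dict grows an application field so nothing is hard-coded:
openstack_glance_pool:
  name: images
  pg_num: 32
  rule_name: ""
  application: rbd

openstack_gnocchi_pool:
  name: metrics
  pg_num: 32
  rule_name: ""
  application: openstack_gnocchi

# The enable task then uses the per-pool value instead of always rbd:
- name: assign application to pool(s)
  command: "{{ docker_exec_cmd }} ceph --cluster {{ cluster }} osd pool application enable {{ item.name }} {{ item.application }}"
  with_items: "{{ openstack_pools | unique }}"
  changed_when: false
  when:
    - ceph_release_num[ceph_release] >= ceph_release_num.luminous

Re-running then stays idempotent, because each pool is always re-assigned the application it already has.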
*** Bug 1574354 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2819