Description of problem:
A scale-up of the number of dedicated Ceph monitor nodes failed. After a successful deployment of a Ceph cluster as part of the overcloud with a dedicated monitor node, the cluster lost its OSD count and 1 of the additional Ceph monitors failed to deploy. The result of the scale-up is:

# ceph -s
  cluster:
    id:     6ed05b60-1655-11e8-99ef-525400a0203f
    health: HEALTH_WARN
            1/3 mons down, quorum monitor-1,monitor-2

  services:
    mon: 3 daemons, quorum monitor-1,monitor-2, out of quorum: monitor-0
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:

# ceph osd tree
ID CLASS WEIGHT TYPE NAME    STATUS REWEIGHT PRI-AFF
-1       0      root default

[heat-admin@ceph-0 ~]$ sudo -i
[root@ceph-0 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    252:0    0   20G  0 disk
├─vda1 252:1    0    1M  0 part
└─vda2 252:2    0   20G  0 part /
vdb    252:16   0   40G  0 disk
├─vdb1 252:17   0 39.5G  0 part
└─vdb2 252:18   0  512M  0 part
vdc    252:32   0   40G  0 disk
├─vdc1 252:33   0 39.5G  0 part
└─vdc2 252:34   0  512M  0 part

[root@ceph-0 ~]# docker ps
CONTAINER ID  IMAGE                                                    COMMAND           CREATED       STATUS       PORTS  NAMES
70157ddf180a  192.168.24.1:8787/rhosp13/openstack-cron:2018-02-14.1    "kolla_start"     17 hours ago  Up 17 hours         logrotate_crond
6b259bb37f59  registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest  "/entrypoint.sh"  20 hours ago  Up 20 hours         ceph-osd-ceph-0-vdc
ec073eb81645  registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest  "/entrypoint.sh"  20 hours ago  Up 20 hours         ceph-osd-ceph-0-vdb

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.25-1.el7cp.noarch

How reproducible:
unknown

Steps to Reproduce:
1. Deploy an overcloud with 3 controller nodes, 1 dedicated monitor node, 1 compute node and 3 Ceph storage nodes (each with 2 OSDs)
2. Update the overcloud by adding 2 additional dedicated Ceph monitor nodes (a hypothetical sketch of that kind of invocation follows below)

Actual results:
The update failed and broke the Ceph cluster: the OSD count dropped to 0 and the first monitor is out of quorum.

Expected results:
The scale-up is successful.

Additional info:
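For the record, a scale-up like step 2 is driven by re-running the overcloud deploy with a higher node count for the dedicated monitor role. This is only a hypothetical sketch: the role name (CephMon), parameter names, and file paths are illustrative, not the exact commands used in this environment.

# Hypothetical scale-up invocation: re-run the deploy with the dedicated
# monitor role count raised from 1 to 3. Names and paths are illustrative.
cat > ~/templates/node-counts.yaml <<'EOF'
parameter_defaults:
  ControllerCount: 3
  ComputeCount: 1
  CephStorageCount: 3
  CephMonCount: 3   # was 1 for the initial deployment
EOF
openstack overcloud deploy --templates \
  -r ~/templates/roles_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e ~/templates/node-counts.yaml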
Seems like an issue with ceph-ansible; can you attach the ceph-install-workflow.log file? Can you also paste/link to the specific deploy and scale-up commands used?
Created attachment 1399508 [details]
ceph-install-workflow.log from the dedicated monitor nodes

Please look at the last installation runs in the log.
Thanks, this looks like an issue with the container restarts.
Yogev, please give us access to the env or tell us why the ceph-mgr is not running. You don't see the OSDs/pools/PGs because the ceph-mgr is not started.

When it comes to the ceph-mon scale issue, unfortunately the logs are not enough, so we need access to the env.

Thanks in advance.
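For reference, a quick way to check the mgr on a monitor node would be something like the following. Treat the unit name as an assumption: ceph-mgr@<short hostname> is the usual containerized ceph-ansible convention and may differ in this env.

# Illustrative mgr checks on a monitor node; unit name is an assumption.
docker ps -a | grep ceph-mgr
systemctl status ceph-mgr@$(hostname -s)
journalctl -u ceph-mgr@$(hostname -s) --no-pager | tail -n 50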
(In reply to leseb from comment #7)
> Yogev, please give us access to the env or tell us why the ceph-mgr is not
> running.
> You don't see the OSDs/Pools/PGs because the ceph-mgr is not started.
>
> When it comes to the ceph-mon scale issue, unfortunately, the logs are not
> enough so we need access to the env.
>
> Thanks in advance.

There's an environment ready for you to test.
Thanks Yogev, I'm presently looking into this.
After some investigation, it appears that OOO (TripleO) purges the fetch_directory at the end of the play, see https://github.com/openstack/tripleo-common/blob/master/workbooks/ceph-ansible.yaml#L157-L159.

This directory is critical as it records what is already present in the cluster. In a containerized scenario, we rely on the content of this directory to deploy each monitor. When you deploy 3 monitors in a single play it works, since the fetch_directory exists until the end of the play. In a scale-up scenario the fetch_directory no longer exists, so during their bootstrap the new monitors believe they are brand-new ones (no keys were copied over).

John Fulton and Yogev are working on removing the purge to validate whether this is the culprit.

FYI: there are ideas to remove the need for the fetch_directory, but they are just ideas at the moment. It has never been explicitly said that this directory was optional.
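To make the failure mode concrete, here is an illustrative check on the undercloud. The per-fsid layout under the fetch directory follows the usual ceph-ansible convention, and the local path is a hypothetical placeholder; adjust both to the actual configuration.

# Illustrative only: ceph-ansible conventionally stores fetched keys under
# <fetch_directory>/<fsid>/. If that tree is gone, a later play bootstraps
# the new mons without the existing cluster keys.
FSID=6ed05b60-1655-11e8-99ef-525400a0203f   # fsid from the report above
FETCH_DIR=~/ceph-ansible/fetch              # hypothetical local path
ls -l ${FETCH_DIR}/${FSID}/etc/ceph/ 2>/dev/null \
  || echo "fetch directory purged: new mons will believe they are new"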
I wanted to share some additional details from when Seb looked into this with us.

As per the reproduction steps:

1. Deploy an overcloud with 3 controller nodes, 1 dedicated monitor node, 1 compute node and 3 Ceph storage nodes (each with 2 OSDs)

During step 1 above there was one mon in the ansible inventory:

mons:
  hosts:
    192.168.24.16: {}

2. Update the overcloud by adding 2 additional dedicated Ceph monitor nodes

During this step the other two mons, .10 and .20, were brought up and added to the ceph-ansible inventory:

mons:
  hosts:
    192.168.24.10: {}
    192.168.24.16: {}
    192.168.24.20: {}

The ceph.conf that was generated during the run contained the following:

mon initial members = monitor-1,monitor-2,monitor-0
mon host = 172.17.3.18,172.17.3.14,172.17.3.12

The 172.17 addresses map one-to-one to the 192.168 addresses and simply refer to the ceph-storage network.

The root cause seems to be the order of the monitors as indicated by mon_initial_members: ceph-ansible should have gone back to the original monitor first.

As per Seb's comment #10, the state of the deployment is preserved in the fetch directory, which TripleO currently creates and destroys per ceph-ansible run:

https://github.com/openstack/tripleo-common/blob/master/workbooks/ceph-ansible.yaml#L157-L159

If this ends up being the culprit, then the above workbook may need some integration with the undercloud Swift to export the fetch directory after the first run of the playbook and import it from Swift before each subsequent run, to preserve that state information (a hypothetical sketch follows below).

Next step: modify the workbook to hard-code that directory.

NeedInfo Yogev: can you please set this test up again and ping me before you run step 1 of the reproduction steps, so that I can modify the workbook to not delete the fetch directory in your env?
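A minimal sketch of the Swift export/import idea above, assuming undercloud credentials are sourced; the fetch directory path, Swift container and object names are illustrative, not what the eventual fix uses.

# After the first successful ceph-ansible run: export the fetch directory.
tar czf ceph-ansible-fetch.tar.gz -C /tmp ceph-ansible-fetch   # hypothetical path
swift upload ceph-ansible-fetch-backup ceph-ansible-fetch.tar.gz

# Before any later run (e.g. scale-up): import it so the state is preserved.
swift download ceph-ansible-fetch-backup ceph-ansible-fetch.tar.gz
tar xzf ceph-ansible-fetch.tar.gz -C /tmp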
Preserving the fetch directory during the initial deployment and then doing the monitor scale-up with the existing fetch directory worked, and I didn't hit the issues reported. The order of monitors in the ceph.conf doesn't seem any different than during the initial investigation of the issue:

[root@monitor-0 ~]# grep mon /etc/ceph/ceph.conf
mon host = 172.17.3.12,172.17.3.18,172.17.3.15
mon initial members = monitor-1,monitor-0,monitor-2
[root@monitor-0 ~]#

However, the monitors are in quorum [1] and the stack update succeeded [2].

Next step: modify the workflow to preserve the fetch directory.

[1]
[root@monitor-0 ~]# ceph -s
  cluster:
    id:     b667b35e-3353-11e8-8fad-525400a45353
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum monitor-1,monitor-2,monitor-0
    mgr: monitor-0(active), standbys: monitor-2, monitor-1
    osd: 5 osds: 5 up, 5 in

  data:
    pools:   6 pools, 192 pgs
    objects: 0 objects, 0 bytes
    usage:   541 MB used, 99243 MB / 99784 MB avail
    pgs:     192 active+clean

[root@monitor-0 ~]#

[2]
2018-03-29 22:12:12Z [AllNodesDeploySteps]: UPDATE_COMPLETE  state changed
2018-03-29 22:12:18Z [overcloud]: UPDATE_COMPLETE  Stack UPDATE completed successfully

 Stack overcloud UPDATE_COMPLETE

Started Mistral Workflow tripleo.deployment.v1.get_horizon_url. Execution ID: 08d1c10e-b8fc-4bd6-95fd-82a7fd89df92
Overcloud Endpoint: http://10.0.0.108:5000/
Overcloud Horizon Dashboard URL: http://10.0.0.108:80/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed
(undercloud) [stack@undercloud-0 ~]$
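For anyone re-checking this, the quorum state shown in [1] can also be confirmed with the standard monitor commands, run from any node with an admin keyring:

# Standard ceph commands for a quick quorum check.
ceph mon stat                                             # one-line summary
ceph quorum_status --format json-pretty | grep -A5 quorum_names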
*** Bug 1553676 has been marked as a duplicate of this bug. ***
*** Bug 1600202 has been marked as a duplicate of this bug. ***
As per bug 1600202 this also affects monitor replacement.
Fix merged in the master branch [1]; a backport to Queens is in review [2].

[1] https://review.openstack.org/#/c/567782
[2] https://review.openstack.org/#/c/583229
https://review.openstack.org/#/c/583229 has merged
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible. If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:

* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.
Verified with:
core_puddle=2018-08-16.1
ceph-ansible-3.1.0-0.1.rc10.el7cp.noarch

Deployed 3 controllers, 2 computes, 3 Ceph and 1 dedicated monitor node. After scaling up to 2 dedicated monitors:

ceph -s
  cluster:
    id:     71492142-a2af-11e8-929f-525400fee3e1
    health: HEALTH_OK

  services:
    mon: 2 daemons, quorum monitor-1,monitor-0
    mgr: monitor-0(active), standbys: monitor-1
    osd: 15 osds: 15 up, 15 in

  data:
    pools:   5 pools, 160 pgs
    objects: 5588 objects, 301 MB
    usage:   2198 MB used, 155 GB / 157 GB avail
    pgs:     160 active+clean

[heat-admin@monitor-0 ~]$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       0.15289 root default
-3       0.05096     host ceph-0
 0   hdd 0.01019         osd.0       up  1.00000 1.00000
 4   hdd 0.01019         osd.4       up  1.00000 1.00000
 7   hdd 0.01019         osd.7       up  1.00000 1.00000
 9   hdd 0.01019         osd.9       up  1.00000 1.00000
11   hdd 0.01019         osd.11      up  1.00000 1.00000
-7       0.05096     host ceph-1
 1   hdd 0.01019         osd.1       up  1.00000 1.00000
 3   hdd 0.01019         osd.3       up  1.00000 1.00000
 6   hdd 0.01019         osd.6       up  1.00000 1.00000
13   hdd 0.01019         osd.13      up  1.00000 1.00000
14   hdd 0.01019         osd.14      up  1.00000 1.00000
-5       0.05096     host ceph-2
 2   hdd 0.01019         osd.2       up  1.00000 1.00000
 5   hdd 0.01019         osd.5       up  1.00000 1.00000
 8   hdd 0.01019         osd.8       up  1.00000 1.00000
10   hdd 0.01019         osd.10      up  1.00000 1.00000
12   hdd 0.01019         osd.12      up  1.00000 1.00000

sudo -i
[root@ceph-0 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    252:0    0   10G  0 disk
├─vda1 252:1    0    1M  0 part
└─vda2 252:2    0   10G  0 part /
vdb    252:16   0   11G  0 disk
├─vdb1 252:17   0 10.5G  0 part
└─vdb2 252:18   0  512M  0 part
vdc    252:32   0   11G  0 disk
├─vdc1 252:33   0 10.5G  0 part
└─vdc2 252:34   0  512M  0 part
vdd    252:48   0   11G  0 disk
├─vdd1 252:49   0 10.5G  0 part
└─vdd2 252:50   0  512M  0 part
vde    252:64   0   11G  0 disk
├─vde1 252:65   0 10.5G  0 part
└─vde2 252:66   0  512M  0 part
vdf    252:80   0   11G  0 disk
├─vdf1 252:81   0 10.5G  0 part
└─vdf2 252:82   0  512M  0 part

[root@ceph-0 ~]# docker ps
CONTAINER ID  IMAGE                                                   COMMAND           CREATED       STATUS       NAMES
a8efd64506e3  192.168.24.1:8787/rhosp13/openstack-cron:2018-08-16.1   "kolla_start"     21 hours ago  Up 21 hours  logrotate_crond
f4f31ac96980  192.168.24.1:8787/rhceph:3-11                           "/entrypoint.sh"  23 hours ago  Up 23 hours  ceph-osd-ceph-0-vdf
8c5ed83c9000  192.168.24.1:8787/rhceph:3-11                           "/entrypoint.sh"  23 hours ago  Up 23 hours  ceph-osd-ceph-0-vde
92b0c6da5f78  192.168.24.1:8787/rhceph:3-11                           "/entrypoint.sh"  23 hours ago  Up 23 hours  ceph-osd-ceph-0-vdd
678ad98caaf3  192.168.24.1:8787/rhceph:3-11                           "/entrypoint.sh"  23 hours ago  Up 23 hours  ceph-osd-ceph-0-vdc
90d6ceed9e78  192.168.24.1:8787/rhceph:3-11                           "/entrypoint.sh"  23 hours ago  Up 23 hours  ceph-osd-ceph-0-vdb
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2574