Bug 1366807
Summary: [RFE] ceph-ansible: remove MON and OSD nodes
Product: [Red Hat Storage] Red Hat Ceph Storage | Reporter: Federico Lucifredi <flucifre>
Component: Ceph-Ansible | Assignee: seb
Status: CLOSED ERRATA | QA Contact: Vasishta <vashastr>
Severity: urgent | Docs Contact: Bara Ancincova <bancinco>
Priority: urgent
Version: 3.0 | CC: adeza, anharris, aschoen, ceph-eng-bugs, flucifre, hnallurv, kdreyer, nlevine, nthomas, racpatel, sankarshan, seb, shan, vashastr
Target Milestone: rc | Keywords: FutureFeature
Target Release: 3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.0.0-0.1.rc6.el7cp Ubuntu: ceph-ansible_3.0.0~rc6-2redhat1 | Doc Type: Enhancement
Doc Text:
.Ansible now supports removing Monitors and OSDs
You can use the `ceph-ansible` utility to remove Monitors and OSDs from a Ceph cluster. For details, see the link:https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#removing-monitors-with-ansible[Removing Monitors with Ansible] and link:https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#removing-osds-with-ansible[Removing OSDs with Ansible] sections in the Red Hat Ceph Storage 3 Administration Guide. The same procedures also apply to removing Monitors and OSDs from a containerized Ceph cluster.
Story Points: ---
Clone Of: | Environment:
Last Closed: 2017-12-05 23:31:14 UTC | Type: Bug
Regression: --- | Mount Type: ---
Documentation: --- | CRM:
Verified Versions: | Category: ---
oVirt Team: --- | RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- | Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1322504, 1383917, 1412948, 1494421
Attachments:
Description
Federico Lucifredi
2016-08-12 22:29:59 UTC
This should be targeted at the first Async, but I only see targets 2 and 3.

Yup, fixed in v1.0.8.

This will ship concurrently with RHCS 2.1.

What automated tests cover this feature as implemented today?

From discussion with Andrew, it sounds like the current implementation requires the admin to run Ansible *on* the Ceph cluster nodes (it runs local commands?). If so, we need to change that.

*** Bug 1335569 has been marked as a duplicate of this bug. ***

*** Bug 1414092 has been marked as a duplicate of this bug. ***

Created attachment 1324368 [details]
File contains the ansible-playbook log and the conf file after removing a monitor
Hi all,
I worked on shrinking a MON from the cluster. The playbook run was successful, but:
1) the Monitor was still in the cluster, even though the "verify the monitor is out of the cluster" task completed without any errors,
and
2) the configuration file still had an entry for the removed monitor.
Going by the steps in the Admin Guide for removing a monitor from the cluster, I expect Ansible to remove the MON from the cluster and also to modify and re-distribute the config file, to increase the usability of the feature.
I'm moving the BZ back to ASSIGNED state; please let me know if my expectation is not appropriate. I've attached a file containing the ansible log and the conf file after removing a MON.
(Terminal log after removing a MON from node magna051)
# sudo ceph -s --cluster 12_3a
-------
health: HEALTH_WARN
-------
1/3 mons down, quorum magna033,magna040
services:
mon: 3 daemons, quorum magna033,magna040, out of quorum: magna051
-------
$ sudo ceph mon stat --cluster 12_3a
e2: 3 mons at {magna033=10.8.128.33:6789/0,magna040=10.8.128.40:6789/0,magna051=10.8.128.51:6789/0}, election epoch 12, leader 0 magna033, quorum 0,1 magna033,magna040
$ sudo ceph mon remove magna051 --cluster 12_3a
removing mon.magna051 at 10.8.128.51:6789/0, there will be 2 monitors
$ sudo ceph mon stat --cluster 12_3a
e3: 2 mons at {magna033=10.8.128.33:6789/0,magna040=10.8.128.40:6789/0}, election epoch 14, leader 0 magna033, quorum 0,1 magna033,magna040
Regards,
Vasishta
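
For anyone reproducing the steps above: the shrink workflow is driven from the Ansible administration node, not from the Ceph nodes themselves. A rough sketch of the invocation, assuming the playbooks live under infrastructure-playbooks/ and take mon_to_kill / osd_to_kill extra variables (the exact install path, playbook names, and variable names may differ between ceph-ansible releases):

$ cd /usr/share/ceph-ansible
$ ansible-playbook infrastructure-playbooks/shrink-mon.yml -e mon_to_kill=magna051
$ ansible-playbook infrastructure-playbooks/shrink-osd.yml -e osd_to_kill=1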
It's weird, can you retry and run ansible in debug mode, with -vvvv please? I need to make sure the command was issued properly. Thanks!

FYI, I haven't been able to reproduce.

Created attachment 1325704 [details]
File contains the ansible-playbook log and conf files from different nodes
Hi Sebastien,
This time it worked partially. The MON was removed from the cluster as expected, but the conf files on the rest of the cluster were not updated.
I've copied those conf files and the ansible log with verbose enabled. Can you please check this?
It is expected that the user will update ceph.conf. It's difficult for us to do the update and re-distribute it, because this means modifying their inventory. Modifying the inventory is not possible; even if we override it, the next Ansible run will override it again. Since you've been able to make it work eventually, I'm moving this back to POST. Also, as described in my earlier comment, I don't think we can do much more than what we currently do. Thanks.

Vasishta, is this still an issue in rc7?

It is acceptable, and yes, let's please add this step to the docs. A prompt indicating that ceph.conf needs to be updated may also be in order (Seb's call).

At the end of the play, we prompt the user with a message saying: "The monitor has been successfully removed from the cluster. Please remove the monitor entry from the rest of your ceph configuration files, cluster wide."

Hi Ken,
Can you please move this BZ to ON_QA?
Regards,
Vasishta

Tried with ceph-ansible-3.0.2-1.el7cp.noarch, and observed that a message is displayed asking the user to remove the monitor entry from the rest of the ceph configuration files, cluster wide. Looks good to me, moving to VERIFIED state.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387
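
For reference, the manual step that the end-of-play message asks for amounts to removing the dropped monitor from ceph.conf on every remaining node and pushing the updated file out. A rough sketch using the hostnames and cluster name from the log above (illustrative only; the exact conf layout, file paths, and inventory names depend on the deployment):

# /etc/ceph/12_3a.conf on every remaining node, before the edit:
#   mon initial members = magna033,magna040,magna051
#   mon host = 10.8.128.33,10.8.128.40,10.8.128.51
# ...and after removing the magna051 entries:
#   mon initial members = magna033,magna040
#   mon host = 10.8.128.33,10.8.128.40

# One way to push the edited file to all nodes from the admin node
# (assumes an inventory file named 'hosts'; adjust the group and paths as needed):
$ ansible all -i hosts -b -m copy -a 'src=/etc/ceph/12_3a.conf dest=/etc/ceph/12_3a.conf'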