Bug 1498218
| Summary: | ceph-ansible RGW role restarts all RGWs simultaneously | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Tupper Cole <tcole> |
| Component: | Ceph-Ansible | Assignee: | Sébastien Han <shan> |
| Status: | CLOSED ERRATA | QA Contact: | Vidushi Mishra <vimishra> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.0 | CC: | adeza, anharris, aschoen, ceph-eng-bugs, gabrioux, gmeno, hnallurv, icolle, kdreyer, nthomas, sankarshan, shan, tcole, vimishra |
| Target Milestone: | rc | ||
| Target Release: | 3.0 | ||
| Hardware: | All | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | RHEL: ceph-ansible-3.0.3-1.el7cp Ubuntu: ceph-ansible_3.0.3-2redhat1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-12-05 23:46:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Tupper Cole
2017-10-03 18:36:48 UTC
There is a difference between upgrades and restarts. The handler will restart all the gateways; we know that and I have a fix pending upstream. This has nothing to do with the rolling update playbook, even if there is a dependency. Thanks.

Will be in 3.0.3, release upstream is here: https://github.com/ceph/ceph-ansible/releases/tag/v3.0.3 Ken, can you build a package? Thanks.

Deploy rgws, change something in the ceph.conf with ceph_conf_overrides, run ansible again, look at the restart sequence. Then look at the process age to make sure they all restarted serially.

Looks like magna079 wasn't restarted, was it? 1h12min ago versus 16min?

Password for the logs?

OK, I didn't notice the line was truncated, looking now.

Hi Vidushi, can you paste the group_vars/* as used for the deployment mentioned in c13, please?

Please try again. If you look at this log, you will see that the restart works: https://2.jenkins.ceph.com/view/ceph-ansible-luminous-nightly/job/ceph-ansible-nightly-luminous-ansible2.3-centos7_cluster/28/consoleFull Search for "restart ceph rgw daemon(s) - container", you will see that ceph-rgw0 gets restarted.

Hi Seb,
I re-ran the ansible-playbook. I observed that the rgw roles have restarted in the playbook logs as shown below:
RUNNING HANDLER [ceph-defaults : copy rgw restart script] ************************************************************************************************************************************************************************************
ok: [magna090]
RUNNING HANDLER [ceph-defaults : restart ceph rgw daemon(s) - non container] *****************************************************************************************************************************************************************
changed: [magna090 -> magna100] => (item=magna100)
changed: [magna090 -> magna090] => (item=magna090)
According to systemctl status, it looks like the 2 RGW daemons restarted 10-12 seconds apart. Output shown below:
--------------------------- console o/p -----------------------
[root@magna100 ubuntu]# systemctl status ceph-radosgw.service
● ceph-radosgw.service - Ceph rados gateway
Loaded: loaded (/usr/lib/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2017-10-27 12:06:19 UTC; 3min 4s ago
Main PID: 29180 (radosgw)
CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
└─29180 /usr/bin/radosgw -f --cluster ceph --name client.rgw.magna100 --setuser ceph --setgroup ceph
Oct 27 12:06:19 magna100 systemd[1]: Started Ceph rados gateway.
Oct 27 12:06:19 magna100 systemd[1]: Starting Ceph rados gateway...
Oct 27 12:06:19 magna100 radosgw[29180]: warning: line 29: 'host' in section 'client.rgw.magna100' redefined
Oct 27 12:06:19 magna100 radosgw[29180]: warning: line 30: 'keyring' in section 'client.rgw.magna100' redefined
Oct 27 12:06:19 magna100 radosgw[29180]: warning: line 31: 'log_file' in section 'client.rgw.magna100' redefined
[root@magna100 ubuntu]#
[root@magna090 ceph-ansible]# systemctl status ceph-radosgw.service
● ceph-radosgw.service - Ceph rados gateway
Loaded: loaded (/usr/lib/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2017-10-27 12:06:30 UTC; 2min 57s ago
Main PID: 17471 (radosgw)
CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
└─17471 /usr/bin/radosgw -f --cluster ceph --name client.rgw.magna090 --setuser ceph --setgroup ceph
Oct 27 12:06:30 magna090 systemd[1]: Started Ceph rados gateway.
Oct 27 12:06:30 magna090 systemd[1]: Starting Ceph rados gateway...
[root@magna090 ceph-ansible]#
-------------------------------------------------------------------
Is this sufficient to verify this BZ? Do let me know.
Thanks,
Vidushi
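The "look at the process age" check can also be done mechanically. The following is a minimal Python sketch, not part of ceph-ansible: it parses the `Active: active (running) since ...` line from `systemctl status` output for each host and reports the gap between consecutive start times. The function names, the `SAMPLE` dictionary, and the `min_gap_seconds` threshold are hypothetical; the bug does not define a required minimum gap, so the one-second default is only an assumption. The two sample lines are copied from the console output above.

```python
import re
from datetime import datetime

# Matches e.g. "Active: active (running) since Fri 2017-10-27 12:06:19 UTC; 3min 4s ago"
ACTIVE_RE = re.compile(
    r"Active: active \(running\) since \w{3} "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC"
)


def start_time(status_output):
    """Extract the service start time from `systemctl status` output."""
    match = ACTIVE_RE.search(status_output)
    if match is None:
        raise ValueError("no 'Active: active (running) since ...' line found")
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")


def restart_gaps(status_by_host):
    """Return the gaps, in seconds, between consecutive start times."""
    times = sorted(start_time(out) for out in status_by_host.values())
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def restarted_serially(status_by_host, min_gap_seconds=1.0):
    """True if every pair of consecutive restarts is at least min_gap_seconds apart.

    min_gap_seconds is an assumed threshold, not a value taken from the bug.
    """
    return all(gap >= min_gap_seconds for gap in restart_gaps(status_by_host))


# Sample lines copied from the systemctl output in this report.
SAMPLE = {
    "magna100": "Active: active (running) since Fri 2017-10-27 12:06:19 UTC; 3min 4s ago",
    "magna090": "Active: active (running) since Fri 2017-10-27 12:06:30 UTC; 2min 57s ago",
}

if __name__ == "__main__":
    print(restart_gaps(SAMPLE))
    print(restarted_serially(SAMPLE))
```

A non-zero gap only shows that the daemons did not start in the same second; it does not prove that one gateway was healthy before the next was restarted, so the playbook log order remains the primary evidence.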
Also, please let us know: what is the expected time difference between the restarts of the multiple RGW daemons?

That's the expected behavior and the results are good. Thanks, please move this to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387 |