Bug 1525209
| Summary: | During upgrade, ceph-ansible does not disable the radosgw system service | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Keith Schincke <kschinck> |
| Component: | Ceph-Ansible | Assignee: | Guillaume Abrioux <gabrioux> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | high | Docs Contact: | Bara Ancincova <bancinco> |
| Priority: | unspecified | | |
| Version: | 3.0 | CC: | adeza, agunn, aschoen, ceph-eng-bugs, ceph-qe-bugs, gabrioux, gfidente, gkadam, gmeno, hnallurv, kdreyer, kschinck, nthomas, sankarshan, yrabl |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 2.5 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | RHEL: ceph-ansible-3.0.18-1.el7cp; Ubuntu: ceph-ansible_3.0.18-2redhat1 | Doc Type: | Bug Fix |
| Doc Text: | (see below) | | |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 1528430, 1539738 (view as bug list) | Environment: | |
| Last Closed: | 2018-02-21 19:46:24 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1528430 | | |
| Bug Blocks: | 1536401, 1539738 | | |

Doc Text:

.`ceph-ansible` now disables the Ceph Object Gateway service as expected when upgrading the OpenStack container

When upgrading the OpenStack container from version 11 to 12, the `ceph-ansible` utility did not properly disable the Ceph Object Gateway service provided by the overcloud image. Consequently, the containerized Ceph Object Gateway service entered a failed state because the port it used was bound. The `ceph-ansible` utility has been updated to properly disable the system Ceph Object Gateway service. As a result, the containerized Ceph Object Gateway service starts as expected after upgrading the OpenStack container from version 11 to 12.
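In concrete terms, the fix described in the doc text amounts to doing the equivalent of the following on each gateway node before the containerized service starts. This is a rough sketch, not the actual ceph-ansible change; `<rgw_instance>` is a placeholder for the real unit instance name, which this report says follows the `ceph-radosgw@radosgw.$name` pattern rather than the hostname:

```shell
# Hypothetical equivalent of the fix on an RGW node: stop and disable the
# packaged (non-containerized) gateway so the containerized radosgw can bind
# the civetweb port (8080 in the outputs below). <rgw_instance> is a
# placeholder; list the real instance name instead of guessing it.
systemctl list-units 'ceph-radosgw@*' --all
systemctl stop 'ceph-radosgw@<rgw_instance>'
systemctl disable 'ceph-radosgw@<rgw_instance>'

# Afterwards only haproxy and the containerized gateway should hold the port:
netstat -tuplan4 | grep 8080
```

Stopping and disabling the real instance name is exactly what the original upgrade task missed, as the description at the bottom of this report explains.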
Just to be sure, is it when running rolling_update.yml?

(In reply to leseb from comment #1)
> Just to be sure, is it when running rolling_update.yml?

No, this is with switch-from-non-containerized-to-containerized-ceph-daemons.yml

The verification failed. The rgw system service is running and causes the rgw container to fail because both use the same port. Post-upgrade output:

[root@controller-1 ~]# systemctl -a | grep ceph
ceph-radosgw.service loaded active running Ceph rados gateway
ceph-radosgw.service loaded activating auto-restart Ceph RGW
# netstat -tuplan4 | grep 8080
tcp 0 0 172.17.3.19:8080 0.0.0.0:* LISTEN 158261/haproxy
tcp 0 0 10.0.0.110:8080 0.0.0.0:* LISTEN 158261/haproxy
tcp 0 0 172.17.3.20:8080 0.0.0.0:* LISTEN 93050/radosgw

The verification failed.

Tested with v3.0.21 and it passed. @Yogev, I think we can move this BZ to 'VERIFIED'?

Hi Bara, it looks good to me. I have come across a problem in the upgrade that is not related to this bug; we are checking for a workaround and will verify this.

This is a work in progress, will test it tomorrow at the latest.

With the latest version of ceph-ansible and the latest version of the ceph docker image, the verification failed:
[root@controller-0 ~]# ceph -s
cluster c9e9f454-0ce9-11e8-a5d6-5254007feace
health HEALTH_WARN
too many PGs per OSD (480 > max 300)
monmap e1: 3 mons at {controller-0=172.17.3.15:6789/0,controller-1=172.17.3.21:6789/0,controller-2=172.17.3.13:6789/0}
election epoch 38, quorum 0,1,2 controller-2,controller-0,controller-1
osdmap e48: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds,recovery_deletes
pgmap v3255: 480 pgs, 14 pools, 1588 bytes data, 171 objects
138 MB used, 104 GB / 104 GB avail
480 active+clean
[root@controller-0 ~]# docker ps | grep ceph
61b91cc92326 brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134 "/entrypoint.sh" 2 minutes ago Up 2 minutes ceph-mon-controller-0
dc6bbfae758b brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134 "/entrypoint.sh" 3 minutes ago Up 3 minutes ceph-rgw-controller-0
[root@controller-0 ~]# systemctl -a | grep ceph
ceph-mon loaded active running Ceph Monitor
ceph-radosgw.service loaded active running Ceph RGW
system-ceph\x2dcreate\x2dkeys.slice loaded active active system-ceph\x2dcreate\x2dkeys.slice
system-ceph\x2dmon.slice loaded active active system-ceph\x2dmon.slice
system-ceph\x2dradosgw.slice loaded active active system-ceph\x2dradosgw.slice
ceph-mds.target loaded active active ceph target allowing to start/stop all ceph-mds@.service instances at once
ceph-mon.target loaded active active ceph target allowing to start/stop all ceph-mon@.service instances at once
ceph-osd.target loaded active active ceph target allowing to start/stop all ceph-osd@.service instances at once
ceph-radosgw.target loaded active active ceph target allowing to start/stop all ceph-radosgw@.service instances at once
ceph.target loaded active active ceph target allowing to start/stop all ceph*@.service
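The listing above can look like a regression at first glance because ceph-radosgw.service is still active; whether the verification passes hinges on what that unit actually runs. Two generic checks (not part of the formal test plan, just commands that exist on any RHEL 7 node) make the distinction visible:

```shell
# After the switch to containers, the main process of ceph-radosgw.service
# should be docker(-current) wrapping the rgw container, not /usr/bin/radosgw.
systemctl status ceph-radosgw.service

# Check which process is bound to the civetweb port (8080 in this setup);
# a bare "radosgw" PID here means the host daemon still holds the port.
netstat -tuplan4 | grep 8080
```

The status output in the following comment shows the docker-current wrapper as the main process, which is consistent with the verification ultimately passing.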
The verification was successful after all
[root@controller-2 ~]# systemctl status ceph-radosgw.service
● ceph-radosgw.service - Ceph RGW
Loaded: loaded (/etc/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-02-09 18:28:47 UTC; 2 days ago
Main PID: 729787 (docker-current)
CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
└─729787 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -e RGW_CIVETWEB_IP=172.17.3.13 -v /etc/...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340
Description of problem:
When upgrading from OSP11 to the OSP12 container, ceph-ansible attempts to disable the RGW service provided by the overcloud image. The task attempts to stop/disable ceph-rgw@{{ ansible_hostname }} and ceph-radosgw@{{ ansible_hostname }}.service, but the actual service name is ceph-radosgw@radosgw.$name.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch

How reproducible:

Steps to Reproduce:
1. Install OSP11
2. Upgrade to OSP12
3.

Actual results:
The radosgw system service remains enabled. This causes the radosgw container service to go into a failed state because the port it needs is already bound.

Expected results:
The radosgw system service is disabled and the radosgw container service is running.

Additional info:
Working to replicate the reported issue.
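To make the unit-name mismatch above concrete, the commands below show why the old stop/disable tasks had no effect. They are illustrative only; `<hostname>` and `<rgw_instance>` are placeholders, and the exact instance naming may vary per deployment:

```shell
# The units the task tried to manage are derived from the hostname and do not
# exist on the overcloud nodes, so stopping/disabling them changes nothing:
systemctl status 'ceph-rgw@<hostname>' 'ceph-radosgw@<hostname>'

# The instance that actually exists is named after the gateway itself
# (ceph-radosgw@radosgw.$name according to this report); list it to see
# the real name before stopping/disabling it:
systemctl list-units 'ceph-radosgw@*' --all
```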