Description of problem:
When upgrading from OSP11 to containerized OSP12, ceph-ansible attempts to disable the RGW service provided by the overcloud image. The task attempts to stop/disable ceph-rgw@{{ ansible_hostname }} and ceph-radosgw@{{ ansible_hostname }}.service, but the actual service name is ceph-radosgw@radosgw.$name.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch

How reproducible:

Steps to Reproduce:
1. Install OSP11
2. Upgrade to OSP12
3.

Actual results:
The radosgw system service remains enabled. This causes the radosgw container service to go into a failed state because the port is already bound.

Expected results:
The radosgw system service is disabled and the radosgw container service is running.

Additional info:
Working to replicate the reported issue.
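For illustration only (this is not the actual ceph-ansible task), a rough sketch of the service-name mismatch on a controller node; the instance suffix "radosgw.gateway" is an assumed example and the real suffix depends on how the node was deployed:

# What the playbook targets -- no such unit exists on the node:
systemctl stop "ceph-radosgw@$(hostname -s).service"
systemctl disable "ceph-radosgw@$(hostname -s).service"

# What is actually running (instance name assumed for illustration),
# so it keeps holding the RGW port after the upgrade:
systemctl status "ceph-radosgw@radosgw.gateway.service"

# What would need to happen to free the port for the container:
systemctl stop "ceph-radosgw@radosgw.gateway.service"
systemctl disable "ceph-radosgw@radosgw.gateway.service"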
Just to be sure, is it when running rolling_update.yml?
(In reply to leseb from comment #1)
> Just to be sure, is it when running rolling_update.yml?

No, this is with switch-from-non-containerized-to-containerized-ceph-daemons.yml.
The verification failed. The RGW system service is still running and causes the RGW container to fail, since both use the same port.

Post-upgrade output:

[root@controller-1 ~]# systemctl -a | grep ceph
  ceph-radosgw.service    loaded active     running       Ceph rados gateway
  ceph-radosgw.service    loaded activating auto-restart  Ceph RGW

# netstat -tuplan4 | grep 8080
tcp   0   0 172.17.3.19:8080   0.0.0.0:*   LISTEN   158261/haproxy
tcp   0   0 10.0.0.110:8080    0.0.0.0:*   LISTEN   158261/haproxy
tcp   0   0 172.17.3.20:8080   0.0.0.0:*   LISTEN   93050/radosgw
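For reference, a quick way to map the radosgw process holding port 8080 back to the systemd unit that started it. This is just a debugging sketch, not part of the playbook; the PID and unit name are taken from the output above:

# Show which systemd unit the listening radosgw process belongs to
# (93050 is the PID reported by netstat above):
systemctl status 93050

# Confirm the non-containerized service is still enabled when it should
# have been disabled by the switch playbook:
systemctl is-enabled ceph-radosgw.service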
The fix is in https://github.com/ceph/ceph-ansible/releases/tag/v3.0.18
The verification failed.
Tested with v3.0.21 and it passed. @Yogev, I think we can move this BZ to 'VERIFIED'?
Hi Bara, it looks good to me.
I have come across a problem in the upgrade that is not related to this bug. We are checking for a workaround and will then verify this one.
This is a work in progress; I will test it tomorrow at the latest.
With the latest version of ceph-ansible and the latest ceph docker image, the verification failed:

[root@controller-0 ~]# ceph -s
    cluster c9e9f454-0ce9-11e8-a5d6-5254007feace
     health HEALTH_WARN
            too many PGs per OSD (480 > max 300)
     monmap e1: 3 mons at {controller-0=172.17.3.15:6789/0,controller-1=172.17.3.21:6789/0,controller-2=172.17.3.13:6789/0}
            election epoch 38, quorum 0,1,2 controller-2,controller-0,controller-1
     osdmap e48: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,recovery_deletes
      pgmap v3255: 480 pgs, 14 pools, 1588 bytes data, 171 objects
            138 MB used, 104 GB / 104 GB avail
                 480 active+clean

[root@controller-0 ~]# docker ps | grep ceph
61b91cc92326   brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"   2 minutes ago   Up 2 minutes   ceph-mon-controller-0
dc6bbfae758b   brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"   3 minutes ago   Up 3 minutes   ceph-rgw-controller-0

[root@controller-0 ~]# systemctl -a | grep ceph
  ceph-mon                              loaded active running Ceph Monitor
  ceph-radosgw.service                  loaded active running Ceph RGW
  system-ceph\x2dcreate\x2dkeys.slice   loaded active active  system-ceph\x2dcreate\x2dkeys.slice
  system-ceph\x2dmon.slice              loaded active active  system-ceph\x2dmon.slice
  system-ceph\x2dradosgw.slice          loaded active active  system-ceph\x2dradosgw.slice
  ceph-mds.target                       loaded active active  ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mon.target                       loaded active active  ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target                       loaded active active  ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph-radosgw.target                   loaded active active  ceph target allowing to start/stop all ceph-radosgw@.service instances at once
  ceph.target                           loaded active active  ceph target allowing to start/stop all ceph*@.service
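Note that on converted nodes a unit named ceph-radosgw.service may itself be the wrapper that runs the container (as the status output in the next comment shows), so its presence alone does not prove the switch failed. A quick check I would run to tell the two apart, nothing mandated by the playbook:

# If ExecStart runs /usr/bin/radosgw directly it is the old package-based
# daemon; if it runs docker / docker-current it is the container wrapper:
systemctl cat ceph-radosgw.service | grep ExecStart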
The verification was successful after all.

[root@controller-2 ~]# systemctl status ceph-radosgw.service
● ceph-radosgw.service - Ceph RGW
   Loaded: loaded (/etc/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-02-09 18:28:47 UTC; 2 days ago
 Main PID: 729787 (docker-current)
   CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
           └─729787 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -e RGW_CIVETWEB_IP=172.17.3.13 -v /etc/...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340