Bug 1525209 - During upgrade, ceph-ansible does not disable the radosgw system service
Status: CLOSED ERRATA
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 2.5
Assigned To: Guillaume Abrioux
QA Contact: Yogev Rabl
Docs Contact: Bara Ancincova
Keywords: Triaged
Depends On: 1528430
Blocks: 1536401 1539738
 
Reported: 2017-12-12 14:28 EST by Keith Schincke
Modified: 2018-02-21 14:46 EST
CC List: 15 users

See Also:
Fixed In Version: RHEL: ceph-ansible-3.0.18-1.el7cp Ubuntu: ceph-ansible_3.0.18-2redhat1
Doc Type: Bug Fix
Doc Text:
.`ceph-ansible` now disables the Ceph Object Gateway service as expected when upgrading the OpenStack container
When upgrading the OpenStack container from version 11 to 12, the `ceph-ansible` utility did not properly disable the Ceph Object Gateway service provided by the overcloud image. Consequently, the containerized Ceph Object Gateway service entered a failed state because the port it used was bound. The `ceph-ansible` utility has been updated to properly disable the system Ceph Object Gateway service. As a result, the containerized Ceph Object Gateway service starts as expected after upgrading the OpenStack container from version 11 to 12.
Story Points: ---
Clone Of:
Clones: 1528430 1539738
Environment:
Last Closed: 2018-02-21 14:46:24 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID Priority Status Summary Last Updated
Github ceph/ceph-ansible/pull/2291 None None None 2017-12-21 04:31 EST
Github ceph/ceph-ansible/pull/2329 None None None 2018-01-18 04:07 EST
Red Hat Product Errata RHBA-2018:0340 normal SHIPPED_LIVE Red Hat Ceph Storage 2.5 bug fix and enhancement update 2018-02-21 19:50:32 EST

Description Keith Schincke 2017-12-12 14:28:55 EST
Description of problem:
When upgrading from OSP11 to the OSP12 container, ceph-ansible attempts to disable the RGW service provided by the overcloud image, but the task tries to stop/disable ceph-rgw@{{ ansible_hostname }} and ceph-radosgw@{{ ansible_hostname }}.service. The actual service name is ceph-radosgw@radosgw.$name.
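For illustration, a minimal sketch of the mismatch, assuming the unit names seen later in this bug (the exact instance suffix depends on the rgw client name in ceph.conf, e.g. client.radosgw.gateway):

# Units the upgrade task tries to stop/disable -- these do not exist on the node:
systemctl stop 'ceph-rgw@controller-1'
systemctl stop 'ceph-radosgw@controller-1.service'

# Unit actually installed by the non-containerized deployment; this is the one
# that needs to be stopped and disabled so the containerized RGW can bind its port:
systemctl stop ceph-radosgw@radosgw.gateway.service
systemctl disable ceph-radosgw@radosgw.gateway.service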


Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch

How reproducible:


Steps to Reproduce:
1. Install OSP11
2. Upgrade to OSP12
3.

Actual results:
The radosgw system service remains enabled. This causes the radosgw container service to go into a failed state because the port it needs is already bound.


Expected results:
The radosgw system service is disabled.
The radosgw container service is running.


Additional info:
Working to replicate the reported issue.
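A quick way to check a node for this failure mode (a sketch, using the unit names and port from the outputs later in this bug):

# Legacy (non-containerized) RGW unit should be inactive and disabled after the upgrade:
systemctl is-active  ceph-radosgw@radosgw.gateway.service    # expect: inactive
systemctl is-enabled ceph-radosgw@radosgw.gateway.service    # expect: disabled

# Containerized RGW unit should be running, and only one radosgw should be
# listening on the RGW port (8080 in this deployment):
systemctl is-active ceph-radosgw@rgw.controller-1.service    # expect: active
netstat -tuplan4 | grep 8080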
Comment 1 leseb 2017-12-15 05:52:59 EST
Just to be sure, is it when running rolling_update.yml?
Comment 2 Giulio Fidente 2017-12-15 10:28:38 EST
(In reply to leseb from comment #1)
> Just to be sure, is it when running rolling_update.yml?

No, this is with switch-from-non-containerized-to-containerized-ceph-daemons.yml
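For reference, a rough example of how that playbook is usually invoked from a ceph-ansible checkout (paths and inventory are assumptions for illustration; in OSP the director-driven upgrade runs it for you rather than it being a manual step):

# ceph-ansible expects infrastructure playbooks to be copied to its top-level directory:
cp infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml .
ansible-playbook -i /path/to/inventory switch-from-non-containerized-to-containerized-ceph-daemons.yml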
Comment 12 Yogev Rabl 2018-01-16 14:47:11 EST
The verification failed.

The rgw system service is running and causes the rgw container to fail because both use the same port.

post upgrade output:
[root@controller-1 ~]# systemctl -a | grep ceph
ceph-radosgw@radosgw.gateway.service                                                                  loaded    active     running      Ceph rados gateway
  ceph-radosgw@rgw.controller-1.service                                                                 loaded    activating auto-restart Ceph RGW

# netstat -tuplan4 | grep 8080 
tcp        0      0 172.17.3.19:8080        0.0.0.0:*               LISTEN      158261/haproxy
tcp        0      0 10.0.0.110:8080         0.0.0.0:*               LISTEN      158261/haproxy
tcp        0      0 172.17.3.20:8080        0.0.0.0:*               LISTEN      93050/radosgw
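A sketch of the manual cleanup implied here (this is what the ceph-ansible fix automates): stop and disable the leftover system unit so the containerized unit can bind port 8080, then let the container unit come back.

# Legacy RGW unit still holding the port (names as shown in the output above):
systemctl stop ceph-radosgw@radosgw.gateway.service
systemctl disable ceph-radosgw@radosgw.gateway.service

# The containerized unit is in auto-restart and should recover; restart it to confirm:
systemctl restart ceph-radosgw@rgw.controller-1.service
systemctl status ceph-radosgw@rgw.controller-1.service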
Comment 15 Yogev Rabl 2018-01-22 15:43:08 EST
The verification failed.
Comment 17 Guillaume Abrioux 2018-01-29 12:01:39 EST
Tested with v3.0.21 and it passed.

@Yogev, I think we can move this BZ to 'VERIFIED'?
Comment 21 Guillaume Abrioux 2018-02-02 16:29:13 EST
Hi Bara,

it looks good to me.
Comment 22 Yogev Rabl 2018-02-07 11:43:19 EST
I have come across a problem in the upgrade that is not related to this bug. We are checking for a workaround and will then verify this bug.
Comment 23 Yogev Rabl 2018-02-08 21:10:17 EST
This is a work in progress; I will test it tomorrow at the latest.
Comment 24 Yogev Rabl 2018-02-09 13:41:14 EST
With the latest version of ceph-ansible and the latest ceph Docker image, the verification failed:

[root@controller-0 ~]# ceph -s
    cluster c9e9f454-0ce9-11e8-a5d6-5254007feace
     health HEALTH_WARN
            too many PGs per OSD (480 > max 300)
     monmap e1: 3 mons at {controller-0=172.17.3.15:6789/0,controller-1=172.17.3.21:6789/0,controller-2=172.17.3.13:6789/0}
            election epoch 38, quorum 0,1,2 controller-2,controller-0,controller-1
     osdmap e48: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,recovery_deletes
      pgmap v3255: 480 pgs, 14 pools, 1588 bytes data, 171 objects
            138 MB used, 104 GB / 104 GB avail
                 480 active+clean
[root@controller-0 ~]# docker ps | grep ceph
61b91cc92326        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"         2 minutes ago       Up 2 minutes                                           ceph-mon-controller-0
dc6bbfae758b        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"         3 minutes ago       Up 3 minutes                                           ceph-rgw-controller-0
[root@controller-0 ~]# systemctl -a | grep ceph
  ceph-mon@controller-0.service                                                                         loaded    active   running   Ceph Monitor
  ceph-radosgw@rgw.controller-0.service                                                                 loaded    active   running   Ceph RGW
  system-ceph\x2dcreate\x2dkeys.slice                                                                   loaded    active   active    system-ceph\x2dcreate\x2dkeys.slice
  system-ceph\x2dmon.slice                                                                              loaded    active   active    system-ceph\x2dmon.slice
  system-ceph\x2dradosgw.slice                                                                          loaded    active   active    system-ceph\x2dradosgw.slice
  ceph-mds.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mon.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph-radosgw.target                                                                                   loaded    active   active    ceph target allowing to start/stop all ceph-radosgw@.service instances at once
  ceph.target                                                                                           loaded    active   active    ceph target allowing to start/stop all ceph*@.service
Comment 25 Yogev Rabl 2018-02-12 09:29:30 EST
The verification was successful after all.

[root@controller-2 ~]# systemctl status ceph-radosgw@rgw.controller-2.service
ceph-radosgw@rgw.controller-2.service - Ceph RGW
   Loaded: loaded (/etc/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-02-09 18:28:47 UTC; 2 days ago
 Main PID: 729787 (docker-current)
   CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw@rgw.controller-2.service
           └─729787 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -e RGW_CIVETWEB_IP=172.17.3.13 -v /etc/...
Comment 28 errata-xmlrpc 2018-02-21 14:46:24 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340
