
Bug 1525209

Summary: During upgrade, ceph-ansible does not disable the radosgw system service
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Keith Schincke <kschinck>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: high
Docs Contact: Bara Ancincova <bancinco>
Priority: unspecified
Version: 3.0
CC: adeza, agunn, aschoen, ceph-eng-bugs, ceph-qe-bugs, gabrioux, gfidente, gkadam, gmeno, hnallurv, kdreyer, kschinck, nthomas, sankarshan, yrabl
Target Milestone: rc
Keywords: Triaged
Target Release: 2.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.0.18-1.el7cp; Ubuntu: ceph-ansible_3.0.18-2redhat1
Doc Type: Bug Fix
Doc Text:
.`ceph-ansible` now disables the Ceph Object Gateway service as expected when upgrading the OpenStack container
When upgrading the OpenStack container from version 11 to 12, the `ceph-ansible` utility did not properly disable the Ceph Object Gateway service provided by the overcloud image. Consequently, the containerized Ceph Object Gateway service entered a failed state because the port it used was bound. The `ceph-ansible` utility has been updated to properly disable the system Ceph Object Gateway service. As a result, the containerized Ceph Object Gateway service starts as expected after upgrading the OpenStack container from version 11 to 12.
Story Points: ---
Clone Of:
Clones: 1528430, 1539738 (view as bug list)
Environment:
Last Closed: 2018-02-21 19:46:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1528430    
Bug Blocks: 1536401, 1539738    

Description Keith Schincke 2017-12-12 19:28:55 UTC
Description of problem:
When upgrading from OSP11 to containerized OSP12, ceph-ansible attempts to disable the RGW service provided by the overcloud image. The task attempts to stop and disable ceph-rgw@{{ ansible_hostname }} and ceph-radosgw@{{ ansible_hostname }}.service, but the actual service name is ceph-radosgw@radosgw.$name.
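
For illustration only, the mismatch can be seen with systemctl on an affected controller; the real instance name (written here as radosgw.$name, as above) varies per deployment, so these commands are a sketch rather than the playbook's exact logic:

# Unit name the playbook derives from the host name -- this instance does not exist:
systemctl status ceph-radosgw@$(hostname -s).service

# Unit that is actually present; list it, then stop and disable that instance
# (substitute the instance name reported by list-units for radosgw.$name):
systemctl list-units 'ceph-radosgw@*'
systemctl stop ceph-radosgw@radosgw.$name.service
systemctl disable ceph-radosgw@radosgw.$name.service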


Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch

How reproducible:


Steps to Reproduce:
1. Install OSP11
2. Upgrade to OSP12
3.

Actual results:
The radosgw system service remains enabled. This causes the radosgw container service to go into a failed state because the port is already bound.


Expected results:
The radosgw system service is disabled.
The radosgw container service is running.
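
Both expectations can be checked on a controller after the upgrade with commands like the following (illustrative; port 8080 is the RGW frontend port in this environment):

# The host-level radosgw unit should no longer be active:
systemctl list-units --all 'ceph-radosgw*'

# The containerized RGW should be up:
docker ps | grep ceph-rgw

# Port 8080 should be held by the container's radosgw (and haproxy), not by a host radosgw:
netstat -tuplan4 | grep 8080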


Additional info:
Working to replicate the reported issue.

Comment 1 Sébastien Han 2017-12-15 10:52:59 UTC
Just to be sure, is it when running rolling_update.yml?

Comment 2 Giulio Fidente 2017-12-15 15:28:38 UTC
(In reply to leseb from comment #1)
> Just to be sure, is it when running rolling_update.yml?

No, this is with switch-from-non-containerized-to-containerized-ceph-daemons.yml
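
For reference, that playbook ships under infrastructure-playbooks/ in the ceph-ansible tree; outside of the director-driven workflow it is typically run along these lines (the inventory path is only a placeholder):

ansible-playbook -i /path/to/inventory \
    infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml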

Comment 12 Yogev Rabl 2018-01-16 19:47:11 UTC
The verification failed.

The rgw system service is running and causes the rgw container to fail, since both use the same port.

Post-upgrade output:
[root@controller-1 ~]# systemctl -a | grep ceph
ceph-radosgw.service                                                                  loaded    active     running      Ceph rados gateway
  ceph-radosgw.service                                                                 loaded    activating auto-restart Ceph RGW

# netstat -tuplan4 | grep 8080 
tcp        0      0 172.17.3.19:8080        0.0.0.0:*               LISTEN      158261/haproxy
tcp        0      0 10.0.0.110:8080         0.0.0.0:*               LISTEN      158261/haproxy
tcp        0      0 172.17.3.20:8080        0.0.0.0:*               LISTEN      93050/radosgw

Comment 13 Sébastien Han 2018-01-18 14:35:27 UTC
The fix is in https://github.com/ceph/ceph-ansible/releases/tag/v3.0.18
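
The shape of the correction is to act on whatever ceph-radosgw@<instance> unit is actually loaded instead of one guessed from the hostname. A rough shell sketch of that idea (not the literal change shipped in v3.0.18):

# Find every loaded ceph-radosgw@ instance and stop/disable it by its real name.
systemctl list-units --all --no-legend 'ceph-radosgw@*' \
  | grep -o 'ceph-radosgw@[^ ]*\.service' \
  | while read -r unit; do
        systemctl stop "$unit"
        systemctl disable "$unit"
    done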

Comment 15 Yogev Rabl 2018-01-22 20:43:08 UTC
The verification failed.

Comment 17 Guillaume Abrioux 2018-01-29 17:01:39 UTC
Tested with v3.0.21 and it passed.

@Yogev, I think we can move this BZ to 'VERIFIED'?

Comment 21 Guillaume Abrioux 2018-02-02 21:29:13 UTC
Hi Bara,

It looks good to me.

Comment 22 Yogev Rabl 2018-02-07 16:43:19 UTC
I have come across a problem in the upgrade that is not related to this bug; we are checking for a workaround and will then verify this.

Comment 23 Yogev Rabl 2018-02-09 02:10:17 UTC
This is a work in progress; I will test it tomorrow at the latest.

Comment 24 Yogev Rabl 2018-02-09 18:41:14 UTC
With the latest version of ceph-ansible and the latest ceph docker image, the verification failed:

[root@controller-0 ~]# ceph -s
    cluster c9e9f454-0ce9-11e8-a5d6-5254007feace
     health HEALTH_WARN
            too many PGs per OSD (480 > max 300)
     monmap e1: 3 mons at {controller-0=172.17.3.15:6789/0,controller-1=172.17.3.21:6789/0,controller-2=172.17.3.13:6789/0}
            election epoch 38, quorum 0,1,2 controller-2,controller-0,controller-1
     osdmap e48: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,recovery_deletes
      pgmap v3255: 480 pgs, 14 pools, 1588 bytes data, 171 objects
            138 MB used, 104 GB / 104 GB avail
                 480 active+clean
[root@controller-0 ~]# docker ps | grep ceph
61b91cc92326        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"         2 minutes ago       Up 2 minutes                                           ceph-mon-controller-0
dc6bbfae758b        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-2-rhel-7-docker-candidate-81064-20180205070134   "/entrypoint.sh"         3 minutes ago       Up 3 minutes                                           ceph-rgw-controller-0
[root@controller-0 ~]# systemctl -a | grep ceph
  ceph-mon                                                                         loaded    active   running   Ceph Monitor
  ceph-radosgw.service                                                                 loaded    active   running   Ceph RGW
  system-ceph\x2dcreate\x2dkeys.slice                                                                   loaded    active   active    system-ceph\x2dcreate\x2dkeys.slice
  system-ceph\x2dmon.slice                                                                              loaded    active   active    system-ceph\x2dmon.slice
  system-ceph\x2dradosgw.slice                                                                          loaded    active   active    system-ceph\x2dradosgw.slice
  ceph-mds.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-mds@.service instances at once
  ceph-mon.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-mon@.service instances at once
  ceph-osd.target                                                                                       loaded    active   active    ceph target allowing to start/stop all ceph-osd@.service instances at once
  ceph-radosgw.target                                                                                   loaded    active   active    ceph target allowing to start/stop all ceph-radosgw@.service instances at once
  ceph.target                                                                                           loaded    active   active    ceph target allowing to start/stop all ceph*@.service

Comment 25 Yogev Rabl 2018-02-12 14:29:30 UTC
The verification was successful after all.

[root@controller-2 ~]# systemctl status ceph-radosgw.service
● ceph-radosgw.service - Ceph RGW
   Loaded: loaded (/etc/systemd/system/ceph-radosgw@.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-02-09 18:28:47 UTC; 2 days ago
 Main PID: 729787 (docker-current)
   CGroup: /system.slice/system-ceph\x2dradosgw.slice/ceph-radosgw.service
           └─729787 /usr/bin/docker-current run --rm --net=host --memory=1g --cpu-quota=100000 -v /var/lib/ceph:/var/lib/ceph -v /etc/ceph:/etc/ceph -e RGW_CIVETWEB_IP=172.17.3.13 -v /etc/...

Comment 28 errata-xmlrpc 2018-02-21 19:46:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340