Bug 1613626

Summary: [ceph-ansible] - rolling_update not upgrading containerized OSDs when osd_auto_discovery set to true
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Ceph-Ansible
Assignee: Sébastien Han <shan>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: high
Docs Contact: Aron Gunn <agunn>
Priority: high
Version: 3.0
CC: agunn, aschoen, ceph-eng-bugs, edonnell, flucifre, gmeno, hnallurv, jbrier, nthomas, sankarshan, tchandra
Target Milestone: rc
Keywords: Regression
Target Release: 3.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.1.0-0.1.rc18.el7cp Ubuntu: ceph-ansible_3.1.0~rc18-2redhat1
Doc Type: Bug Fix
Doc Text:
.Containerized OSDs for which the `osd_auto_discovery` flag was set to `true` properly restart during a rolling update
Previously, when using the Ansible rolling update playbook in a containerized environment, OSDs for which the `osd_auto_discovery` flag was set to `true` were not restarted, and the OSD services continued to run with the old image. With this release, the OSDs restart as expected.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-26 18:23:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1584264    
Attachments:
File contains contents of ansible-playbook log (flags: none)
File contains rolling update log snippet and related disk info (flags: none)

Description Vasishta 2018-08-08 03:16:21 UTC
Description of problem:
rolling_update not upgrading containerized OSDs when osd_auto_discovery set to true

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.39-1.el7cp.noarch
Tried to upgrade from registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest  to brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:3-12

How reproducible:
Always (2/2)

Steps to Reproduce:
1. Configure containerized OSDs of version n-1, with at least one OSD node having the osd_auto_discovery flag set to true
2. Initiate a rolling update to upgrade the cluster to the latest version
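
For reference, the rolling update in step 2 is normally initiated with the ceph-ansible rolling update playbook, roughly as in the following sketch (the playbook path and the ireallymeanit confirmation variable follow upstream ceph-ansible conventions; adjust the inventory path and working directory to your environment):

```
# Run from the ceph-ansible directory against the cluster inventory.
# The playbook prompts for confirmation unless ireallymeanit=yes is passed.
ansible-playbook -i /etc/ansible/hosts \
    infrastructure-playbooks/rolling_update.yml \
    -e ireallymeanit=yes
```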

Actual results:
OSDs are not upgraded

Expected results:
OSDs should be upgraded to the latest version

Additional info:
$ cat /etc/ansible/hosts

....
[osds]
....
magna023 osd_auto_discovery='true' osd_scenario="collocated" dmcrypt="true"
....
[nfss]
magna023

[ubuntu@magna020 ~]$ sudo docker ps
CONTAINER ID        IMAGE                                                              COMMAND             CREATED             STATUS              PORTS               NAMES
94fea1af2a54        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:3-12   "/entrypoint.sh"    8 hours ago         Up 8 hours                              ceph-mds-magna020
012aa349d25f        registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest            "/entrypoint.sh"    10 hours ago        Up 10 hours                             ceph-osd-magna020-sdc
2fae76123822        registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest            "/entrypoint.sh"    10 hours ago        Up 10 hours                             ceph-osd-magna020-sdb
5aff84bf4e11        registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest            "/entrypoint.sh"    10 hours ago        Up 10 hours                             ceph-osd-magna020-sdd

Comment 4 Vasishta 2018-08-08 03:31:05 UTC
Created attachment 1474137 [details]
File contains contents of ansible-playbook log

Comment 7 Sébastien Han 2018-08-09 13:23:35 UTC
Do you mind testing the patch I have upstream?
Thanks.

Comment 9 Sébastien Han 2018-08-10 12:40:49 UTC
In https://github.com/ceph/ceph-ansible/releases/tag/v3.0.42

Comment 10 Vasishta 2018-08-13 05:40:45 UTC
Created attachment 1475462 [details]
File contains rolling update log snippet and related disk info

Hi,

With ceph-ansible-3.1.0-0.1.rc17, I tried upstream file changes - https://github.com/ceph/ceph-ansible/pull/2995/files#diff-8bf493809fd7084ba4cfac427420678c

The rolling update is failing in the task TASK [ceph-osd : create gpt disk label] because *all the disks* are being selected for 'devices', instead of only the devices configured for OSDs.

Please move the bug status back to ASSIGNED, as the fix is not working.

Regards,
Vasishta Shastry
QE, Ceph

Comment 13 Sébastien Han 2018-08-14 13:10:58 UTC
PR is pending upstream, the fix in the new tag should be present today or tomorrow.

Comment 14 Christina Meno 2018-08-14 20:54:47 UTC
Federico,

osd_auto_discovery is used when you don't specify the devices.

From the docs:
Optionally, use the devices parameter to specify devices that the OSD nodes will use. Use a comma-separated list to list multiple devices.

[osds]
<ceph-host-name> devices="[ '<device_1>', '<device_2>' ]"

So in containerized setups where the customer doesn't specify which devices to use in the inventory, this bug would be triggered.
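
To illustrate the distinction, a hypothetical inventory sketch (host names are placeholders): with an explicit devices list only the named disks become OSDs, while with osd_auto_discovery ceph-ansible scans the host for usable disks itself:

```
[osds]
# Explicit device list: only these disks are prepared as OSDs
host1 devices="[ '/dev/sdb', '/dev/sdc' ]" osd_scenario="collocated"
# Auto-discovery: no devices list; ceph-ansible discovers unused disks
host2 osd_auto_discovery='true' osd_scenario="collocated"
```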

cheers,
G

Comment 20 Vasishta 2018-08-17 09:39:12 UTC
Working fine with ceph-ansible-3.1.0-0.1.rc18.el7cp.noarch
Moving to VERIFIED state.

Regards,
Vasishta Shastry
QE, Ceph

Comment 22 Sébastien Han 2018-09-25 11:58:57 UTC
lgtm thanks

Comment 24 errata-xmlrpc 2018-09-26 18:23:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2819