Previously, when upgrading a containerized cluster to Red Hat Ceph Storage 3, the ceph-ansible utility failed to upgrade encrypted OSD nodes. As a consequence, such OSDs could not come up and journald logs included the following error message:
"Error response from daemon: No such container: expose_partitions_<disk>"
This bug has been fixed by modifying the underlying source code, and ceph-ansible now upgrades containerized encrypted OSDs as expected.
Additional information -
Because the cluster had only three OSD nodes, the playbook failed after retrying 40 times on the "waiting for clean pgs..." task, since all OSDs on one node were down.
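For reference, a minimal sketch of how to check whether the PGs are clean, which is roughly what that task polls for. The cluster name "humpty" is taken from the all.yml below; the ceph-mon-<hostname> container name pattern is an assumption for a containerized ceph-ansible deployment and may need adjusting:

  # Run on a monitor node; replace <hostname> with the mon host's short name (assumed container naming).
  docker exec ceph-mon-<hostname> ceph --cluster humpty -s
  docker exec ceph-mon-<hostname> ceph --cluster humpty pg stat   # expect all PGs to be active+clean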
(In reply to Harish NV Rao from comment #6)
> @Drew, Can you please add the summary of the decision made in yesterday's
> program call regarding this bug?
@Drew, a gentle reminder. Please note that this BZ still has a target release of 3.0.
Hi,
Upgrading from rhceph-3-rhel7 to
ceph-3.0-rhel-7-docker-candidate-38019-20180222163657 works fine using ceph-ansible-3.0.26-1.el7cp.noarch.
While upgrading a cluster whose OSDs have collocated journals, I hit the issue
reported in Bug 1548357 and followed the workaround mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1548357#c4 (list the mgr's name at
the top of the mons group in the inventory file); see the sketch below.
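A minimal inventory sketch of that workaround; the hostnames are hypothetical placeholders, not taken from this report:

  # Ansible inventory: the host carrying the mgr is listed first in [mons].
  [mgrs]
  node1

  [mons]
  node1    # mgr's name at the top of the mons group
  node2
  node3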
Moving to VERIFIED state.
Regards,
Vasishta Shatsry
AQE, Ceph
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:0473
Created attachment 1360383 [details]
File contains contents of OSD journald log snippet after enabling verbose

Description of problem:
Rolling update of a containerized cluster from 2.4 to 3.0 failed on dmcrypt OSDs; the OSDs failed searching for the container expose_partitions_<disk>.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch
Container image - rhceph:3-2

Steps to Reproduce:
1. Initialize a ceph 2.4 cluster.
2. Update ceph-ansible to 3.x and follow the documentation to update the cluster to 3.0.

Actual results:
OSDs fail to come up, reporting expose_partitions_sdd.

Expected results:
OSDs must be updated successfully.

Additional info:
OSD configuration as given in the inventory file -

<node> ceph_osd_docker_prepare_env="-e CLUSTER={{ cluster }} -e OSD_JOURNAL_SIZE={{ journal_size }} -e OSD_FORCE_ZAP=1 -e OSD_DMCRYPT=1" ceph_osd_docker_extra_env="-e CLUSTER={{ cluster }} -e CEPH_DAEMON=OSD_CEPH_DISK_ACTIVATE -e OSD_JOURNAL_SIZE={{ journal_size }} -e OSD_DMCRYPT=1" devices="['/dev/sdb','/dev/sdc','/dev/sdd']"

Contents of all.yml -

$ grep -Ev '(^#|^$)' group_vars/all.yml
---
dummy:
ceph_docker_image_tag: "3-2"
fetch_directory: ~/ceph-ansible-keys
cluster: humpty
monitor_interface: "eno1"
radosgw_interface: "eno1"
public_network: 10.8.128.0/21
docker: true
ceph_docker_image: "rhceph"
mon_containerized_deployment: true
ceph_mon_docker_interface: "eno1"
ceph_mon_docker_subnet: "{{ public_network }}"
ceph_docker_registry: "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888" #registry.access.redhat.com
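A hedged sketch of how the failure can be confirmed on an affected OSD node; the ceph-osd@<device> unit name pattern and the device sdd are assumptions based on the configuration above, not commands quoted from the report:

  # Inspect the journald log of the OSD service for the failing device
  # (unit name pattern assumed for a containerized ceph-ansible deployment).
  journalctl -u ceph-osd@sdd --no-pager | grep expose_partitions

  # Confirm that no expose_partitions container exists for the disk.
  docker ps -a --format '{{.Names}}' | grep expose_partitions || echo "no expose_partitions container found"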