Bug 1518788
| Field | Value |
|---|---|
| Summary | [ceph-ansible] [ceph-container]: rolling update of dmcrypt OSDs failed searching for container expose_partitions_<disk> |
| Product | [Red Hat Storage] Red Hat Ceph Storage |
| Reporter | Vasishta <vashastr> |
| Component | Container |
| Assignee | Sébastien Han <shan> |
| Status | CLOSED ERRATA |
| QA Contact | Vasishta <vashastr> |
| Severity | high |
| Docs Contact | Aron Gunn <agunn> |
| Priority | high |
| Version | 3.0 |
| CC | adeza, agunn, anharris, aschoen, ceph-eng-bugs, dang, gmeno, hchen, hnallurv, jim.curtis, kdreyer, me, nthomas, pprakash, sankarshan, shan |
| Target Milestone | z1 |
| Target Release | 3.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | rhceph:ceph-3.0-rhel-7-docker-candidate-53483-20180117211610 |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2018-03-08 15:46:22 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 1494421 |
| Attachments | |

Doc Text:

Previously, when upgrading a containerized cluster to Red Hat Ceph Storage 3, the ceph-ansible utility failed to upgrade encrypted OSD nodes. As a consequence, such OSDs could not come up and journald logs included the following error message:

"Error response from daemon: No such container: expose_partitions_<disk>"

This bug has been fixed by modifying the underlying source code, and ceph-ansible upgrades containerized encrypted OSDs as expected.
Additional information: since there were only three OSD nodes, the playbook failed after retrying 40 times on the task "waiting for clean pgs...", because all OSDs on one node were down.

That's a nasty bug with no workaround at the moment :/ Only the patch will fix the issue.

@Drew, can you please add a summary of the decision made in yesterday's program call regarding this bug?

(In reply to Harish NV Rao from comment #6)
> @Drew, Can you please add the summary of the decision made in yesterday's
> program call regarding this bug?

@Drew, a gentle reminder. Please note that this BZ still has a target release of 3.0.

We will release note this for 3.0 and target it for the next async.

I'm tracking this bug manually for now until we have a new target location for it.

lgtm

Hi,

Working fine using ceph-ansible-3.0.26-1.el7cp.noarch to upgrade to ceph-3.0-rhel-7-docker-candidate-38019-20180222163657 from rhceph-3-rhel7.

While upgrading a cluster that has OSDs with collocated journals, I hit the issue reported in Bug 1548357 and followed the workaround mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1548357#c4, which is to list the mgr's name at the top of the mons group in the inventory file (a sketch of this layout follows below).

Moving to VERIFIED state.

Regards,
Vasishta Shatsry
AQE, Ceph

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0473
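For clarity, here is a minimal sketch of the inventory layout that the Bug 1548357 workaround describes. The hostnames and the [mgrs] group are illustrative assumptions, not taken from this report; the only detail carried over from the comment above is that the host running the mgr daemon is listed first in the [mons] group.

```ini
# Hypothetical ceph-ansible inventory sketch of the Bug 1548357 workaround.
# Hostnames are placeholders; the key detail is that the monitor host that
# also runs the mgr daemon appears first in the [mons] group.
[mons]
mon1-mgr        # also listed under [mgrs], so it goes at the top of [mons]
mon2
mon3

[mgrs]
mon1-mgr

[osds]
osd1
osd2
osd3
```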
Created attachment 1360383 [details]
File contains contents of OSD journald log snippet after enabling verbose logging

Description of problem:
Rolling update of a containerized cluster from 2.4 to 3.0 failed on dmcrypt OSDs while searching for the container expose_partitions_<disk>.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.14-1.el7cp.noarch
Container image - rhceph:3-2

Steps to Reproduce:
1. Initialize a Ceph 2.4 cluster.
2. Update ceph-ansible to 3.x and follow the documentation to update the cluster to 3.0 (a command-level sketch follows at the end of this description).

Actual results:
OSDs are failing to come up, with the error referring to expose_partitions_sdd.

Expected results:
OSDs must get updated successfully.

Additional info:

OSD configuration as mentioned in the inventory file -

    <node> ceph_osd_docker_prepare_env="-e CLUSTER={{ cluster }} -e OSD_JOURNAL_SIZE={{ journal_size }} -e OSD_FORCE_ZAP=1 -e OSD_DMCRYPT=1" ceph_osd_docker_extra_env="-e CLUSTER={{ cluster }} -e CEPH_DAEMON=OSD_CEPH_DISK_ACTIVATE -e OSD_JOURNAL_SIZE={{ journal_size }} -e OSD_DMCRYPT=1" devices="['/dev/sdb','/dev/sdc','/dev/sdd']"

Contents of all.yml -

    $ grep -Ev '(^#|^$)' group_vars/all.yml
    ---
    dummy:
    ceph_docker_image_tag: "3-2"
    fetch_directory: ~/ceph-ansible-keys
    cluster: humpty
    monitor_interface: "eno1"
    radosgw_interface: "eno1"
    public_network: 10.8.128.0/21
    docker: true
    ceph_docker_image: "rhceph"
    mon_containerized_deployment: true
    ceph_mon_docker_interface: "eno1"
    ceph_mon_docker_subnet: "{{ public_network }}"
    ceph_docker_registry: "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888" #registry.access.redhat.com
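For context on step 2 and the reported failure, a minimal command-level sketch follows. None of this is taken from the report: the playbook path and the ireallymeanit confirmation variable reflect upstream ceph-ansible 3.x, the inventory path is a placeholder, and the ceph-osd@sdd unit and device names are assumptions for a containerized, ceph-disk-style OSD; adjust them to the documented Red Hat Ceph Storage 3.0 upgrade procedure.

```sh
# Sketch only (assumptions noted above): drive the rolling update from the
# ceph-ansible directory, skipping the interactive confirmation prompt.
cd /usr/share/ceph-ansible
ansible-playbook infrastructure-playbooks/rolling_update.yml \
    -i <inventory-file> -e ireallymeanit=yes

# Sketch only: on an affected OSD node, inspect the containerized OSD unit
# and confirm the missing expose_partitions_<disk> container reported here.
systemctl status ceph-osd@sdd        # unit/device name is illustrative
journalctl -u ceph-osd@sdd | grep 'No such container'
docker ps -a | grep expose_partitions
```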