Bug 1366808
Summary: Ansible: upgrade nodes with encrypted OSDs and support for dm-crypt
Product: [Red Hat Storage] Red Hat Storage Console
Component: ceph-ansible
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Version: 2
Target Milestone: ---
Target Release: 2
Hardware: Unspecified
OS: Unspecified
Reporter: Federico Lucifredi <flucifre>
Assignee: seb
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Docs Contact: Bara Ancincova <bancinco>
CC: adeza, aschoen, bancinco, ceph-eng-bugs, flucifre, gmeno, hnallurv, kdreyer, ldachary, nlevine, nthomas, rghatvis, sankarshan, seb, shan, tchandra, vashastr, vumrao
Doc Type: Bug Fix
Doc Text:
.Upgrading encrypted OSDs is now supported
Previously, the `ceph-ansible` utility did not support adding encrypted OSD nodes. As a consequence, an attempt to upgrade to a newer minor or major version failed on encrypted OSD nodes. In addition, Ansible returned the following error message during the disk activation task:
----
mount: unknown filesystem type 'crypto_LUKS'
----
With this update, `ceph-ansible` supports adding encrypted OSD nodes, and upgrading works as expected.
Story Points: ---
Last Closed: 2017-03-14 15:50:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Depends On: 1388885
Bug Blocks: 1322504, 1346350, 1383917, 1412948
Description (Federico Lucifredi, 2016-08-12 22:36:18 UTC):
This should be targeted at the first async, but I only have releases 2 and 3 as options.

@Rakesh, I just did, let me know if it's clear.

Ken, I'll ping you tomorrow, there is something I'm not sure about with the cherry-pick stuff :)

Looks good to me. I just added "during the disk activation task" to point to exactly where the playbook will fail.

Yup, fixed with v1.0.8.

This will ship concurrently with RHCS 2.1.

Hi Bara,

The doc text needs changing: the current version of ceph-ansible (ceph-ansible-1.0.5-39.el7scon) supports encrypted OSDs. Also, I ran rolling_update.yml, and commenting out roles is not needed anymore. @Seb, can you please confirm?

Thanks,
Tejas

LGTM Bara! Thanks!

Docs, please make sure the workaround documented in 2.0 for use of encrypted OSDs is also documented in the 2.1 release notes.

Hi Seb,

Doing a rolling_update with the workaround that Bara mentioned in the doc text (commenting out all roles except ceph-common) causes a failure on the MON update. I have attached the log here.

Thanks,
Tejas

Created attachment 1220469 [details]
workaround with encrypted OSD's rolling update
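For context, the workaround being discussed amounts to editing the plays in rolling_update.yml so that only ceph-common runs against the affected nodes. The excerpt below is a hypothetical, simplified sketch of what that edit looks like, not the actual shipped file; the real play layout differs between ceph-ansible versions:

----
# Hypothetical, simplified OSD play from rolling_update.yml with the
# workaround applied: the ceph-osd role is commented out, so the update
# only refreshes packages and configuration via ceph-common.
- hosts: osds
  become: true
  roles:
    - ceph-common
    # - ceph-osd   # commented out per the workaround
----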
@Bara, please update the doc: you only need to comment out the ceph-osd role. The others are fine, since "only" OSD nodes are impacted.

Tejas, I see two issues:

1. ceph.conf changed and tried to trigger a restart, which shouldn't be a problem if the ceph-mon role is called.
2. The ceph-mon role was commented out.

Can you try without commenting out ceph-mon?

Seb,

I tried without commenting out any role, but I ran into this: https://bugzilla.redhat.com/show_bug.cgi?id=1391675. The MON updates go through fine, but I can't be certain about the OSD upgrades until that bug is resolved.

Thanks,
Tejas

I added a Depends On: https://bugzilla.redhat.com/show_bug.cgi?id=1394928. The workaround Bara documented depends on https://bugzilla.redhat.com/show_bug.cgi?id=1394928.

Created attachment 1220807 [details]
ansible playbook log with ceph-osd commented
*** Bug 1395171 has been marked as a duplicate of this bug. ***

This doesn't seem to be an issue with Ansible; it looks more like a package upgrade problem. It seems that the dm device's permissions were changed from ceph:ceph to root:disk. The udev rules look good, so I'm not sure what changed the permissions.

It's a blocker -- we're asking for help and forming a plan, you'll know more as soon as we do.

Could http://tracker.ceph.com/issues/17813 help?

Changing version to 2 so this bug shows up in our queries as it should.

After investigating this further, we determined that the one OSD having issues was failing because its journal had root permissions:

# ls -alh /dev/dm-0
brw-rw----. 1 root disk 253, 0 Nov 15 11:19 /dev/dm-0

We got to /dev/dm-0 because the journal points to:

lrwxrwxrwx. 1 ceph ceph 48 Nov 15 06:04 journal -> /dev/mapper/cd92e354-e9ed-4207-bbfd-62e6778d3838

And the mapper points to /dev/dm-0:

lrwxrwxrwx. 1 root root 7 Nov 15 11:19 /dev/mapper/cd92e354-e9ed-4207-bbfd-62e6778d3838 -> ../dm-0

In order to get this working, we had to do the following manually:

1) Set the right permissions on the actual device:
   chown -R ceph:ceph /dev/dm-0

2) Reset the failure state for the ceph-osd service:
   systemctl reset-failed ceph-osd

3) Start the OSD daemon:
   systemctl start ceph-osd

4) Verify that the OSD daemon came up and shows in the OSD tree:
   systemctl status ceph-osd
   ceph osd tree

Once all those steps were completed, we were able to run the rolling upgrade playbook again, but the cluster seems to be in a degraded state (unsure how this particular cluster got there) and the playbook is unable to complete.

This looks like a problem originating from the system udev rules, which are racing to change permissions on these devices (see the linked tracker ticket). This leaves us with no guarantee that manually changing permissions in this way will persist.

At this point we would need to test this again with a new, healthy cluster, or wait for the current one to get out of its degraded state, to verify that the playbook can complete correctly.
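If those manual steps had to be repeated across nodes, they could hypothetically be wrapped in a small recovery play like the sketch below. Nothing like this ships in ceph-ansible; journal_device and osd_unit are made-up placeholders for whatever the affected node actually uses:

----
# Hypothetical recovery play mirroring the manual steps above.
- hosts: osds
  become: true
  vars:
    journal_device: /dev/dm-0   # placeholder: the mis-owned journal device
    osd_unit: ceph-osd          # placeholder: real units are usually ceph-osd@<id>
  tasks:
    - name: Step 1. Give the journal device back to the ceph user
      file:
        path: "{{ journal_device }}"
        owner: ceph
        group: ceph

    - name: Step 2. Clear the failed state from the earlier start attempts
      command: systemctl reset-failed {{ osd_unit }}

    - name: Step 3. Start the OSD daemon again
      service:
        name: "{{ osd_unit }}"
        state: started
----

Step 4 (checking systemctl status and ceph osd tree) would remain a manual verification, since whether the OSD rejoins cleanly depends on overall cluster health.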
For the record, an IRC log of inconclusive explorations:

<loicd> andrewschoen: I commented on the bz but I still don't know how to workaround this race with dm.
<andrewschoen> loicd: we're trying a workaround of manually fixing the permissions and then rerunning the update right now
<andrewschoen> we did get the OSD back up with that
<loicd> ideally there would be a way to override the default user/group for a given device
<loicd> so that we don't have to fight udev over it
<loicd> but maybe the root:root default user/group is hardcoded
<loicd> or if that's not possible, it would be useful to ask udev to never chown devices
<loicd> /lib/udev/rules.d/50-udev-default.rules
<loicd> it only changes the group but it does a chown(2) instead of a chgrp(2) and changes both
<loicd> systemd-229/src/udev/udev-rules.c is where it is interpreted
<loicd> although 7.2 is running 219 I guess it did not change much
<loicd> andrewschoen: if you can consistently reproduce the problem, it may be worth trying to comment out all lines in /lib/udev/rules.d/50-udev-default.rules that have GROUP="disk" and see if that fixes the problem. It does not solve the problem of packaging this workaround but ...
<andrewschoen> loicd: at this point we can't run the playbook again because the cluster is not healthy, we're gonna wait and see if it gets healthy and then try to run the upgrade again
<loicd> andrewschoen: ... it would confirm that it originates here
<loicd> andrewschoen: do you have a minimal reproducer ?
<alfredodeza> not right now loicd
-*- loicd installing a cluster
<andrewschoen> the strange thing is that in the run that found this 3 OSDs were upgraded and started just fine before this one failed
<loicd> since it's impossible to guarantee the sequence in which udev events are fired, it is enough that something (anything) triggers the event that runs 50-udev-default.rules to revert the permissions
<loicd> it would help to have the output of udevadm monitor to see the sequence of udev events
<loicd> it should show that 50-udev-default.rules happens on the device before and *after* 95-ceph-osd.rules. If not ... it means the theory that 50-udev-default.rules is responsible for reverting permissions is false.
<loicd> andrewschoen: do you confirm the playbook does not, at any time, run partprobe or partx on the devices ?
<loicd> I guess it does not, otherwise this problem would have surfaced often.
<andrewschoen> loicd: I know for certain that this specific playbook run does not because the ceph-osd role is not in use for rolling upgrades of dmcrypt osds
<andrewschoen> loicd: I can't find use of partprobe or partx in the ceph-osd role either
<loicd> andrewschoen: from which ceph version to which ceph version ?
<loicd> I thought maybe the somewhat recent re-addition of 60-ceph-by-parttypeuuid.rules would cause problems but it does not contain OWNER/GROUP/MODE
<andrewschoen> yeah, so 10.2.2 to 10.2.3
-*- loicd looking at http://tracker.ceph.com/versions/518
<loicd> https://github.com/ceph/ceph/pull/8754/files is about permissions but not about devices
<loicd> https://github.com/ceph/ceph/pull/10008/files is in the vicinity (parted) but I don't see any harm
<loicd> https://github.com/ceph/ceph/pull/10497/files is about partprobe but prevents a race which is good
<loicd> I don't see anything in v10.2.3 http://tracker.ceph.com/versions/518 that could, even remotely, introduce a regression on device permissions
<ceph-ircslackbot> <sage> loicd it could just be a timing thing?
<loicd> sage: I don't see how. The only thing I can think of is that too many udev events are fired. And if that's confirmed we'll have to figure out who's firing them.
<loicd> sage: a) udev fires an "add" event, gets to default.rules which chowns root:disk, then gets to ceph.rules which chowns ceph:ceph, b) ceph activate runs in the background, c) "something" fires udev modified, d) default.rules chowns root:disk, e) ceph-disk activate fails because of the permission
<loicd> hum not even
<loicd> because another ceph-disk activate would then be run, wait for the first to finish failing and run after the chown is done
<loicd> because 50-udev-default.rules won't change permission on anything but udev add ( ACTION!="add", GOTO="default_permissions_end" )
-*- loicd out of ideas
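Loïc's suggested experiment (commenting out the GROUP="disk" lines in 50-udev-default.rules to see whether the permission flips stop) never appears as code in this bug. Automated with Ansible, it might look like the sketch below; it is a diagnostic only, not a fix, and modifying files under /lib/udev is not something to package:

----
# Hypothetical diagnostic play for the experiment suggested in the IRC log.
# If the ownership flips stop after this, 50-udev-default.rules is the culprit.
- hosts: osds
  become: true
  tasks:
    - name: Comment out every GROUP="disk" line in the default udev rules
      replace:
        path: /lib/udev/rules.d/50-udev-default.rules
        regexp: '^([^#].*GROUP="disk".*)$'
        replace: '# \1'

    - name: Reload udev rules so the change takes effect
      command: udevadm control --reload
----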
As a temporary workaround, we could force the right permissions on the journal devices after the ceph-common role has been applied. If it's urgent and a blocker, we can either:

* document this as a well-known but hard-to-reproduce issue
* introduce the change in the playbook so we keep the right permissions

What do you guys think?
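The bug never shows how that second option would be implemented, but as a sketch, a task run after ceph-common could re-assert ownership explicitly. journal_devices here is a hypothetical variable listing the affected dm-crypt journal devices:

----
# Hypothetical workaround task, run after the ceph-common role: re-assert
# ceph:ceph ownership on the encrypted journal devices in case a stray
# udev event reverted them to root:disk during the package upgrade.
- name: Force ceph:ceph ownership on dm-crypt journal devices
  file:
    path: "{{ item }}"
    owner: ceph
    group: ceph
  with_items: "{{ journal_devices | default([]) }}"   # e.g. ['/dev/dm-0']
----

Even then, as the IRC discussion notes, a later udev event could revert the change again, which is why this was only ever a stopgap.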
Seb, I had a question: can the permissions of the osd-lockbox cause this, since /dev/sdb3 is mounted and its permissions are root:disk?

[root@magna056 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       917G  2.7G  868G   1% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  8.6M   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
tmpfs           3.2G     0  3.2G   0% /run/user/1000
/dev/sdb3       8.7M  179K  7.9M   3% /var/lib/ceph/osd-lockbox/6b074d80-96e4-4a1a-b195-0b7224b541b9
/dev/dm-1       922G   34M  922G   1% /var/lib/ceph/osd/ceph-2

[root@magna056 ~]# ceph-disk list
/dev/dm-0 other, unknown
/dev/dm-1 other, xfs, mounted on /var/lib/ceph/osd/ceph-2
/dev/sda :
 /dev/sda1 other, ext4, mounted on /
/dev/sdb :
 /dev/sdb2 ceph journal (dmcrypt LUKS /dev/dm-0), for /dev/sdb1
 /dev/sdb3 ceph lockbox, active, for /dev/sdb1
 /dev/sdb1 ceph data (dmcrypt LUKS /dev/dm-1), cluster ceph, osd.2, journal /dev/sdb2
/dev/sdc other, unknown
/dev/sdd other, unknown

[root@magna056 ~]# ll /dev/dm*
brw-rw----. 1 ceph ceph 253, 0 Nov 16 10:33 /dev/dm-0
brw-rw----. 1 ceph ceph 253, 1 Nov 16 10:33 /dev/dm-1

[root@magna056 ~]# ll /dev/sdb*
brw-rw----. 1 root disk 8, 16 Nov 16 10:33 /dev/sdb
brw-rw----. 1 ceph ceph 8, 17 Nov 16 10:33 /dev/sdb1
brw-rw----. 1 ceph ceph 8, 18 Nov 16 10:33 /dev/sdb2
brw-rw----. 1 root disk 8, 19 Nov 16 10:33 /dev/sdb3

Thanks,
Tejas

I don't know; we just know this has something to do with udev. So udev might have caused this for this particular OSD and all its dependencies.

Seb, we're recommending documenting as per https://bugzilla.redhat.com/show_bug.cgi?id=1366808#c37

Alright, thanks Greg!

(In reply to Alfredo Deza from comment #37)
> Once all those steps were completed, we were able to run the rolling
> upgrade playbook again, but the cluster seems to be in a degraded state
> (unsure how this particular cluster got there) and the playbook is
> unable to complete.

The cluster is able to reach an OK state after following these steps.

> At this point we would need to test this again with a new, healthy
> cluster, or wait for the current one to get out of its degraded state,
> to verify that the playbook can complete correctly.

I ran again with a fresh cluster and saw the same issue on the first OSD node. Ran the above steps, and the cluster was OK again.

After running rolling_update again, there were no issues on the first node, but I saw a failure on the second node, and so on.

Another observation: only the /dev/dm* devices pertaining to colocated journals had their permissions changed to root:root. The dm devices of dedicated journals were not changed.

Thanks,
Tejas

I recommend that we use yum update on failed nodes instead of another invocation of rolling_update, because of https://bugzilla.redhat.com/show_bug.cgi?id=1395820. We've already got that process documented in the upgrade guide for 1.3 to 2.0. Federico, what do you think?

We will document this issue by telling customers to do a rolling update for clusters that do not have encrypted OSDs. If a cluster has encrypted OSDs, then customers should do a yum update (like we advised in the upgrade from 1.3 to 2.0 steps).

lgtm :)

Seb and Andrew, what if we wrote some Ansible to make sure the ceph user was in the disk group?

Gregory, I also see root:root in the comments here, so I'm unsure whether making sure that the ceph user is in the disk group would be enough. Seb had an idea in https://bugzilla.redhat.com/show_bug.cgi?id=1366808#c39; maybe that would work?

Seb, any thoughts on a workaround in ceph-ansible?

Tejas, would you please test this BZ again with the latest v2.1.9 version of ceph-ansible? I'm wondering if this is even still an issue; I've been working on upstream tests for this and haven't been able to reproduce it yet.

Thanks!
Andrew

As part of Ceph 2.2 testing, I have tested upgrades from 1.3.3 to 2.2 and from 2.1 to 2.2, all with colocated-journal and dedicated-journal encrypted OSDs, on both RHEL and Ubuntu. So far no issues have been seen. So yes, this has been resolved in the rebased ceph-ansible.

Thanks,
Tejas

Bara, I don't think this needs doc text anymore. Moving this bug to Verified state.

Thanks,
Tejas

Bara, the doc text field looks good to me.

*** Bug 1373736 has been marked as a duplicate of this bug. ***

*** Bug 1391468 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:0515