Bug 1391920 - [ceph-ansible] Encrypted OSD creation fails with collocated journal and custom cluster name
Status: CLOSED ERRATA
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Disk
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 2.3
Assigned To: leseb
QA Contact: Vasishta
Docs Contact: Erin Donnelly
Duplicates: 1451168 1452316
Depends On:
Blocks: 1412948 1437916
 
Reported: 2016-11-04 07:22 EDT by Vasishta
Modified: 2017-07-30 10:57 EDT
CC: 16 users

See Also:
Fixed In Version: RHEL: ceph-10.2.7-21.el7cp Ubuntu: ceph_10.2.7-23redhat1
Doc Type: Bug Fix
Doc Text:
.Ansible and "ceph-disk" no longer fail to create encrypted OSDs if the cluster name is different than "ceph"
Previously, the `ceph-disk` utility did not support configuring the `dmcrypt` utility if the cluster name was different than "ceph". Consequently, it was not possible to use the `ceph-ansible` utility to create encrypted OSDs with a custom cluster name. This bug has been fixed, and custom cluster names can now be used.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-06-19 09:27:29 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker: Ceph Project Bug Tracker 17821, Priority: None, Status: None, Summary: None, Last Updated: 2017-05-23 14:02 EDT
Tracker: Red Hat Product Errata RHBA-2017:1497, Priority: normal, Status: SHIPPED_LIVE, Summary: Red Hat Ceph Storage 2.3 bug fix and enhancement update, Last Updated: 2017-06-19 13:24:11 EDT

Description Vasishta 2016-11-04 07:22:04 EDT
Description of problem:
Encrypted OSD creation fails with collocated journal and custom cluster name.

Version-Release number of selected component (if applicable):
ceph-ansible-1.0.5-39.el7scon.noarch

How reproducible:
always

Steps to Reproduce:
1. Install ceph-ansible
2. Change the following settings in the /usr/share/ceph-ansible/group_vars/osds file:

   dmcrypt_journal_collocation: true
   devices:
     - /dev/sdb
     - /dev/sdc
     - /dev/sdd
3. Run the playbook (see the sketch below).
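
For reference, a minimal sketch of step 3, assuming the site.yml.sample playbook shipped with ceph-ansible is copied to site.yml and a hosts inventory is already configured (both assumptions, not stated in this report):

    # Run from the ceph-ansible install directory (path from step 2).
    cd /usr/share/ceph-ansible
    # Assumed: the shipped sample playbook is used as-is.
    cp site.yml.sample site.yml
    ansible-playbook site.yml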

Actual results:

TASK: [ceph-osd | manually prepare osd disk(s) (dmcrypt)] ********************* 
failed: [magna030] => (item=[{u'cmd': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", u'end': u'2016-11-04 11:01:40.252042', 'failed': False, u'stdout': u'', u'changed': False, u'rc': 1, u'start': u'2016-11-04 11:01:40.157490', 'item': '/dev/sdb', u'warnings': [], u'delta': u'0:00:00.094552', 'invocation': {'module_name': u'shell', 'module_complex_args': {}, 'module_args': u"parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'"}, 'stdout_lines': [], 'failed_when_result': False, u'stderr': u''}, {u'cmd': u"echo '/dev/sdb' | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", u'end': u'2016-11-04 11:01:38.796395', 'failed': False, u'stdout': u'', u'changed': False, u'rc': 1, u'start': u'2016-11-04 11:01:38.791004', 'item': '/dev/sdb', u'warnings': [], u'delta': u'0:00:00.005391', 'invocation': {'module_name': u'shell', 'module_complex_args': {}, 'module_args': u"echo '/dev/sdb' | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'"}, 'stdout_lines': [], 'failed_when_result': False, u'stderr': u''}, '/dev/sdb']) => {"changed": true, "cmd": ["ceph-disk", "prepare", "--dmcrypt", "--cluster", "master", "/dev/sdb"], "delta": "0:00:00.162401", "end": "2016-11-04 11:01:41.703409", "item": [{"changed": false, "cmd": "parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.094552", "end": "2016-11-04 11:01:40.252042", "failed": false, "failed_when_result": false, "invocation": {"module_args": "parted --script /dev/sdb print | egrep -sq '^ 1.*ceph'", "module_complex_args": {}, "module_name": "shell"}, "item": "/dev/sdb", "rc": 1, "start": "2016-11-04 11:01:40.157490", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}, {"changed": false, "cmd": "echo '/dev/sdb' | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", "delta": "0:00:00.005391", "end": "2016-11-04 11:01:38.796395", "failed": false, "failed_when_result": false, "invocation": {"module_args": "echo '/dev/sdb' | egrep '/dev/([hsv]d[a-z]{1,2}|cciss/c[0-9]d[0-9]p|nvme[0-9]n[0-9]p)[0-9]{1,2}$'", "module_complex_args": {}, "module_name": "shell"}, "item": "/dev/sdb", "rc": 1, "start": "2016-11-04 11:01:38.791004", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}, "/dev/sdb"], "rc": 1, "start": "2016-11-04 11:01:41.541008", "warnings": []}
stderr: ceph-disk: Error: Device is mounted: /dev/sdb3
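
For reference, the underlying command the failing task runs (taken verbatim from the "cmd" field in the task output above) can be re-run by hand to reproduce the failure outside of Ansible; the cluster name "master" is specific to this setup:

    # Manual reproduction of the prepare step that the playbook runs.
    ceph-disk prepare --dmcrypt --cluster master /dev/sdb
    # Fails with: ceph-disk: Error: Device is mounted: /dev/sdb3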


Expected results:


Additional info:
The complete ansible-playbook log and the group_vars files have been copied to the ubuntu user's home directory on magna111.ceph.redhat.com (/home/ubuntu/ansible_log and /home/ubuntu/group_vars).
Comment 4 seb 2016-11-04 10:48:13 EDT
I think the real issue here is that ceph-disk with dmcrypt doesn't support storing keys with a cluster name different than "ceph".

While trying to activate an OSD manually, I noticed the keys couldn't get stored.

Alfredo? Am I right?
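
One way to check whether the dm-crypt keys were actually stored, assuming the Jewel-style key store on the monitors (an assumption here; this bug does not state where the keys live), is to list the cluster's config-keys under the custom name:

    # Look for dm-crypt key entries on the custom-named cluster.
    # The cluster name "master" and the grep pattern are assumptions for illustration.
    ceph --cluster master config-key list | grep dm-crypt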
Comment 5 seb 2016-11-04 11:06:11 EDT
Patch proposed upstream: https://github.com/ceph/ceph/pull/11786
Comment 10 seb 2016-11-08 06:23:36 EST
LGTM Bara!
Comment 15 seb 2017-01-10 18:47:06 EST
No, we cannot test it; we are still waiting for https://github.com/ceph/ceph/pull/11786 to be merged into Ceph.
Comment 17 Andrew Schoen 2017-01-30 12:59:04 EST
Moving to 2.3 to give more time for https://github.com/ceph/ceph/pull/11786 to be merged and tested.
Comment 19 Federico Lucifredi 2017-02-21 19:52:31 EST
If this can be merged this week, we will test it — Gregory will have an update for us in the program call.
Comment 22 Federico Lucifredi 2017-02-22 11:29:35 EST
Good customer insight from Gregory: the issue is in ceph-disk with encrypted OSDs, so it is highly unlikely that there are clusters out there with encrypted OSDs that are not named 'ceph'.

This should not block upgrades. Pushing the fix to 2.3 so we have time to sort out the ceph-disk fix that is churning upstream right now.
Comment 23 Harish NV Rao 2017-03-30 15:58:52 EDT
(In reply to Andrew Schoen from comment #17)
> Moving to 2.3 to give more time for https://github.com/ceph/ceph/pull/11786
> to be merged and tested.

@Seb, will this be fixed in 2.3? If not, could you please move it out of 2.3?
Comment 24 seb 2017-04-04 05:32:11 EDT
It depends. Ken, is https://github.com/ceph/ceph/pull/13573 part of 2.3?
However, this is still not in Jewel upstream, so we don't test it in our CI.
Comment 25 Ken Dreyer (Red Hat) 2017-04-04 13:42:19 EDT
Seb, the jewel backport PR 13496 lacks approval from Loic and a clean Teuthology run, so it will not be in the v10.2.7 upstream release.

Once v10.2.7 is tagged upstream, I'll rebase the internal ceph-2-rhel-patches branch to that, and then we'll need to cherry-pick a fix internally for this BZ.

If we do not yet have a stable fix for jewel that we can ship to customers with a high level of confidence, we'll need to re-target this BZ to a future RH Ceph Storage release.

How would you like to proceed on this?
Comment 26 seb 2017-04-05 05:52:17 EDT
Let's postpone this for a future release once we have the right backport for Jewel upstream.

I guess this means RHCS 3.0, right?
Comment 27 Ken Dreyer (Red Hat) 2017-04-05 13:39:55 EDT
Thanks, re-targeted
Comment 28 tserlin 2017-05-23 14:07:02 EDT
*** Bug 1452316 has been marked as a duplicate of this bug. ***
Comment 29 tserlin 2017-05-23 14:09:07 EDT
*** Bug 1451168 has been marked as a duplicate of this bug. ***
Comment 30 tserlin 2017-05-25 13:54:20 EDT
Just a clarification: PR 13496 was closed in favor of https://github.com/ceph/ceph/pull/14765
Comment 35 Warren 2017-05-26 20:53:07 EDT
On my test systems:

group_vars/all.yml has the following field set:

cluster: aard

group_vars/osds.yml has the following fields set:

devices:
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd

dmcrypt_journal_collocation: true

The ansible-playbook command finished with no errors.

Running the following command:
sudo docker exec ceph-mon-magna045 ceph --cluster aard -s

Shows:
    cluster b946af73-d4ca-4c60-b261-7cbe2c6ac104
     health HEALTH_WARN
            clock skew detected on mon.magna055, mon.magna060
            Monitor clock skew detected 
     monmap e2: 3 mons at {magna045=10.8.128.45:6789/0,magna055=10.8.128.55:6789/0,magna060=10.8.128.60:6789/0}
            election epoch 6, quorum 0,1,2 magna045,magna055,magna060
     osdmap e18: 9 osds: 9 up, 9 in
            flags sortbitwise,require_jewel_osds
      pgmap v44: 128 pgs, 1 pools, 0 bytes data, 0 objects
            302 MB used, 8378 GB / 8378 GB avail
                 128 active+clean

So the cluster is named aard and dmcrypt_journal_collocation is set.

I do not see the ceph-disk error that was reported.
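
As an additional spot check on an OSD node, the data and journal partitions should show up as "crypt" devices once dm-crypt is in use; the device names below are assumptions based on the group_vars settings above:

    # On an OSD host: encrypted partitions appear with TYPE "crypt".
    lsblk -o NAME,TYPE,MOUNTPOINT /dev/sdb /dev/sdc /dev/sdd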
Comment 37 errata-xmlrpc 2017-06-19 09:27:29 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1497
