Bug 1377639

Summary: [Ceph]: Ceph OSD create with the --dmcrypt option fails
Product: Red Hat Ceph Storage
Reporter: Tejas <tchandra>
Component: Documentation
Assignee: Bara Ancincova <bancinco>
Status: CLOSED CURRENTRELEASE
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: unspecified
Priority: unspecified
Version: 1.3.3
CC: adeza, flucifre, hnallurv, kdreyer, ldachary, tchandra, tserlin
Target Milestone: rc
Target Release: 1.3.3
Hardware: Unspecified
OS: Unspecified
Last Closed: 2016-09-30 17:21:22 UTC
Type: Bug
Attachments:
Description      Flags
log              none
Workaround log   none

Comment 5 Tejas 2016-09-20 10:26:04 UTC
Created attachment 1202814
log

Comment 6 Tejas 2016-09-20 10:28:08 UTC
I have attached a log.
The issue seems to be that an additional journal partition is passed as a parameter to the 'ceph-disk prepare' command, which is not needed.

This is the command as per the doc:
ceph-deploy osd create --dmcrypt <hostname>:<disk_device>:<data_partition>

Thanks,
Tejas

Comment 8 Loic Dachary 2016-09-20 12:17:00 UTC
This is a problem specific to the case where both the data and the journal are on the same disk. The documentation should state that this does not work and that the whole disk should be provided instead (because it does the same thing: it collocates the data and the journal on the same disk, each in its own partition).

Comment 9 Tejas 2016-09-21 07:27:23 UTC
Hi Loic,

    I had another query: isn't the "osd create" command supposed to do the OSD activation as well?
I don't see that happening.

And when I try to activate it manually, it fails; please check comment #7.
Could you please take a look at this?

Also, the doc has already been changed to reflect the earlier issue in comment #8.

https://access.qa.redhat.com/documentation/en/red-hat-ceph-storage/1.3/single/administration-guide/#adding_osds_by_using_literal_ceph_deploy_literal

Thanks,
Tejas

Comment 11 Federico Lucifredi 2016-09-21 16:22:15 UTC
Alfredo, looks like there is confusion about how to set up an encrypted volume. Could you provide the steps to QE for testing here?

Comment 12 Alfredo Deza 2016-09-21 17:13:55 UTC
(In reply to Tejas from comment #9)
> Hi Loic,
> 
>     I had another query, isnt "osd create" command supposed to do the osd
> activation also?
> I don't see that happening.

I think that what you want here is `ceph-deploy osd prepare` which I believe in this case will end up with activated disks (a reboot *might* be required).


Comment 13 Alfredo Deza 2016-09-21 17:23:43 UTC
It looks like Loic addressed the issue in comment #8. I was unaware that a journal could not be on the same device as the OSD for dmcrypt.

@tejas: could you try and verify that not using the journal on the same device makes this work? If that is the case, I think this is just a doc update/warning.

Comment 14 Tejas 2016-09-21 18:14:49 UTC
(In reply to Alfredo Deza from comment #12)
> (In reply to Tejas from comment #9)
> > Hi Loic,
> > 
> >     I had another query, isnt "osd create" command supposed to do the osd
> > activation also?
> > I don't see that happening.
> 
> I think that what you want here is `ceph-deploy osd prepare` which I believe
> in this case will end up with activated disks (a reboot *might* be required).
> 
That worked: 'ceph-deploy osd prepare' followed by a reboot of the OSD node successfully brought up the OSD. Thanks, Alfredo.

Comment 15 Tejas 2016-09-21 18:17:29 UTC
(In reply to Alfredo Deza from comment #13)
> It looks like Loic addressed the issue here on comment #8. I was unaware
> that a journal could not be in the same device as the OSD for dmcrypt.
> 
The journal can be on the same device as the data.
It is just that we should not pass the journal partition as a parameter to the "osd create" command, like this:
ceph-deploy osd create --dmcrypt <node>:/dev/sda:/dev/sda2

This will fail.


> @tejas: could you try and verify that not using the journal in the same
> device makes this work? If that is the case I think this is just a doc
> update/warning

Thanks,
Tejas

Comment 17 Tejas 2016-09-22 10:45:24 UTC
Created attachment 1203683
Workaround log

Comment 18 Loic Dachary 2016-09-22 18:47:10 UTC
> I dont understand is ceph-disk does a "cryptsetup remove" on the data partition.

This is because activation is done via udev, which calls luksOpen through the rules in udev/95-ceph-osd.rules:

# Map journal if using dm-crypt and luks
ACTION=="add" SUBSYSTEM=="block", \
  ENV{DEVTYPE}=="partition", \
  ENV{ID_PART_ENTRY_TYPE}=="45b0969e-9b03-4f30-b4c6-35865ceff106", \
  RUN+="/sbin/cryptsetup --key-file /etc/ceph/dmcrypt-keys/$env{ID_PART_ENTRY_UUID}.luks.key luksOpen /dev/$name $env{ID_PART_ENTRY_UUID}"

Your workaround looks good. A lot better than trying with ceph-osd --mkfs.
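
To make the udev rule quoted above more concrete, here is a rough sketch of the command line it would produce for a single partition. The function name and the sample values are made up for illustration; the two arguments stand in for the rule's $name and $env{ID_PART_ENTRY_UUID} substitutions, and the function only prints the command, it does not run it:

```shell
# Hypothetical helper, not part of ceph-disk or the udev rules.
# Prints the cryptsetup call that the rule would run for kernel device
# name NAME (for example sdb2) with partition UUID PART_UUID.
luks_open_cmd() {
    local name="$1" part_uuid="$2"
    echo "/sbin/cryptsetup --key-file /etc/ceph/dmcrypt-keys/${part_uuid}.luks.key luksOpen /dev/${name} ${part_uuid}"
}

# Example with made-up values:
luks_open_cmd sdb2 1234abcd-0000-0000-0000-000000000000
```

Because luksOpen takes the mapping name as its last argument, the opened journal would appear as /dev/mapper/<partition UUID>.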

Comment 21 Loic Dachary 2016-09-23 14:36:40 UTC
I agree

Comment 25 Loic Dachary 2016-09-26 11:33:19 UTC
Looks good to me.

Comment 27 Loic Dachary 2016-09-26 11:43:38 UTC
> The `<journal_partition>` parameter is optional. Do not specify the journal
> partition if the partition is on the same disk as the data stored by OSD.

This workaround is about the problem for which this issue was created, i.e. when someone writes

   ceph-disk prepare /dev/sdX /dev/sdX2

which is unnecessary because

   ceph-disk prepare /dev/sdX 

will do the same. Not only is it unnecessary, it does not work at all and must be avoided. This is probably a rare use case, as people who want to collocate the journal with the data will simply use the following form:

   ceph-disk prepare /dev/sdX 
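
As a minimal sketch of the rule above (this is not something ceph-disk does itself), a wrapper script could drop a journal argument that is merely a partition of the data disk before calling ceph-disk prepare. The function names are hypothetical, and the check is a name-based heuristic that assumes sdX-style device names, where a partition is the disk name followed by digits:

```shell
# Hypothetical helpers, not part of ceph-disk or ceph-deploy.

# Succeeds when JOURNAL is named like a partition of the whole disk DATA,
# e.g. DATA=/dev/sda and JOURNAL=/dev/sda2. Name-based heuristic only.
journal_on_same_disk() {
    local data="$1" journal="$2"
    local suffix
    suffix=${journal#"$data"}
    # If nothing was stripped, JOURNAL does not start with DATA.
    [ "$suffix" != "$journal" ] || return 1
    case $suffix in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric suffix: not a partition of DATA
        *) return 0 ;;
    esac
}

# Prints the ceph-disk prepare command line to use, leaving out a
# journal partition that sits on the data disk itself.
prepare_cmd() {
    local data="$1" journal="${2:-}"
    if [ -z "$journal" ] || journal_on_same_disk "$data" "$journal"; then
        echo "ceph-disk prepare $data"
    else
        echo "ceph-disk prepare $data $journal"
    fi
}
```

With these helpers, prepare_cmd /dev/sda /dev/sda2 prints the whole-disk form, while a journal on another disk, such as /dev/sdb1, is passed through unchanged.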

The other workaround above:

   ceph-deploy --dmcrypt prepare
   cryptsetup
   ceph-deploy --dmcrypt activate

is about a different problem. It deals with the fact that disk preparation in the case of encrypted devices is racy.

Does that answer your question?

Comment 30 Tejas 2016-09-27 11:36:09 UTC
The changes look good.
Moving the bug to verified.