Bug 1269329 - [Director] [doc] Colocated Ceph journal partitions are not created when empty journal location is given in hiera -- regression
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: 7.0 (Kilo)
Assignee: RHOS Documentation Team
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-10-07 02:34 UTC by jliberma@redhat.com
Modified: 2017-05-12 03:01 UTC
CC List: 12 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-16 17:29:15 UTC
Target Upstream Version:
Embargoed:



Description jliberma@redhat.com 2015-10-07 02:34:59 UTC
Description of problem:
When adding additional OSD disks but specifying the default journal location, the Ceph journal partitions are not created and linked. This worked prior to the y1 update candidates.

Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.10-7.el7ost.noarch

How reproducible:
I have reproduced it twice.

Steps to Reproduce:
1. Specify additional Ceph OSDs in puppet/hieradata/ceph.yaml with default journal locations:

ceph::profile::params::osds: 
  '/dev/sdb':
    journal: {}
  '/dev/sdc':
    journal: {}
  '/dev/sdd':
    journal: {}
  '/dev/sde':
    journal: {}
  '/dev/sdf':
    journal: {}
  '/dev/sdg':
    journal: {}
  '/dev/sdh':
    journal: {}
  '/dev/sdi':
    journal: {}
  '/dev/sdj':
    journal: {}
  '/dev/sdk':
    journal: {}

2. Deploy the overcloud with Ceph, specifying the storage-environment.yaml location and a custom templates directory:

openstack overcloud deploy -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/network-environment.yaml --control-flavor control --compute-flavor compute --ceph-storage-flavor ceph --ntp-server 10.16.255.2 --control-scale 3 --compute-scale 4 --ceph-storage-scale 4 --block-storage-scale 0 --swift-storage-scale 0 -t 90 --templates /home/stack/templates/openstack-tripleo-heat-templates/ -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml


3. SSH to a Ceph storage node and check whether the journal partitions are created and linked.
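
A quick check on the node might look like the following sketch (the ls command is the one used for the results below; ceph-disk list is an optional extra way to inspect the partition layout that ceph-disk prepare laid down):

# ls -al /var/lib/ceph/osd/ceph-*/journal
# ceph-disk list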


Actual results:

Journal partitions are not created or linked:
[root@overcloud-cephstorage-0 ~]# ls -al /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx. 1 root root 2 Oct  6 18:39 /var/lib/ceph/osd/ceph-12/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:39 /var/lib/ceph/osd/ceph-17/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:39 /var/lib/ceph/osd/ceph-20/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:39 /var/lib/ceph/osd/ceph-25/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:40 /var/lib/ceph/osd/ceph-27/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:40 /var/lib/ceph/osd/ceph-31/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:40 /var/lib/ceph/osd/ceph-35/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:40 /var/lib/ceph/osd/ceph-38/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:40 /var/lib/ceph/osd/ceph-39/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  6 18:39 /var/lib/ceph/osd/ceph-8/journal -> {}

Expected results:

Journal partitions are created and linked (output from notes taken with the previous version of OSP director, using the same ceph.yaml):

[root@overcloud-cephstorage-1 ~]# ls -al /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx. 1 root root 58 Aug 10 17:00 /var/lib/ceph/osd/ceph-10/journal -> /dev/disk/by-partuuid/e67ff5d5-f3bd-4670-ba2e-497ce9b1c057
lrwxrwxrwx. 1 root root 58 Aug 10 17:00 /var/lib/ceph/osd/ceph-12/journal -> /dev/disk/by-partuuid/af367eb7-ecf6-4298-bd54-eaebfe3c02ac

Additional info:

[root@overcloud-cephstorage-0 ~]# ceph -s
    cluster db419788-6c74-11e5-bac8-90b11c56332a
     health HEALTH_WARN
            too few PGs per OSD (19 < min 30)
     monmap e1: 3 mons at {overcloud-controller-0=172.16.2.110:6789/0,overcloud-controller-1=172.16.2.105:6789/0,overcloud-controller-2=172.16.2.111:6789/0}
            election epoch 6, quorum 0,1,2 overcloud-controller-1,overcloud-controller-0,overcloud-controller-2
     osdmap e77: 40 osds: 40 up, 40 in
      pgmap v120: 256 pgs, 4 pools, 0 bytes data, 0 objects
            201 GB used, 37020 GB / 37221 GB avail
                 256 active+clean


os-collect-config log entries from ceph storage server:

Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: + test -b /dev/sdb
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: + ceph-disk prepare /dev/sdb '{}'
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: The operation has completed successfully.
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: partx: /dev/sdb: error adding partition 1
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: meta-data=/dev/sdb1              isize=2048   agcount=4, agsize=61013951 blks
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns:          =                       sectsz=512   attr=2, projid32bit=1
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns:          =                       crc=0        finobt=0
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: data     =                       bsize=4096   blocks=244055803, imaxpct=25
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns:          =                       sunit=0      swidth=0 blks
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: log      =internal log           bsize=4096   blocks=119167, version=2
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: realtime =none                   extsz=4096   blocks=0, rtextents=0
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: The operation has completed successfully.
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: partx: /dev/sdb: error adding partition 1
Oct 06 18:40:36 overcloud-cephstorage-0.localdomain os-collect-config[5571]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdb]/Exec[ceph-osd-prepare-/dev/sdb]/returns: executed successfully

Comment 2 Giulio Fidente 2015-10-07 12:10:06 UTC
Hi, if I understand correctly, you did get all the HDDs provisioned as OSDs, but the journal files are pointing to {} rather than the actual partition on the HDD?

There could be an issue with parsing the data; can you try with:

ceph::profile::params::osds: 
  /dev/sdb: {}
  /dev/sdc: {}
  /dev/sdd: {}

By omitting the journal definition (rather than passing an empty hash), the journal is assumed to be on the same HDD and the partitions should be created as well.

Comment 3 rlopez 2015-10-07 14:34:53 UTC
(In reply to Giulio Fidente from comment #2)
> hi, if I understand correctly you did get all the HDDs provisioned as OSDs
> but the journal files are pointing to {} rather than the actual partition on
> the HDD?
> 
> there could be an issue with parsing the data, can you try with:
> 
> ceph::profile::params::osds: 
>   /dev/sdb: {}
>   /dev/sdc: {}
>   /dev/sdd: {}
> 
> by omitting the definition of journal (rather than passing empty hash) it is
> assumed to be on the same HDD and the partitions should be created as well.

Hello Giulio,

I'm working with Jacob on this environment. I just redeployed with your recommendation but am seeing the exact same results.

[heat-admin@overcloud-cephstorage-0 ~]$ ls -al /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-12/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-16/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-20/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-24/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-29/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:06 /var/lib/ceph/osd/ceph-32/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:06 /var/lib/ceph/osd/ceph-35/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:06 /var/lib/ceph/osd/ceph-38/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:06 /var/lib/ceph/osd/ceph-39/journal -> {}
lrwxrwxrwx. 1 root root 2 Oct  7 10:05 /var/lib/ceph/osd/ceph-8/journal -> {}

Comment 4 jliberma@redhat.com 2015-10-07 15:22:29 UTC
Giulio,

Is the alternative to create the journal and OSD partitions explicitly prior to installation, and then reference them in the ceph.yaml?

The syntax in the bug description worked with GA/0-day OSPd.

Leaving the journal device string empty in hiera resulted in journal partitions created on the OSD disks automatically.

Should we consider this a regression?

Thanks, Jacob

Comment 8 rlopez 2015-10-07 17:43:41 UTC
The recommendation does work. I had accidentally placed the modified ceph.yaml in the wrong location. Once I placed it in /home/stack/templates/openstack-tripleo-heat-templates/puppet/hieradata, it did work.

Removing the journal line with the brackets {} did the trick. 

Snippet of the ceph.yaml file that works correctly:

ceph::profile::params::osds: 
  '/dev/sdb': {}
  '/dev/sdc': {}
  '/dev/sdd': {}
  '/dev/sde': {}
  '/dev/sdf': {}
  '/dev/sdg': {}
  '/dev/sdh': {}
  '/dev/sdi': {}
  '/dev/sdj': {}
  '/dev/sdk': {}


Output of it working:

# ls -al /var/lib/ceph/osd/ceph-*/journal
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-13/journal -> /dev/disk/by-partuuid/2fe9adb7-f547-497d-a043-a1fbbbfbacc0
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-17/journal -> /dev/disk/by-partuuid/e5a5cf23-51d1-4a39-9f96-45cf439e9657
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-1/journal -> /dev/disk/by-partuuid/df5f2b10-547c-4ed7-adbd-0f19a7e21d90
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-21/journal -> /dev/disk/by-partuuid/6a04c461-b617-49fd-bb14-c731b1206b34
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-25/journal -> /dev/disk/by-partuuid/4b5bc37a-e00b-4c00-acd6-cd0a7119f231
lrwxrwxrwx. 1 root root 58 Oct  7 13:28 /var/lib/ceph/osd/ceph-29/journal -> /dev/disk/by-partuuid/4cd1c940-9eb2-467d-853e-b240efc3192b
lrwxrwxrwx. 1 root root 58 Oct  7 13:28 /var/lib/ceph/osd/ceph-33/journal -> /dev/disk/by-partuuid/c7151ac9-6553-47cc-82e3-85196460dda1
lrwxrwxrwx. 1 root root 58 Oct  7 13:28 /var/lib/ceph/osd/ceph-37/journal -> /dev/disk/by-partuuid/3b4ace81-3ce3-449b-a052-13de3eda74a1
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-5/journal -> /dev/disk/by-partuuid/a133526c-acfd-4245-b17f-6bb15066c4ef
lrwxrwxrwx. 1 root root 58 Oct  7 13:27 /var/lib/ceph/osd/ceph-9/journal -> /dev/disk/by-partuuid/4e2bd71e-cb57-4117-808d-33865574885b

Comment 9 Giulio Fidente 2015-10-09 10:26:03 UTC
I'd be inclined to reassign this as a Doc bug. We shouldn't pass 'journal:' and point it to an empty string or an empty hash when there is no external journal disk.

That might have worked, but I don't think it was ever intended to be supported. In my experience, setting a param to an empty string means we want Puppet to set it to an empty string in the target config file; at least the puppet-neutron and puppet-nova modules behave that way.

What do you think? It seems to me that reassigning to Doc, so that we remove any reference to 'journal: ""' or 'journal: {}', is the best option.

Comment 10 jliberma@redhat.com 2015-10-09 14:32:18 UTC
Giulio, we tested and the syntax you recommended works. Some of the options in the ceph.yaml are still ignored, but the journals are created successfully.

I will remove references to the previous syntax, which did work in the prior version of OSP director, but will not work going forward. 

This procedure is not documented in the product docs yet, so I don't think it's necessary to make this a doc bug. We can close it as not a bug. Thanks!

Comment 11 chris alfonso 2015-10-09 16:18:20 UTC
Jacob, did you open a separate docs bug for this? If so, please reference it here and I'll close this bug. Otherwise, we can move this bug to the 'documentation' component, prefixing the title with [Director].

Comment 17 jliberma@redhat.com 2015-11-18 15:24:12 UTC
Dan, please see: https://bugzilla.redhat.com/show_bug.cgi?id=1269329#c2

This is the new default journal location syntax for Ceph with OSPd 7.1.

This update would go into section 6.3.6 of the product docs:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Advanced-Scenario_3_Using_the_CLI_to_Create_an_Advanced_Overcloud_with_Ceph_Nodes.html

Change:

ceph::profile::params::osds:
   '/dev/sdb':
       journal: {}
   '/dev/sdc':
       journal: {}

to

ceph::profile::params::osds: 
 '/dev/sdb': {}
 '/dev/sdc': {}
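
(For the separate-journal-disk scenario, each OSD entry would presumably still point at an explicit journal device, along the lines of the sketch below; the device name /dev/sdl is only illustrative and is not part of this doc change.)

ceph::profile::params::osds:
  '/dev/sdb':
    journal: '/dev/sdl'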

Comment 18 Giulio Fidente 2015-12-10 00:21:53 UTC
The syntax in the docs for the scenario where there are no separate journal disks seems good to me. Dan, can we close this?

