Bug 1510470 - Containerized OSDs don't start - fail to find the Journal device
Summary: Containerized OSDs don't start - fail to find the Journal device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 3.0
Assignee: Guillaume Abrioux
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-07 13:57 UTC by Daniel Messer
Modified: 2018-02-12 12:40 UTC
CC List: 13 users

Fixed In Version: RHEL: ceph-ansible-3.0.10-2.el7cp Ubuntu: ceph-ansible_3.0.10-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-05 23:49:35 UTC
Embargoed:


Attachments
all.yml (16.44 KB, text/x-vhdl), 2017-11-07 13:57 UTC, Daniel Messer
osds.yml (8.44 KB, text/x-vhdl), 2017-11-07 13:59 UTC, Daniel Messer


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible pull 2152 0 None None None 2017-11-09 10:13:51 UTC
Red Hat Product Errata RHBA-2017:3387 0 normal SHIPPED_LIVE Red Hat Ceph Storage 3.0 bug fix and enhancement update 2017-12-06 03:03:45 UTC

Description Daniel Messer 2017-11-07 13:57:02 UTC
Created attachment 1348966 [details]
all.yml

Description of problem:

With ceph-ansible, a containerized setup does not produce a functional cluster. In a non-collocated FileStore scenario the installation playbook completes successfully, but the OSD containers restart constantly.

Version-Release number of selected component (if applicable):

ceph-ansible-3.0.9-1.el7cp
ceph-3.0-rhel-7-docker-candidate-61072-20171104225422


How reproducible:

Steps to Reproduce:
1. Set up ceph-ansible according to https://doc-stage.usersys.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/container_guide/#additional_resources_7
2. Choose non-collocated as the osd_scenario and populate devices and dedicated_devices with the OSD and journal devices (see the configuration sketch after these steps)
3. Run the installation playbook site-docker.yml
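
For reference, a minimal group_vars/osds.yml sketch of a non-collocated FileStore layout (the device names here are illustrative, not copied from the attached files):

osd_scenario: non-collocated
osd_objectstore: filestore
devices:
  - /dev/xvdb
  - /dev/xvdc
dedicated_devices:
  - /dev/xvdj
  - /dev/xvdj

With non-collocated, dedicated_devices lists one journal entry per OSD device; the same journal disk can be repeated so that ceph-ansible carves out one journal partition per OSD on it.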

Actual results:

The cluster is deployed, but the OSD containers keep restarting. The log output of those containers shows them waiting on a device with a PARTUUID that does not exist:

Waiting for /dev/disk/by-partuuid/f98ac6cf-bcf8-4276-995a-7cdb7e0ae5d0 to show up

The blkid output on this host shows the correct partitioning, but the PARTUUIDs are different:

blkid
/dev/xvda1: PARTUUID="3c387322-88aa-42ba-8c46-ddb0e76f1054"
/dev/xvda2: UUID="de4def96-ff72-4eb9-ad5e-0847257d1866" TYPE="xfs" PARTUUID="a34cf35b-104d-49b0-ae11-f664a286af07"
/dev/xvdg1: UUID="0c7b0218-be01-452f-bc53-e4c0a1599f6c" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="8c871e7e-2b97-4829-9e67-a27fb0e3c208"
/dev/xvdf1: UUID="5653689a-b654-4308-b3e7-d2400bad1054" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="6fa2c6a0-4d04-4d13-bdfe-8c9f3ade661e"
/dev/xvdi1: UUID="68f241e9-5054-4b63-af01-37e491c81eff" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="d650c775-c2be-421d-b69a-37cb69cfdbe2"
/dev/xvdd1: UUID="8e2e99fa-5678-4b86-ac76-e1fb2afb8124" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="23641025-a6f9-4515-aa83-47e7cc387d7b"
/dev/xvdc1: UUID="874dd414-73cd-484f-ab55-fd63a3b1425e" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="ef474883-14df-420e-b09d-20d3760de8a9"
/dev/xvdb1: UUID="7ae5ba65-3805-4b0b-ba46-b0402c38bd29" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="1c1947d6-0eb8-4f8e-9561-bee41df7358b"
/dev/xvdh1: UUID="7840d2c1-c6a5-4c1c-a6c6-b84246304786" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="9c8ea86d-6c1f-48e3-84f0-c80d3de7e577"
/dev/xvde1: UUID="a86090ca-c60a-405a-a851-b379aa5093ce" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="ba3a233b-e6bb-442e-b0d7-79f2d7450dbc"
/dev/xvdj1: PARTLABEL="ceph journal" PARTUUID="029b1cc0-f5e0-47a1-bc5e-1c94b8661a8f"
/dev/xvdj2: PARTLABEL="ceph journal" PARTUUID="40b70e9b-b774-46c5-89fd-d800c50cef94"
/dev/xvdj3: PARTLABEL="ceph journal" PARTUUID="9f701674-6c62-461d-85fa-93e94b68b094"
/dev/xvdj4: PARTLABEL="ceph journal" PARTUUID="17ff2492-4475-4fe8-b529-55abdde206ec"
/dev/xvdj5: PARTLABEL="ceph journal" PARTUUID="351839f8-e70f-4cac-a752-a071b2f36db2"
/dev/xvdj6: PARTLABEL="ceph journal" PARTUUID="7a8d8a3a-7507-46e5-ad58-c9c6d8fb1fe2"
/dev/xvdj7: PARTLABEL="ceph journal" PARTUUID="90d14afd-164b-4861-86f6-b7208b84f3e9"
/dev/xvdj8: PARTLABEL="ceph journal" PARTUUID="9fcddc08-2cce-4ffb-8c63-d5d1a2427fab"

The PARTUUID the container refers to comes from its OSD_JOURNAL environment variable. That ID does not exist on any node or device in the entire cluster.
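
To see the mismatch directly, the OSD_JOURNAL value can be compared against the partitions that actually exist (a sketch; substitute the OSD container name shown by "docker ps"):

docker inspect ceph-osd-<host>-<device> | grep OSD_JOURNAL
ls -l /dev/disk/by-partuuid/

The first command prints the journal path the container was started with; the second lists the PARTUUIDs that really exist on the host, none of which match here.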

Expected results:

The installation finishes and the OSD containers are up and running. OSD_JOURNAL points either to an existing block device or to a symbolic link below /dev/disk/by-partuuid/ that resolves to an existing device.

Additional info:

- current RHEL 7.4
- current nightly build of RHCS 3.0 beta
- ceph-ansible from nightly builds
- group_vars/all.yml attached
- group_vars/osds.yml attached

Comment 3 Daniel Messer 2017-11-07 13:59:31 UTC
Created attachment 1348967 [details]
osds.yml

Comment 4 Daniel Messer 2017-11-07 14:00:41 UTC
I should add that this environment went through several install attempts before, each cleaned with purge-site-docker.yml. The PARTUUID that the OSDs look for, however, stays the same. The fetch directory is cleaned between runs.
The problem persists even when switching to a collocated setup.
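
A typical purge-and-redeploy cycle, for context (a sketch; exact playbook names, inventory, and paths depend on the ceph-ansible installation and the configured fetch_directory):

cd /usr/share/ceph-ansible
ansible-playbook infrastructure-playbooks/purge-docker-cluster.yml -i /etc/ansible/hosts
rm -rf fetch/*        # clean the fetch directory between runs
ansible-playbook site-docker.yml -i /etc/ansible/hosts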

Comment 5 Sébastien Han 2017-11-08 04:43:41 UTC
This is not something we see in our CI.

Can we access this env?
Thanks!

Comment 6 Guillaume Abrioux 2017-11-08 09:59:33 UTC
Could you provide the full playbook run log?

Thanks

Comment 7 Guillaume Abrioux 2017-11-08 12:12:14 UTC
I tried to reproduce your issue with ceph-ansible v3.0.9 and the ceph-3.0-rhel-7-docker-candidate-61072-20171104225422 container image, and the deployment worked fine.

OSDs are UP:

[root@osd0 ~]# docker ps -a
CONTAINER ID        IMAGE                                                                                                               COMMAND             CREATED             STATUS                      PORTS               NAMES
ff6266f26745        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Up 28 minutes                                   ceph-osd-osd0-sdb
cea920b57eca        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Up 28 minutes                                   ceph-osd-osd0-sda
299226e51fd4        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Exited (0) 28 minutes ago                       ceph-osd-prepare-osd0-sdb
0a103838a516        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Exited (0) 28 minutes ago                       ceph-osd-prepare-osd0-sda
[root@osd0 ~]#


[root@mon0 ~]# docker exec -ti ceph-mon-mon0 ceph -s
  cluster:
    id:     915ba53a-1288-4062-aa6d-45b5db0019b2
    health: HEALTH_WARN
            too few PGs per OSD (8 < min 30)

  services:
    mon: 3 daemons, quorum mon0,mon1,mon2
    mgr: mon0(active)
    mds: cephfs-1/1/1 up  {0=mds0=up:active}
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   2 pools, 16 pgs
    objects: 21 objects, 2246 bytes
    usage:   214 MB used, 102133 MB / 102347 MB avail
    pgs:     16 active+clean

[root@mon0 ~]#


I think your multiple deployment attempts have probably broken something.
I couldn't reproduce your issue, and neither CI nor QE caught it. Could you retry the deployment from scratch? As Sébastien asked, is there any chance we can access your environment?

Thanks!

Comment 8 Daniel Messer 2017-11-08 12:38:18 UTC
@Guillaume - this might well be the case. I will send you and leseb the credentials for the environment; it's AWS-based. I could retry the deployment from scratch too, but honestly I don't see what could cause this. I suggest we work in parallel: I will try to re-deploy from scratch, and you can try to re-deploy in my environment to see where it's choking up. This behavior will likely affect others who run into https://bugzilla.redhat.com/show_bug.cgi?id=1510555 - which is the reason I had to re-deploy so many times.

Comment 9 Guillaume Abrioux 2017-11-09 10:13:52 UTC
Hi Daniel,

the issue here is in the purge-docker-cluster.yml playbook.
You tried several times to deploy your cluster. On the first attempt, the OSD disk prepare process produced logs that ceph-ansible later uses to retrieve the journal partition UUID. These logs are only supposed to be generated [1] at the initial deployment because they come from the prepare containers' logs; if we lose those containers for any reason (a reboot or anything else), we cannot generate the logs again.

upstream PR: https://github.com/ceph/ceph-ansible/pull/2152

[1] https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/templates/ceph-osd-run.sh.j2#L17-L35
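
For context, a simplified sketch of the lookup the linked template performs (not the exact template code; the container name and log pattern are assumptions based on typical ceph-disk prepare output):

# The prepare container logged which journal partition it created for the OSD;
# the generated run script greps that log to build OSD_JOURNAL.
docker logs ceph-osd-prepare-<host>-<device> 2>&1 | \
  grep -Eo '/dev/disk/by-partuuid/[0-9a-f-]{36}' | head -1
# If the prepare container has been removed (purge, reboot, etc.), this lookup
# finds nothing and the OSD can end up waiting on a stale or wrong PARTUUID.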

Comment 15 Vasishta 2017-11-15 17:59:01 UTC
Hi Daniel, 

Tried setting up OSDs with dedicated journals (both dmcrypt and non-dmcrypt) using the latest builds:

Container image - ceph-3.0-rhel-7-docker-candidate-36461-20171114235412
Ceph-ansible - ceph-ansible-3.0.11-1.el7cp.noarch

(Because of a test environment constraint, only two journals (for 2 OSDs) were on a dedicated disk.)

Initialization and purging were tried three times back to back on the same set of nodes, with a node reboot after initializing the cluster each time. Each time (both after initialization and after the reboot) the OSDs came up and stayed running as expected, so it looks good to me.
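
Roughly, each iteration looked like the following (commands are illustrative, not the exact test steps):

ansible-playbook site-docker.yml -i <inventory>                                      # deploy
docker ps | grep ceph-osd                                                            # verify OSD containers are Up
reboot                                                                               # reboot the OSD node, then re-check the containers
ansible-playbook infrastructure-playbooks/purge-docker-cluster.yml -i <inventory>    # purge before the next attempt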

Can you please let me know your views on the steps followed as part of the bug fix verification?

Regards,
Vasishta

Comment 16 Vasishta 2017-11-17 13:53:19 UTC
Hi,

I'm moving the BZ to VERIFIED as per the suggestions I received based on Comment 15.

Please feel free to let me know if there are any concerns.

Regards,
Vasishta

Comment 19 errata-xmlrpc 2017-12-05 23:49:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387

