Created attachment 1223732 [details]
storage environment

Description of problem:
In a setup with 3 storage nodes, each with 15 OSDs, all full flash with a pre-partitioned NVMe disk as journal, on every deployment the OSD disks are not prepared and activated in the Ceph cluster.

How reproducible:
Every deployment.

Steps to Reproduce:
1. Deploy the overcloud with the heat environment files:
   -e /home/stack/templates/storage-environment.yaml (with wipe_disk.yaml included)

Actual results:
Ceph status after a "successful" deployment:

[root@overcloud-cephstorage-0 heat-admin]# ceph -s
    cluster 16eb2f62-ac3c-11e6-807b-001e67e2527d
     health HEALTH_ERR
            664 pgs stuck inactive
            664 pgs stuck unclean
            no osds
     monmap e1: 3 mons at {overcloud-controller-0=192.168.1.18:6789/0,overcloud-controller-1=192.168.1.14:6789/0,overcloud-controller-2=192.168.1.15:6789/0}
            election epoch 6, quorum 0,1,2 overcloud-controller-1,overcloud-controller-2,overcloud-controller-0
     osdmap e4: 0 osds: 0 up, 0 in
      pgmap v5: 664 pgs, 4 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                 664 creating

Drive listing after deployment:

/dev/nvme0n1 :
 /dev/nvme0n1p1 ceph journal
 /dev/nvme0n1p2 ceph journal
 /dev/nvme0n1p3 ceph journal
 /dev/nvme0n1p4 ceph journal
 /dev/nvme0n1p5 ceph journal
/dev/sda other, unknown
/dev/sdb other, unknown
/dev/sdc other, unknown
/dev/sdd other, unknown
/dev/sde other, unknown
/dev/sdf other, unknown
 /dev/sdf1 other, iso9660
 /dev/sdf2 other, ext4, mounted on /

Expected results:
I expect Ceph to prepare the disks for the cluster and activate all OSDs; instead I have only the partitioned journal disk.

Additional info:
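For reference, a deploy invocation of the shape described in the steps above would look roughly like the sketch below; the `--templates` flag and the exact way wipe_disk.yaml is passed are assumptions, since the original only names the storage environment file:

```shell
# Hypothetical sketch of the deployment command from the reproduction steps.
# Paths and the separate -e for wipe_disk.yaml are assumptions; the original
# report only says wipe_disk.yaml was "included".
openstack overcloud deploy --templates \
  -e /home/stack/templates/storage-environment.yaml \
  -e /home/stack/templates/wipe_disk.yaml
```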
what's the partition status of the not-configured drives?
(In reply to seb from comment #1)
> what's the partition status of the not-configured drives?

/dev/sda other, unknown
/dev/sdb other, unknown
/dev/sdc other, unknown
/dev/sdd other, unknown
/dev/sde other, unknown
/dev/sdf other, unknown

The drives are not prepared for the Ceph cluster; I need to manually run:
ceph-disk prepare .......
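For one data disk, the manual workaround mentioned above would look roughly like the sketch below. The device pairing (`/dev/sda` with journal partition `/dev/nvme0n1p1`) is a hypothetical example, not taken from the report:

```shell
# Hypothetical example: prepare /dev/sda as an OSD using one of the
# pre-created NVMe journal partitions, then activate the data partition.
# Device names are illustrative only.
ceph-disk prepare --cluster ceph /dev/sda /dev/nvme0n1p1
ceph-disk activate /dev/sda1
```

This would have to be repeated for each unprepared data disk, pairing it with its own journal partition.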
It looks like the OSDs have been skipped. Is it possible to get some debug logs from the Puppet execution? Otherwise this is going to be difficult to debug...
Hi, thank you for the answer, but the setup was deployed over a month ago and none of the Puppet log files have been preserved.
If there is any way you can reproduce this, let me know. Is it still an issue? Have you had successful deployments since then? If so, please close this bug :).
Hello,
We are starting to deploy a new stack; please let me know exactly which logs you would like to have.
Regards,
M.Rembas
I don't know yet, but are you still experiencing the same problem on a new deployment?
Yes, still the same issue with the OSDs. The fresh installation looks like:

[root@overcloud-cephstorage-0 heat-admin]# ceph -s
    cluster 9f526dec-f517-11e6-8c38-001e67e2527d
     health HEALTH_ERR
            664 pgs stuck inactive
            664 pgs stuck unclean
            no osds
     monmap e1: 3 mons at {overcloud-controller-0=192.168.1.19:6789/0,overcloud-controller-1=192.168.1.14:6789/0,overcloud-controller-2=192.168.1.17:6789/0}
            election epoch 6, quorum 0,1,2 overcloud-controller-1,overcloud-controller-2,overcloud-controller-0
     osdmap e4: 0 osds: 0 up, 0 in
      pgmap v5: 664 pgs, 4 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                 664 creating
Created attachment 1252704 [details]
dump_of_os-collect-config

Logs from the Ceph node after a fresh deployment.
(In reply to M.Rembas from comment #9)
> Created attachment 1252704 [details]
> dump_of_os-collect-config
>
> logs from ceph node after fresh deployment

Can you attach the custom env file where you specify the ceph::profile::params::osds list, and report the version of puppet-ceph installed on the overcloud nodes?
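For context, the ceph::profile::params::osds list requested here is typically set in the custom storage environment file. The fragment below is a sketch of what such a hieradata override commonly looks like; the parameter nesting and the device names are assumptions, not taken from this report:

```yaml
# Hypothetical fragment of a storage environment file; device names
# are illustrative examples only.
parameter_defaults:
  ExtraConfig:
    ceph::profile::params::osds:
      '/dev/sda':
        journal: '/dev/nvme0n1p1'
      '/dev/sdb':
        journal: '/dev/nvme0n1p2'
```

The installed puppet-ceph version on an overcloud node can be checked with `rpm -q puppet-ceph`.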
We did not receive the needinfo requested on 2017-07-19, so I am going to go ahead and close this bug. Feel free to re-open it if you have the requested information.