Created attachment 1120790 [details] installation log Description of problem: The overcloud was successfully installed without 1 Ceph storage node installed properly. The overcloud required topology is: 1 controller 2 compute nodes 3 Ceph storage nodes, each with 3 hard drives, vdb, vdc ,vdd. The ceph.yaml file is configured: ceph::profile::params::osd_journal_size: 1024 ceph::profile::params::osd_pool_default_pg_num: 128 ceph::profile::params::osd_pool_default_pgp_num: 128 ceph::profile::params::osd_pool_default_size: 3 ceph::profile::params::osd_pool_default_min_size: 1 ceph::profile::params::osds: '/dev/vdb': journal: '' '/dev/vdc': journal: '' '/dev/vdd': journal: '' ceph::profile::params::manage_repo: false ceph::profile::params::authentication_type: cephx ceph_pools: - volumes - vms - images ceph_classes: [] ceph_osd_selinux_permissive: true The result is the controller is installed on 1 of the servers with the 4 hard drives, 2 Ceph storage nodes are installed properly and an additional Ceph storage node is set with no OSDs. * the installation runs on a virtual setup Version-Release number of selected component (if applicable): python-tripleoclient-0.0.11-5.el7ost.noarch openstack-tripleo-image-elements-0.9.7-1.el7ost.noarch openstack-tripleo-common-0.0.2-4.el7ost.noarch openstack-tripleo-puppet-elements-0.0.2-1.el7ost.noarch openstack-tripleo-0.0.7-1.el7ost.noarch openstack-tripleo-heat-templates-0.8.7-2.el7ost.noarch How reproducible: unknown Steps to Reproduce: 1. Set ceph.yaml with additional hard drives 2. Install overcloud Actual results: The overcloud installation failed - the storage nodes are misconfigured and the OSPD says that installation finished successfully. Expected results: The OSPD should detect the server with the 4 hard drives, install and run the OSDs services Additional info:
Can you paste the deploy command, attach any customized yaml and the output from 'sudo ceph status' from one of the controller nodes?
hi Yogev, is this still a bug? Can you reply to comment #2?
(In reply to Giulio Fidente from comment #3) > hi Yogev, is this still a bug? Can you reply to comment #2? Hi Giulio, I got a workaround this issue: 1) Create a new flavor $openstack flavor create --id auto --ram 4096 --disk 10 --vcpus 1 cephStorage 2) Add a property to the flavor $openstack flavor set --property 'cpu_arch'='x86_64' --property 'capabilities:boot_option'='local' --property 'capabilities:profile'='cephStorage' cephStorage 3) Add a property to the node with ironic $ ironic node-update <ceph storage node uuid> add properties/capabilities='profile:cephStorage,boot_option:local' And the customize yaml file is in the description of the bug
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
If disks were previously formatted by/for a different Ceph cluster, the cluster FSID won't match and the OSP Director won't reuse them. Before BZ #1370439, the deployment would not fail in such a scenario, but silently discard pre-owned disks. With recent builds instead (which include the fix for BZ #1370439), the deployment should fail instead. Can you retry formatting the Ceph disks with an empty GPT label during the deployment, as documented in: https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/red-hat-ceph-storage-for-the-overcloud/#Formatting_Ceph_Storage_Nodes_Disks_to_GPT
(In reply to Giulio Fidente from comment #9) > If disks were previously formatted by/for a different Ceph cluster, the > cluster FSID won't match and the OSP Director won't reuse them. > > Before BZ #1370439, the deployment would not fail in such a scenario, but > silently discard pre-owned disks. > > With recent builds instead (which include the fix for BZ #1370439), the > deployment should fail instead. > > Can you retry formatting the Ceph disks with an empty GPT label during the > deployment, as documented in: > https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/ > single/red-hat-ceph-storage-for-the-overcloud/ > #Formatting_Ceph_Storage_Nodes_Disks_to_GPT I have tried it and it worked.
A deployment finished successfully, with Giulio's comment