Created attachment 1273973 [details]
storage-env

Description of problem:
Deploying a hyperconverged setup with 1 controller + 3 compute nodes, the deployment fails with:
http://chunk.io/f/86ce7a96161443dfb97d541edc0a62f5

Version-Release number of selected component (if applicable):
puppet-tripleo-5.5.0-4.el7ost.noarch
python-tripleoclient-5.4.1-1.el7ost.noarch
openstack-tripleo-puppet-elements-5.2.0-2.el7ost.noarch
openstack-tripleo-common-5.4.1-1.el7ost.noarch
openstack-tripleo-0.0.8-0.2.4de13b3git.el7ost.noarch
openstack-tripleo-image-elements-5.2.0-1.el7ost.noarch
openstack-tripleo-heat-templates-5.2.0-3.el7ost.noarch
openstack-tripleo-ui-1.1.0-1.el7ost.noarch
openstack-tripleo-validations-5.1.1-1.el7ost.noarch

How reproducible:
100% on this environment

Steps to Reproduce:
1.
2.
3.

Actual results:
Fails with the puppet error message

Expected results:
Environment deployed successfully

Additional info:
Relevant files attached
Created attachment 1273974 [details]
first-boot

Created attachment 1273975 [details]
deploy-command
Status of the disks on every compute node after the failed deployment:

[root@overcloud-compute-1 heat-admin]# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0 111.8G  0 disk
├─sda1        8:1    0     1M  0 part
└─sda2        8:2    0 111.8G  0 part /
nvme0n1     259:0    0 745.2G  0 disk
├─nvme0n1p1 259:1    0 740.2G  0 part /var/lib/ceph/osd/ceph-0
└─nvme0n1p2 259:2    0     5G  0 part

[root@overcloud-compute-1 heat-admin]# ceph-disk list
/dev/nvme0n1 :
 /dev/nvme0n1p2 ceph journal, for /dev/nvme0n1p1
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2
/dev/sda :
 /dev/sda1 other, iso9660
 /dev/sda2 other, xfs, mounted on /
And ceph is actually running fine:

[root@overcloud-compute-1 heat-admin]# ceph -s
    cluster d203beee-2208-11e7-9a51-525400fe01b8
     health HEALTH_OK
     monmap e1: 1 mons at {overcloud-controller-0=172.18.0.13:6789/0}
            election epoch 3, quorum 0 overcloud-controller-0
     osdmap e20: 3 osds: 3 up, 3 in
            flags sortbitwise
      pgmap v35: 224 pgs, 6 pools, 0 bytes data, 0 objects
            105 MB used, 2219 GB / 2219 GB avail
                 224 active+clean
I think the first question to answer on this BZ is why the following in osd.pp:

set -ex
if ! test -b /dev/nvme0n1 ; then
    mkdir -p /dev/nvme0n1
    if getent passwd ceph >/dev/null 2>&1; then
        chown -h ceph:ceph /dev/nvme0n1
    fi
fi
ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1
udevadm settle

returned 1 instead of one of [0].

As per the following output:

[root@overcloud-compute-1 heat-admin]# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0 111.8G  0 disk
├─sda1        8:1    0     1M  0 part
└─sda2        8:2    0 111.8G  0 part /
nvme0n1     259:0    0 745.2G  0 disk
├─nvme0n1p1 259:1    0 740.2G  0 part /var/lib/ceph/osd/ceph-0
└─nvme0n1p2 259:2    0     5G  0 part

/dev/nvme0n1 is a real block device. Thus the following execution:

if ! test -b /dev/nvme0n1 ; then

should result in the conditional being false because `test -b /dev/nvme0n1` should be true.

I tested this on a system with an NVMe SSD and it behaved as I expected, the same as a rotational block device:

[root@overcloud-controller-2 ~]# if test -b /dev/sdb ; then echo "is a block device"; fi
is a block device
[root@overcloud-controller-2 ~]# if test -b /dev/nvme0n1 ; then echo "is a block device"; fi
is a block device
[root@overcloud-controller-2 ~]#

The Ironic introspection data for this system with ironic_id $uuid, as downloaded from Swift with:

# export PASSWD=$(sudo crudini --get /etc/ironic-inspector/inspector.conf swift password)
# swift -q -U service:ironic -K $PASSWD download ironic-inspector inspector_data-$uuid

contains the following about the two disks above:

# cat cc36f103-8af9-4c89-a7ef-ed212b3591f1 | jq . | less
...
{
  "size": 499558383616,
  "rotational": true,
  "vendor": "DELL",
  "name": "/dev/sdb",
  "wwn_vendor_extension": "0x1ebd892bda966209",
  "wwn_with_extension": "0x614187705ddbe0001ebd892bda966209",
  "model": "PERC H730 Mini",
  "wwn": "0x614187705ddbe000",
  "serial": "614187705ddbe0001ebd892bda966209"
},
...
{
  "size": 400088457216,
  "rotational": false,
  "vendor": null,
  "name": "/dev/nvme0n1",
  "wwn_vendor_extension": null,
  "wwn_with_extension": null,
  "model": "Dell Express Flash NVMe 400GB",
  "wwn": null,
  "serial": " S1J0NYAGB00174"
}

The same hardware has been used without issue when putting OSDs on the rotational devices and using one NVMe to host all of the journals for those rotational disks (as NVMe is fast enough to do that):

[root@overcloud-osd-compute-3 ~]# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda             8:0    0 465.3G  0 disk
├─sda1          8:1    0     1M  0 part
└─sda2          8:2    0 465.3G  0 part /
sdb             8:16   0 465.3G  0 disk
└─sdb1          8:17   0 465.3G  0 part /var/lib/ceph/osd/ceph-91
sdc             8:32   0 465.3G  0 disk
└─sdc1          8:33   0 465.3G  0 part /var/lib/ceph/osd/ceph-36
sdd             8:48   0 465.3G  0 disk
└─sdd1          8:49   0 465.3G  0 part /var/lib/ceph/osd/ceph-60
sde             8:64   0 465.3G  0 disk
└─sde1          8:65   0 465.3G  0 part /var/lib/ceph/osd/ceph-46
sdf             8:80   0 465.3G  0 disk
└─sdf1          8:81   0 465.3G  0 part /var/lib/ceph/osd/ceph-75
sdg             8:96   0 465.3G  0 disk
└─sdg1          8:97   0 465.3G  0 part /var/lib/ceph/osd/ceph-100
sdh             8:112  0 465.3G  0 disk
└─sdh1          8:113  0 465.3G  0 part /var/lib/ceph/osd/ceph-123
sdi             8:128  0 465.3G  0 disk
└─sdi1          8:129  0 465.3G  0 part /var/lib/ceph/osd/ceph-28
sdj             8:144  0 465.3G  0 disk
└─sdj1          8:145  0 465.3G  0 part /var/lib/ceph/osd/ceph-84
sdk             8:160  0 465.3G  0 disk
└─sdk1          8:161  0 465.3G  0 part /var/lib/ceph/osd/ceph-54
sdl             8:176  0 465.3G  0 disk
└─sdl1          8:177  0 465.3G  0 part /var/lib/ceph/osd/ceph-116
sdm             8:192  0 465.3G  0 disk
└─sdm1          8:193  0 465.3G  0 part /var/lib/ceph/osd/ceph-20
sdn             8:208  0 465.3G  0 disk
└─sdn1          8:209  0 465.3G  0 part /var/lib/ceph/osd/ceph-110
sdo             8:224  0 465.3G  0 disk
└─sdo1          8:225  0 465.3G  0 part /var/lib/ceph/osd/ceph-5
sdp             8:240  0 465.3G  0 disk
└─sdp1          8:241  0 465.3G  0 part /var/lib/ceph/osd/ceph-68
sdq            65:0    0 465.3G  0 disk
└─sdq1         65:1    0 465.3G  0 part /var/lib/ceph/osd/ceph-12
nvme0n1       259:0    0 372.6G  0 disk
├─nvme0n1p1   259:1    0     5G  0 part
├─nvme0n1p2   259:2    0     5G  0 part
├─nvme0n1p3   259:3    0     5G  0 part
├─nvme0n1p4   259:4    0     5G  0 part
├─nvme0n1p5   259:5    0     5G  0 part
├─nvme0n1p6   259:6    0     5G  0 part
├─nvme0n1p7   259:7    0     5G  0 part
├─nvme0n1p8   259:8    0     5G  0 part
├─nvme0n1p9   259:9    0     5G  0 part
├─nvme0n1p10  259:10   0     5G  0 part
├─nvme0n1p11  259:11   0     5G  0 part
├─nvme0n1p12  259:12   0     5G  0 part
├─nvme0n1p13  259:13   0     5G  0 part
├─nvme0n1p14  259:14   0     5G  0 part
├─nvme0n1p15  259:15   0     5G  0 part
└─nvme0n1p16  259:16   0     5G  0 part
[root@overcloud-osd-compute-3 ~]#

However, feeding puppet-ceph an NVMe to use directly as its OSD seems to be what's different here. Can you share the same information from your Ironic introspection data and the output of the bash conditional for a block device?
Re-reading this puppet, and assuming that `if ! test -b /dev/nvme0n1` did evaluate as it should (I'd like to see this for sure as per the needinfo), and your system didn't attempt to `mkdir /dev/nvme0n1`, then that means the following would have failed:

ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1
udevadm settle

If so, then it might be a ceph-disk issue.
Inspection data for the three compute nodes:

http://chunk.io/f/7412b2fdb7424c3d8b65896e1dfe5f66
http://chunk.io/f/f2eb0d7194224ef3982b8c008ca5c7ae
http://chunk.io/f/5c92678165f848dd8711191b9f852ee5

Thanks
Thanks, I see your NVMe drive in the provided data:

{
  "rotational": false,
  "vendor": null,
  "name": "/dev/nvme0n1",
  "wwn_vendor_extension": null,
  "wwn_with_extension": null,
  "model": "INTEL SSDPEDMD800G4",
  "wwn": null,
  "serial": "CVFT6042001J800CGN",
  "size": 800166076416
}

Are you able to send me the output of the following on the system as a sanity check?

if test -b /dev/nvme0n1 ; then echo "is a block device"; fi

Can you also send me a gzipped copy of /var/log/messages? For example, on my system `grep "ceph-disk prepare" /var/log/messages` contains this:

Apr 25 11:47:00 overcloud-osd-compute-7 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-prepare-/dev/sdh]/returns: + ceph-disk prepare --cluster ceph --cluster-uuid eb2bb192-b1c9-11e6-9205-525400330666 /dev/sdh /dev/nvme0n1#033[0m

Thanks,
John
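For anyone re-checking introspection data like the above, the disk list can be filtered with jq rather than paged through by hand. This is a sketch only: it assumes the inspector JSON keeps disks under .inventory.disks, and it uses an inline sample shaped like the entries quoted above instead of the real downloaded data.

```shell
# Inline sample shaped like the introspection entries above (fields trimmed);
# the file path is arbitrary.
cat > /tmp/inspector_data.sample <<'EOF'
{"inventory": {"disks": [
  {"name": "/dev/sdb", "rotational": true,  "model": "PERC H730 Mini"},
  {"name": "/dev/nvme0n1", "rotational": false, "model": "INTEL SSDPEDMD800G4"}
]}}
EOF

# List only the non-rotational (SSD/NVMe) disks.
jq -r '.inventory.disks[] | select(.rotational == false) | .name' /tmp/inspector_data.sample
```

Against the sample this prints only /dev/nvme0n1, which is the device being fed to puppet-ceph here.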
This works well:

[root@overcloud-compute-1 ~]# if test -b /dev/nvme0n1 ; then echo "is a block device"; fi
is a block device

And here you have the messages file:
http://chunk.io/f/1c84738b4f384339a10ee7d4bd6b9445

Thanks
Pablo
As per /var/log/messages, ceph-disk prepare and ceph-disk activate ran without trouble at 12:54:51 [1]. So your Ceph cluster was activated as you reported. However, four minutes later ceph-disk prepare tried to run again [2] and failed because the disk was already in use, and that made the deploy fail.

So the bug is: ceph-disk prepare should not have run again.

John

[1]
Apr 26 12:54:51 overcloud-compute-1 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/nvme0n1]/Exec[ceph-osd-prepare-/dev/nvme0n1]/returns: + ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1#033[0m
Apr 26 12:54:51 overcloud-compute-1 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/nvme0n1]/Exec[ceph-osd-activate-/dev/nvme0n1]/returns: + ceph-disk activate /dev/nvme0n1#033[0m

[2]
Apr 26 12:58:56 overcloud-compute-1 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/nvme0n1]/Exec[ceph-osd-prepare-/dev/nvme0n1]/returns: + ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1#033[0m
Apr 26 12:58:56 overcloud-compute-1 os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/nvme0n1]/Exec[ceph-osd-prepare-/dev/nvme0n1]/returns: ceph-disk: Error: Device is mounted: /dev/nvme0n1p1#033[0m
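The double run can be spotted directly in the log. A hypothetical one-liner for that, run here against trimmed copies of the lines quoted above (the temp file path and the awk filter are illustrative, not part of any fix):

```shell
# Trimmed samples of the log lines above, written to a temp file so the
# pipeline below has something to chew on.
cat > /tmp/messages.sample <<'EOF'
Apr 26 12:54:51 overcloud-compute-1 os-collect-config: ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1
Apr 26 12:54:51 overcloud-compute-1 os-collect-config: ceph-disk activate /dev/nvme0n1
Apr 26 12:58:56 overcloud-compute-1 os-collect-config: ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1
EOF

# Flag any device that "ceph-disk prepare" touched more than once.
grep -o 'ceph-disk prepare.*' /tmp/messages.sample | sort | uniq -c |
  awk '$1 > 1 {print "duplicate prepare:", $NF}'
# -> duplicate prepare: /dev/nvme0n1
```

On a real node you would point the grep at /var/log/messages itself; any output at all means puppet re-ran a prepare it should have skipped.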
Pablo,

Can you confirm the version of puppet-ceph that you are running with a `rpm -q puppet-ceph`?

John
It should not try to prepare the disk twice because there is an unless statement which first checks, via `ceph-disk list`, whether the disk already shows up as a prepared or active OSD:

https://github.com/openstack/puppet-ceph/blob/master/manifests/osd.pp#L139

Here's an example from my system:

[root@overcloud-osd-compute-7 ~]# disk=$(readlink -f ${data})
[root@overcloud-osd-compute-7 ~]# ceph-disk list | egrep " *${disk}1? .*ceph data, (prepared|active)"
/dev/sdh1 ceph data, active, cluster ceph, osd.126, journal /dev/nvme0n1p16
[root@overcloud-osd-compute-7 ~]#

Your copy of osd.pp within puppet-ceph should be doing something similar, though your copy is older; my copy above is from the latest version of puppet-ceph. On your partially deployed system, can you read osd.pp and see how the same commands work? If that doesn't make sense, then once I know your exact version of the puppet-ceph package, I will extract the osd.pp file and give you a few commands to try that might flush out the issue you are having.
kschinck conjectured the following:

- data is /dev/nvme0n1
- the data partition is /dev/nvme0n1p1
- the unless is looking for /dev/nvme0n11 because of how the grep is written

So the following check [1], which is supposed to prevent ceph-disk prepare from running on an already activated OSD device, would fail:

ceph-disk list | egrep \" *\${disk}1? .*ceph data, (prepared|active)\"

We might need to improve that grep, e.g. maybe ${disk}[p|]1

As per the needinfo, let's see your osd.pp and the output of your variation of the above command on your system which has an OSD already prepared. Would you please also attach the output of `ceph-disk list`?

[1] https://github.com/openstack/puppet-ceph/blob/master/manifests/osd.pp#L139
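The conjecture can be checked in isolation. A sketch using a canned ceph-disk line taken from the outputs in this BZ, with plain `grep -E` calls standing in for puppet's unless:

```shell
disk=/dev/nvme0n1
line=' /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2'

# The existing pattern: optional "1" then a space right after ${disk}.
# For NVMe the data partition is ${disk}p1, so the "p" breaks the match.
echo "$line" | grep -Eq " *${disk}1? .*ceph data, (prepared|active)" \
  || echo "no match: ceph-disk prepare would run a second time"

# A p-aware variant, along the lines of the ${disk}[p|]1 idea above.
echo "$line" | grep -Eq " *${disk}p?1 .*ceph data, (prepared|active)" \
  && echo "match: the second prepare would be skipped"
```

The first grep finds nothing because /dev/nvme0n1 is followed by "p1", not "1"; the second accepts either suffix style, so it also still matches /dev/sdb1-style partitions.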
John,

There you go:

[stack@undercloud-10 templates-pablo]$ rpm -q 'puppet-ceph'
puppet-ceph-2.2.1-3.el7ost.noarch

[root@overcloud-compute-1 ~]# ceph-disk list
/dev/nvme0n1 :
 /dev/nvme0n1p2 ceph journal, for /dev/nvme0n1p1
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2
/dev/sda :
 /dev/sda1 other, iso9660
 /dev/sda2 other, xfs, mounted on /

osd.pp from the same compute node:
http://chunk.io/f/5dc8765710eb48b7adec654cb908dd0c

Thanks
Pablo,

Thank you. This looks like it will be sufficient to reproduce the problem.

The line from osd.pp is the following:

ceph-disk list | grep -E ' *${data}1? .*ceph data, (prepared|active)'

We know what `ceph-disk list` looks like on your machine. Saving it, I see the following:

[jfulton@skagra ~]$ cat ceph-disk-list
/dev/nvme0n1 :
 /dev/nvme0n1p2 ceph journal, for /dev/nvme0n1p1
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2
/dev/sda :
 /dev/sda1 other, iso9660
 /dev/sda2 other, xfs, mounted on /
[jfulton@skagra ~]$ data=/dev/nvme0n1
[jfulton@skagra ~]$ cat ceph-disk-list | grep -E ' *${data}1? .*ceph data, (prepared|active)'
[jfulton@skagra ~]$

There's the bug: your prepared OSD didn't pop out. Here's an ad hoc grep showing what the fix might look like. The active disk does pop out, which would prevent ceph-disk prepare from getting run twice:

[jfulton@skagra ~]$ cat ceph-disk-list | grep -E "${data}p1.*ceph data, (prepared|active)"
/dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2
[jfulton@skagra ~]$

The grep has changed upstream since the version you are using:
https://github.com/openstack/puppet-ceph/blob/master/manifests/osd.pp#L138-L139
Hi John,

I'm working with Pablo on this case. I tried to use the upstream osd.pp version and got the following errors:
http://chunk.io/f/98d8b1485b144345bd4d1d9c4914da59

When applying the proposed modifications:
http://chunk.io/f/5067cd8fc42f47e6b6b4b6c65ec92197

I ended up with a failure at step 3 and a non-functional ceph cluster:
http://chunk.io/f/e20807c6e403420db66d604c12cdf08d

Thx
Hi Karim,

Thanks for the update. Just to clarify, I don't think we'll be able to simply drop the newest osd.pp into the rest of the older puppet-ceph module to make your deploy work. Instead, I think making the upstream version handle that device is the right move, and then backporting the fix to OSP10.

Would you please send me the ceph-disk list output from this new failed deploy?

John
Karim, would you please also include the full /var/log/messages of the failed deploy from a node running the OSDs?
Hi,

Here is the info:

[root@overcloud-compute-0 ~]# ceph-disk list
/dev/nvme0n1 :
 /dev/nvme0n1p2 ceph journal, for /dev/nvme0n1p1
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.2, journal /dev/nvme0n1p2
/dev/sda :
 /dev/sda1 other, iso9660
 /dev/sda2 other, xfs, mounted on /

and /var/log/messages:
http://chunk.io/f/47725680a4f9475ca231f4670caace8e

I used the upstream osd.pp for this one.

Thx
Karim
Status update: I used deploy artifacts [0] with the OSP11 RC to test the updated regex from the proposed fix [1] on a system with an NVMe SSD, and it configured the NVMe SSD as an OSD without failing the deploy [2]. At this point I am working on getting the fix past CI so it can merge upstream; it can then make its way down to OSP11 and OSP10 via a Z-stream.

[0] https://hardysteven.blogspot.com/2016/08/tripleo-deploy-artifacts-and-puppet.html
[1] https://review.openstack.org/#/c/462991/
[2]
[stack@b10-h25-r620 ~]$ grep -A 1 ceph::profile::params::osds custom-templates/ceph.yaml
ceph::profile::params::osds:
    '/dev/nvme0n1': {}
[stack@b10-h25-r620 ~]$
[stack@b10-h25-r620 ~]$ cat deploy.sh
source ~/stackrc
pushd ~ ; upload-puppet-modules -d puppet-modules ; popd
time openstack overcloud deploy --templates --timeout 240 \
-r ~/custom-templates/custom-roles.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e ~/custom-templates/network.yaml \
-e ~/custom-templates/ceph.yaml \
-e ~/custom-templates/compute.yaml \
-e ~/custom-templates/layout.yaml
[stack@b10-h25-r620 ~]$ grep p1 puppet-modules/ceph/manifests/osd.pp
          ceph-disk list | egrep \" *(\${disk}1?|\${disk}p1?) .*ceph data, (prepared|active)\" ||
        if ! test -b \$disk || ! test -b \${disk}1 || ! test -b \${disk}p1 ; then
          if test -f ${udev_rules_file}.disabled && ( test -b \${disk}1 || test -b \${disk}p1 ); then
[stack@b10-h25-r620 ~]$ ./deploy.sh
...
2017-05-18 12:36:58Z [overcloud.AllNodesDeploySteps.ControllerPostPuppet]: CREATE_COMPLETE  state changed
2017-05-18 12:36:58Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  Stack CREATE completed successfully
2017-05-18 12:36:58Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  state changed
2017-05-18 12:36:58Z [overcloud]: CREATE_COMPLETE  Stack CREATE completed successfully

 Stack overcloud CREATE_COMPLETE

/home/stack/.ssh/known_hosts updated.
Original contents retained as /home/stack/.ssh/known_hosts.old
Overcloud Endpoint: http://172.21.0.10:5000/v2.0
Overcloud Deployed

real    53m22.265s
user    0m11.661s
sys     0m0.978s
[stack@b10-h25-r620 ~]$
[stack@b10-h25-r620 ~]$ ssh heat-admin.24.62
[heat-admin@overcloud-osd-compute-0 ~]$ sudo su -
[root@overcloud-osd-compute-0 ~]# ceph-disk list
/dev/nvme0n1 :
 /dev/nvme0n1p2 ceph journal, for /dev/nvme0n1p1
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2
/dev/nvme1n1 other, unknown
/dev/sda other, unknown
/dev/sdaa other, unknown
/dev/sdab other, unknown
/dev/sdac other, unknown
/dev/sdad other, unknown
/dev/sdae other, unknown
/dev/sdaf other, unknown
/dev/sdag other, unknown
/dev/sdah other, unknown
/dev/sdai other, unknown
/dev/sdaj other, unknown
/dev/sdak other, unknown
/dev/sdal :
 /dev/sdal1 other, iso9660
 /dev/sdal2 other, xfs, mounted on /
/dev/sdb other, unknown
/dev/sdc other, unknown
/dev/sdd other, unknown
/dev/sde other, unknown
/dev/sdf other, unknown
/dev/sdg other, unknown
/dev/sdh other, unknown
/dev/sdi other, unknown
/dev/sdj other, unknown
/dev/sdk other, unknown
/dev/sdl other, unknown
/dev/sdm other, unknown
/dev/sdn other, unknown
/dev/sdo other, unknown
/dev/sdp other, unknown
/dev/sdq other, unknown
/dev/sdr other, unknown
/dev/sds other, unknown
/dev/sdt other, unknown
/dev/sdu other, unknown
/dev/sdv other, unknown
/dev/sdw other, unknown
/dev/sdx other, unknown
/dev/sdy other, unknown
/dev/sdz other, unknown
[root@overcloud-osd-compute-0 ~]#
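The alternation from the fixed osd.pp (the `grep p1` output in the previous comment) can be exercised on its own. The ceph-disk lines below are canned samples, one SATA-style and one NVMe-style, rather than output from a real node:

```shell
# Two sample "ceph data" lines in ceph-disk list format: a /dev/sdX1
# partition and a /dev/nvmeXnYpZ partition.
ceph_disk_list=' /dev/sdh1 ceph data, active, cluster ceph, osd.126, journal /dev/nvme0n1p16
 /dev/nvme0n1p1 ceph data, active, cluster ceph, osd.0, journal /dev/nvme0n1p2'

for disk in /dev/sdh /dev/nvme0n1; do
  # Same alternation as the fixed unless: a plain "1" suffix for sdX
  # devices, a "p1" suffix for nvmeXnY devices.
  if echo "$ceph_disk_list" | grep -Eq " *(${disk}1?|${disk}p1?) .*ceph data, (prepared|active)"; then
    echo "$disk: prepared OSD found, prepare is skipped"
  fi
done
```

Both devices match, so ceph-disk prepare would be skipped for both, which is exactly the idempotence the original grep lost on NVMe.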
Merged: https://review.openstack.org/#/c/462991/
According to our records, this should be resolved by openstack-puppet-modules-10.0.0-1.el7ost. This build is available now.
*** Bug 1410554 has been marked as a duplicate of this bug. ***
OSP11 is now retired; see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828