Bug 1371218
Summary: | [puppet-ceph] When deploying a large number of OSDs not all OSDs are activated but all are prepared | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | John Fulton <johfulto>
Component: | puppet-ceph | Assignee: | Giulio Fidente <gfidente>
Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 10.0 (Newton) | CC: | bengland, gfidente, jjoyce, jomurphy, jschluet, jtaleric, psanchez, rsussman, sclewis, slinaber, smalleni, srevivo, tvignaud, twilkins
Target Milestone: | rc | Keywords: | Reopened
Target Release: | 11.0 (Ocata) | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | puppet-ceph-2.3.0-2.el7ost.noarch | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-05-17 19:32:37 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1413723 | |
Attachments: |
Description
John Fulton
2016-08-29 15:24:44 UTC
Created attachment 1195418 [details]
TripleO preboot script to workaround ceph-disk partprobe race condition

Workaround for https://github.com/ceph/ceph/commit/3d6d36a12bd4823352dc58e2135d03f261d18dbe for those using http://buildlogs.centos.org/centos/7/storage/x86_64/ceph-jewel/ This is a backport of an existing fix.

Created attachment 1195420 [details]
patch containing backport of existing fix from https://github.com/ceph/ceph/commit/3d6d36a12bd4823352dc58e2135d03f261d18dbe

NOT a patch for _this_ bug but for a separate bug. This patch is provided to help in reproducing this bug so that a separate issue is not conflated with this one. I made this by updating /usr/lib/python2.7/site-packages/ceph_disk/main.py as provided by the following RPMs:

    [root@overcloud-novacompute-2 ~]# rpm -qa | grep ceph
    ceph-base-10.2.2-0.el7.x86_64
    ceph-mds-10.2.2-0.el7.x86_64
    ceph-common-10.2.2-0.el7.x86_64
    ceph-mon-10.2.2-0.el7.x86_64
    ceph-10.2.2-0.el7.x86_64
    python-cephfs-10.2.2-0.el7.x86_64
    libcephfs1-10.2.2-0.el7.x86_64
    ceph-selinux-10.2.2-0.el7.x86_64
    ceph-osd-10.2.2-0.el7.x86_64
    [root@overcloud-novacompute-2 ~]#

to have the changes described in https://github.com/ceph/ceph/commit/3d6d36a12bd4823352dc58e2135d03f261d18dbe

(In reply to John Fulton from comment #0)
> VII. Additional info:
>
> 1. When reproducing this it's possible to run into a separate ceph-disk race
> condition with partprobe described in
> https://github.com/ceph/ceph/commit/3d6d36a12bd4823352dc58e2135d03f261d18dbe,
> though fixing this problem does not eliminate this bug (I will follow up on
> this separate issue in a separate BZ for a separate project)

The first two attachments and comments 1 and 2 above are exclusively about the issue quoted above. A patch for _this_ bug is still needed.

Created attachment 1197477 [details]
patch to osd.pp to fix reported problem when combined with other patch
This patch to osd.pp from puppet-ceph solves the problem in my env provided that I combine it with the patch containing the backport to ceph-disk I posted earlier.
I am going to hold off on sending this to review as I'd like to see if I can also work around the backport not being there, i.e., make osd.pp look for `ceph-disk prepare` failure and manage it.
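One way to "manage" a `ceph-disk prepare` failure, as floated above, would be a bounded retry around the prepare call, on the assumption that the failure is transient (as in the partprobe race). The sketch below shows only the retry pattern; `flaky_prepare` is a hypothetical stub standing in for the real `ceph-disk prepare ...` invocation so the pattern can be exercised without Ceph installed.

```shell
#!/bin/sh
# Sketch: bounded retry around a command that may fail transiently.
# retry MAX_TRIES CMD... returns 0 as soon as CMD succeeds, or CMD's
# last exit status once MAX_TRIES attempts have been used up.
retry() {
    max=$1; shift
    n=1
    while true; do
        "$@" && return 0
        status=$?
        [ "$n" -ge "$max" ] && return "$status"
        n=$((n + 1))
        sleep 1   # a real exec might run `udevadm settle` here instead
    done
}

# Stub standing in for `ceph-disk prepare ...`: fails twice, then succeeds.
attempts_file=$(mktemp)
echo 0 > "$attempts_file"
flaky_prepare() {
    count=$(cat "$attempts_file")
    count=$((count + 1))
    echo "$count" > "$attempts_file"
    [ "$count" -ge 3 ]
}

if retry 5 flaky_prepare; then
    echo "prepare succeeded after $(cat "$attempts_file") attempts"
else
    echo "prepare failed permanently"
fi
```

In osd.pp terms, the exec's `command` would wrap the prepare call in such a loop rather than failing the resource on the first transient error.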
Created attachment 1198340 [details]
update to osd.pp.patch to not use udev for any Jewel until fix version is known

There are two bugs. The following patch to ceph-disk will solve ONE of them: https://github.com/ceph/ceph/commit/3d6d36a12bd4823352dc58e2135d03f261d18dbe

The other bug needs to be filed (I will do that next). Until that other bug is fixed, have osd.pp tell the install not to use udev. Thus, the osd.pp patch disables udev for 10.2.0 <= $version < X. X may not be 10.2.3. When the version with both fixes is known, osd.pp should be updated with a value for X.

todo: verify the second bug with a non-OpenStack Ceph install.

This has been hard to reproduce. Recent testing in the scale lab has shown similar symptoms when deploying a node with 36 OSDs. I will re-review logs from this testing after Ocata M3.

The problem we seem to have is that the execs in osd.pp go into the background, causing puppet to move on to the next resource when it should not. I am attaching a reproducer which executes the same resource 10 times, dumping the execution timestamp into a set of files in /tmp; each resource sleeps 2 seconds and the files are named date_background when the sleep goes into the background and date_nobackground when the sleep does not ... the execution time for the _nobackground resources skips 2 seconds as expected, while the execution time for the _background resources falls within the same second for all 10 resources.

Created attachment 1239574 [details]
parallelism.pp reproduces the issue, demonstrating the backgrounding problem
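The effect that parallelism.pp demonstrates can also be seen with plain shell, independent of puppet: backgrounded commands all start within the same instant because the shell returns immediately, while foreground commands serialize. A minimal sketch (0.2-second sleeps instead of 2 seconds):

```shell
#!/bin/sh
# Foreground: each sleep completes before the next starts, so total
# elapsed time is roughly N * DELAY. Background: all sleeps start at
# once and the loop finishes almost immediately.
DELAY=0.2
N=5

fg_start=$(date +%s%N)
i=0
while [ "$i" -lt "$N" ]; do
    sleep "$DELAY"          # foreground: serializes
    i=$((i + 1))
done
fg_elapsed_ms=$(( ($(date +%s%N) - fg_start) / 1000000 ))

bg_start=$(date +%s%N)
i=0
while [ "$i" -lt "$N" ]; do
    sleep "$DELAY" &        # background: the shell moves on immediately
    i=$((i + 1))
done
wait                        # a puppet exec has no equivalent of this wait
bg_elapsed_ms=$(( ($(date +%s%N) - bg_start) / 1000000 ))

echo "foreground: ${fg_elapsed_ms}ms background: ${bg_elapsed_ms}ms"
```

The foreground loop takes roughly N * DELAY; the background loop returns after about one DELAY, because the loop has already finished before the sleeps have, which is exactly what lets puppet move on to the next resource too early.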
I'm attempting a change in puppet-ceph which could mitigate this: https://review.openstack.org/#/c/434330/ It would be nice to test it at scale, using DeployArtifacts, to see if / how much it helps.

This bugzilla has been removed from the release and needs to be reviewed and triaged for another Target Release.

Giulio, do you have the change in a puddle somewhere that could be used to deploy in the scale lab? Or would we have to patch the undercloud to try this? This patch looks like it might apply cleanly because it is so small. Does it apply cleanly to OSP10 by any chance?

Tim Wilkinson is working on a deployment right now to get HCI working on the Supermicro 6048R servers in the scale lab, on a smaller scale than we did before. If this goes well, it would be at a much greater scale than what's in the original post, up to 36 drives/server x 9 servers = 324 OSDs. In theory the scale lab can deploy up to 30 x 36 drives > 1000 OSDs this way (but someone has to put their request in the queue). For now, Tim's current cloud09 deployment would be enough to test this patch, right?

Ben, yes, I want to try this patch with Tim. I'll jump on his undercloud to help set it up when the deploy is ready. I propose the following:

1. reproduce the bug (this should happen)
2. ping me and I'll SSH into your undercloud to set up deploy artifacts (or even first-boot if necessary) to have the patch applied before puppet-ceph runs
3. we'll observe the results of the patch and update the bug

John

There was an update requested on this:

- We have one change to test, which was merged for Ocata: https://review.openstack.org/#/c/434330/1/manifests/osd.pp
- A similar test could be conducted with backgrounding as per #8, but only if the first point doesn't help
- We can put time into reproducing with the scale lab team when they're ready
- We can put time into reproducing in a virtual env too

I am focusing on higher priorities this week but can return to this the week of 3/13 unless asked to do so earlier.

Testing in the scale lab yesterday with OSP10, but with deploy artifacts to use a newer version of puppet-ceph which included the following: https://review.openstack.org/#/c/434330/ We were able to deploy without any issue 3 times in a row using 8 Ceph storage servers with 34 OSDs each.

I'm getting the same error, deploying the latest OSP10 repos with a hyperconverged setup (compute+ceph) with only one OSD per compute: http://chunk.io/f/86ce7a96161443dfb97d541edc0a62f5

Pablo, thanks, though this doesn't look like the same error. puppet-ceph exec'd the following shell commands:

    Error: /bin/true # comment to satisfy puppet syntax requirements
    set -ex
    if ! test -b /dev/nvme0n1 ; then
        mkdir -p /dev/nvme0n1
        if getent passwd ceph >/dev/null 2>&1; then
            chown -h ceph:ceph /dev/nvme0n1
        fi
    fi
    ceph-disk prepare --cluster-uuid d203beee-2208-11e7-9a51-525400fe01b8 /dev/nvme0n1
    udevadm settle
    returned 1 instead of one of [0]

Does the device /dev/nvme0n1 exist on your system? If not, then it shouldn't be in the Heat templates. The above looks a little more like this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1422191 FYI: neither this fix (no fixed-in flag yet) nor the above have landed in OSP10. Do you want to open a new bug for this and provide your Heat templates and the output of lsblk on your overcloud nodes?
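As a general pre-deployment sanity check for failures like the one above, one can verify on the target node that every device intended for `ceph-disk` is actually present as a block device before listing it in the Heat templates. A minimal sketch; the device list here is illustrative, not taken from this bug:

```shell
#!/bin/sh
# Sketch: verify that every device destined for the OSD list exists as
# a block device on this host; report the ones that do not. Substitute
# the devices from your own Heat templates for this illustrative list.
devices="/dev/vdb /dev/nvme0n1"

missing=0
for dev in $devices; do
    if [ -b "$dev" ]; then
        echo "ok:      $dev"
    else
        echo "missing: $dev"
        missing=$((missing + 1))
    fi
done
echo "$missing device(s) not present as block devices"
```

Running this (or simply comparing the template list against `lsblk` output) before a deploy would catch a mistyped or absent device like /dev/nvme0n1 before `ceph-disk prepare` fails on it.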
verified in the scale lab

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245