Note, the three submissions linked by the BZ are for three different components:
They will all need to be backported to the Newton branches.
The following went cleanly:
There was a merge conflict for:
I will follow up; at first glance, I don't think it should be too hard to add the following three lines to tripleo-heat-templates/puppet/services/nova-libvirt.yaml in Newton.
Newton backports are ready to be tested. After I test them, I will take them out of WIP status for CI and then review:
- https://review.openstack.org/#/c/448122 THT
- https://review.openstack.org/#/c/442970 puppet-tripleo
- https://review.openstack.org/#/c/442969 puppet-nova
Upstream changes for this BZ merged:
Verified on openstack-tripleo-heat-templates-5.2.0-18.el7ost.noarch
Does the BZ status need to change, or is this a new BZ?
The changes above did alter the config, but the problem is still happening :-( Ceph librados reported that pthread_create did not succeed. This is RHOSP 11 now.
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(const char*, size_t)' thread 7fb744feb700 time 2017-06-08 13:44:55.748364
common/Thread.cc: 160: FAILED assert(ret == 0)
ceph version 10.2.5-37.el7cp (033f137cde8573cfc5a4662b4ed6a63b8a8d1464)
1: (()+0x175375) [0x7fb7634d6375]
2: (()+0x198d2a) [0x7fb7634f9d2a]
3: (()+0x3362c5) [0x7fb7636972c5]
4: (()+0x33697e) [0x7fb76369797e]
5: (()+0xd1b6e) [0x7fb763432b6e]
6: (()+0xd27d7) [0x7fb7634337d7]
7: (()+0xd5992) [0x7fb763436992]
8: (()+0xd5cad) [0x7fb763436cad]
9: (()+0xa960b) [0x7fb76340a60b]
10: (librados::IoCtx::aio_operate(std::string const&, librados::AioCompletion*, librados::ObjectWriteOperation*, unsigned long, std::vector<unsigned long, std::allocator<unsigned long> >&)+0xe1) [0x7fb7633d7341]
11: (()+0x88159) [0x7fb76cb6a159]
12: (()+0x8867b) [0x7fb76cb6a67b]
13: (()+0x89f8e) [0x7fb76cb6bf8e]
14: (()+0x8b0ad) [0x7fb76cb6d0ad]
15: (()+0x77f69) [0x7fb76cb59f69]
16: (()+0x9036a) [0x7fb76cb7236a]
17: (()+0x9ec6d) [0x7fb7633ffc6d]
18: (()+0x87019) [0x7fb7633e8019]
19: (()+0x174526) [0x7fb7634d5526]
20: (()+0x7dc5) [0x7fb75e65edc5]
21: (clone()+0x6d) [0x7fb75e38d73d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2017-06-08 13:44:56.108+0000: shutting down
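The "error 11" reported by Thread::try_create() at the top of the trace is an errno value. A minimal sketch for decoding it, assuming it runs on a Linux host (errno numbering is platform-specific); the helper name describe_errno is hypothetical:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names for the platform running this script.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 11 is the value reported by Thread::try_create() in the trace above.
    name, message = describe_errno(11)
    print(f"errno 11 = {name}: {message}")
```

On Linux this should report EAGAIN, which pthread_create returns when there are insufficient resources or a system-imposed limit on the number of threads was hit. That is consistent with a process/thread limit being too low, but does not by itself prove it.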
I noticed that for qemu the ulimit -u value for processes was 4096, much lower than the desired limit for qemu-kvm guests. Which value takes precedence? This one?
# more /etc/security/limits.d/20-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 4096
root soft nproc unlimited
or this one:
# tail -2 /etc/libvirt/qemu.conf
max_files = 32768
max_processes = 131072
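One way to check which setting actually reached a guest is to read the limits the kernel applied to the running qemu-kvm process from /proc/<pid>/limits, rather than reasoning from the config files. A minimal sketch, assuming Linux and a known PID; the helper names parse_limit and read_limit are hypothetical:

```python
import sys


def parse_limit(limits_text, name="Max processes"):
    """Return (soft, hard) for one row of /proc/<pid>/limits text, or None."""
    for line in limits_text.splitlines():
        if line.startswith(name):
            # Remaining columns are: soft limit, hard limit, optional units.
            fields = line[len(name):].split()
            return fields[0], fields[1]
    return None


def read_limit(pid, name="Max processes"):
    """Read the effective limit for a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_limit(handle.read(), name)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Pass the PID of a qemu-kvm process as the first argument.
        print(sys.argv[1], read_limit(sys.argv[1]))
    else:
        print("usage: pass the PID of a qemu-kvm process")
```

Comparing the reported soft limit against 4096 and 131072 shows which configuration took effect for that process. Note that on Linux the nproc limit is counted per real user ID across all of that user's processes and threads, not per guest.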
Sorry, wrong BZ; Tim and I were running Ocata.
$ rpm -q openstack-tripleo-heat-templates
OSP10 --> https://bugzilla.redhat.com/show_bug.cgi?id=1430002
OSP11 --> https://bugzilla.redhat.com/show_bug.cgi?id=1372589
But they both had similar fixes, right? Switching to 1372589.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
*** Bug 1263828 has been marked as a duplicate of this bug. ***