| Summary: | director should increase kernel.pid_max on ceph backed compute nodes | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Tim Wilkinson <twilkins> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Giulio Fidente <gfidente> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 10.0 (Newton) | CC: | abond, bengland, dwilson, ekuvaja, gfidente, hbrock, jharriga, johfulto, jschluet, jslagle, mburns, nlevinki, pgrist, pmyers, rhel-osp-director-maint, rsussman, seb, shan, wusui |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 10.0 (Newton) | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-5.0.0-1.7.el7ost | Doc Type: | Enhancement |
| Doc Text: |
Feature:
Allows custom values for the kernel.pid_max sysctl key via the KernelPidMax Heat parameter, which defaults to 1048576.
Reason:
On nodes working as Ceph clients there might be a large number of running threads, depending on the number of ceph-osd instances. In that case the pid_max limit might be hit, causing I/O errors.
Result:
The pid_max key has a higher default and can be customized via the KernelPidMax parameter.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-12-14 16:26:20 UTC | Type: | Bug |
|
Description
Tim Wilkinson
2016-10-27 18:23:58 UTC
cc'ing Jeff Brown in RHS. This problem impacts OpenStack-Ceph scalability and specifically use of Ceph in the OpenStack scale lab.

I think it defaults to 32K for compatibility with 32-bit systems; on 64-bit systems it is limited to 4M instead. As it consumes more memory for higher values, do you think defaulting to 1M would be reasonable?

On OSP10, the systemd unit file of an OSD already has "TasksMax=infinity", so this shouldn't be happening anymore. I don't think we need to change kernel.pid_max, as explained in the systemd doc:

"Specify the maximum number of tasks that may be created in the unit. This ensures that the number of tasks accounted for the unit (see above) stays below a specific limit. This either takes an absolute number of tasks or a percentage value that is taken relative to the configured maximum number of tasks on the system. If assigned the special value "infinity", no tasks limit is applied. This controls the "pids.max" control group attribute."

Sebastien, to clarify, this problem is occurring on compute nodes, not the Ceph nodes. The compute nodes are where we are running qemu-kvm processes that have librbd linked into them. The configuration is OSP 10 layered on an externally configured Ceph cluster. But I think this would be relevant even if OSPd was deploying Ceph nodes.

Right, sorry, I kinda missed the hypervisor part. So the next step is to increase kernel.pid_max to a very large value, something like 4194303? Thanks!

With the new builds including this change, the kernel.pid_max value will default to 1048576.
It will be possible to customize this value using an environment file at deployment time. The environment file should look like the following:
parameter_defaults:
  KernelPidMax: 4194303
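
As an illustration of the environment file described above, here is a minimal Python sketch that renders it and rejects values the kernel would likely refuse. The helper name `render_pid_max_env` is hypothetical, and the bounds (301 and 4194304) are assumptions about 64-bit Linux limits rather than values taken from this report; only the `parameter_defaults` / `KernelPidMax` layout comes from the comment above.

```python
# Hypothetical helper: render a TripleO environment file that sets KernelPidMax.
# The bounds below are assumptions about 64-bit Linux, not values from this report.
PID_MAX_MIN = 301        # assumed kernel lower bound for kernel.pid_max
PID_MAX_LIMIT = 4194304  # assumed 64-bit upper bound (4 * 1024 * 1024)


def render_pid_max_env(pid_max: int) -> str:
    """Return YAML text for an environment file that sets KernelPidMax."""
    if isinstance(pid_max, bool) or not isinstance(pid_max, int):
        raise TypeError("pid_max must be an integer")
    if not PID_MAX_MIN <= pid_max <= PID_MAX_LIMIT:
        raise ValueError(
            f"pid_max {pid_max} outside [{PID_MAX_MIN}, {PID_MAX_LIMIT}]"
        )
    # KernelPidMax must be nested under parameter_defaults, hence the indent.
    return f"parameter_defaults:\n  KernelPidMax: {pid_max}\n"


if __name__ == "__main__":
    # Print the file content; redirect to a file of your choice to use it.
    print(render_pid_max_env(4194303), end="")
```

The resulting file would then be passed to the deployment with an `-e` environment file argument, alongside any other environment files already in use.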
Thanks for fixing this.

Verified on openstack-tripleo-heat-templates-5.1.0-3.el7ost.noarch:

# cat /proc/sys/kernel/pid_max
1048576

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html |
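
The verification in this report reads /proc/sys/kernel/pid_max on a compute node. A rough way to also see how close a node is to that limit is to count the threads currently alive, since every thread consumes a PID. The sketch below is an assumption-laden illustration, not part of the fix: the function names are hypothetical, and it relies on the standard Linux /proc layout where each numeric directory has a `task` subdirectory with one entry per thread.

```python
import os


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Read the current kernel.pid_max value from procfs."""
    with open(path) as f:
        return int(f.read().strip())


def count_threads(proc_root="/proc"):
    """Count threads by listing /proc/<pid>/task for every numeric entry."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            total += len(os.listdir(os.path.join(proc_root, entry, "task")))
        except OSError:
            # The process may have exited between the two listings.
            continue
    return total


if __name__ == "__main__":
    limit = read_pid_max()
    used = count_threads()
    print(f"kernel.pid_max={limit} threads_in_use={used} headroom={limit - used}")
```

On a hypervisor running many qemu-kvm processes linked against librbd, a small headroom would suggest raising KernelPidMax further.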