Bug 1750781
Summary: | Containers not pinned to host cpus | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Christophe Fontaine <cfontain> |
Component: | openstack-tripleo-heat-templates | Assignee: | Emilien Macchi <emacchi> |
Status: | CLOSED ERRATA | QA Contact: | David Rosenfeld <drosenfe> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 13.0 (Queens) | CC: | aschultz, djuran, drosenfe, eelena, emacchi, fbaudin, fherrman, fiezzi, gconsalv, hakhande, jraju, marjones, mburns, mschuppe, ndeevy, owalsh, supadhya |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | python-paunch-2.5.0-9.el7ost openstack-tripleo-heat-templates-8.4.1-24.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: |
Before this update, all OpenStack containers floated across all system CPUs, ignoring tuned cpu-partitioning profiles and the `isolcpus` boot parameter. This meant that containers could preempt CPUs that were dedicated to VMs (vCPUs) or OVS-DPDK, resulting in packet loss on VNFs or on OVS-DPDK.
This bug mainly affected NFV and other use cases that require isolated vCPUs. With this update, the issue is resolved.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2020-03-10 11:22:02 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Christophe Fontaine
2019-09-10 13:14:26 UTC
If we apply the workaround above, we cannot start VMs (`openstack server show` on the failed VM):

```
| fault  | {u'message': u"Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2d1\\x2dinstance\\x2d00000001.scope/vcpu0/cpuset.cpus': Permission denied", u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1858, in _do_build_and_run_instance\n filter_properties, request_spec)\n File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2142, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'created': u'2019-09-10T18:44:28Z'} |
| flavor | vnfc (9b09d8fa-b32f-466c-909d-4aff9afe2d00) |
```

Libvirt will need to be excluded from the workaround above. Indeed, pinning the nova containers leads to the permission error shown; here is the proper command line, which re-pins all containers except nova*:

```shell
docker ps -q | grep -v -E $(docker ps -q --filter='name=nova' | paste -sd "|" -) \
  | xargs docker update --cpuset-cpus=$(cat /proc/self/status | awk '/Cpus_allowed_list/ {print $2}')
```

We can check, before and after, which processes are still dangling on all CPUs:

```shell
for s in /proc/[0-9]* ; do
  if [[ $(grep $(lscpu | awk '/On-line/ {print $4}') $s/status) ]]; then
    cat $s/cmdline ; echo ''
  fi
done | sort -u
```

On the compute-0 server:

```
[heat-admin@compute-0 ~]$ cat /proc/self/status | grep Cpus_allowed_list
Cpus_allowed_list:	0-1
[heat-admin@compute-0 ~]$ sudo docker inspect nova_compute | grep CpusetCpus
    "CpusetCpus": "0,1",
```

Containers on compute-0 are pinned.

On controller-0 (to verify it was not broken):

```
[heat-admin@controller-0 ~]$ cat /proc/self/status | grep Cpus_allowed_list
Cpus_allowed_list:	0-7
[heat-admin@controller-0 ~]$ sudo docker inspect nova_api_db_sync | grep CpusetCpus
    "CpusetCpus": "0,1,2,3,4,5,6,7",
```

Containers on controller-0 are not pinned.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
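The re-pinning one-liner in the description can be sketched in a more readable form. This is an illustration only, under the same assumption that `/proc/self/status` reflects the host's isolcpus/tuned-restricted CPU set; `is_nova_container` and `repin_all_but_nova` are hypothetical helper names, not part of the original command:

```shell
# Hypothetical helper: decide whether a container should keep full CPU
# access (nova containers must, so libvirt can manage vCPU cgroups).
is_nova_container() {
  case "$1" in
    nova*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Sketch of the one-liner above: pin every non-nova container to the
# CPUs the host allows for ordinary tasks.
repin_all_but_nova() {
  allowed=$(awk '/Cpus_allowed_list/ {print $2}' /proc/self/status)
  for id in $(docker ps -q); do
    # docker inspect prints the name with a leading slash, e.g. "/nova_compute"
    name=$(docker inspect -f '{{.Name}}' "$id" | sed 's|^/||')
    is_nova_container "$name" && continue
    docker update --cpuset-cpus="$allowed" "$id"
  done
}
```

The exclusion mirrors the `--filter='name=nova'` term of the original pipeline, just made explicit per container.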
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0760

For the record, the initial implementation created a regression for PPC; see https://bugzilla.redhat.com/show_bug.cgi?id=1813091.
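As an aside on the verification output above: `/proc` reports the allowed set in kernel cpu-list notation (`0-1`), while `docker inspect` reports `CpusetCpus` as a comma list (`0,1`); both describe the same CPUs, and `docker update --cpuset-cpus` accepts either form. A small hypothetical helper (not from this report) that expands the kernel notation:

```shell
# Hypothetical helper: expand kernel cpu-list syntax (e.g. "0-3,8") into
# a comma list of individual CPU ids, the form docker shows in CpusetCpus.
expand_cpulist() {
  echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
    seq "$lo" "${hi:-$lo}"      # a bare "8" has no hi part, so seq 8 8
  done | paste -sd ',' -
}

expand_cpulist "0-1"     # prints: 0,1
expand_cpulist "0-3,8"   # prints: 0,1,2,3,8
```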