Bug 1813091 - z11 update fails on PPC64LE compute node due to "cpuset_cpus: all"
Summary: z11 update fails on PPC64LE compute node due to "cpuset_cpus: all"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: async
Target Release: 13.0 (Queens)
Assignee: Emilien Macchi
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-03-12 22:52 UTC by Chris Smart
Modified: 2023-09-07 22:22 UTC (History)
16 users

Fixed In Version: openstack-tripleo-heat-templates-8.4.1-51.el7ost python-paunch-2.5.3-3.el7ost
Doc Type: Bug Fix
Doc Text:
Before this update, when upgrading Red Hat OpenStack Platform (RHOSP) 13 on 64-bit PowerPC processors to the latest maintenance release, Paunch failed to create the `nova_libvirt` container, reported the following error, and caused the upgrade to fail: /usr/bin/docker-current: Error response from daemon: Requested CPUs are not available…. The RHOSP parameter `cpuset_cpus` in nova-libvirt.yaml defaulted to all CPUs. On nodes where simultaneous multithreading (SMT) was disabled, the CPUs were exposed with non-sequential IDs, and the RHOSP upgrade failed. Two changes resolve this issue: by default, Docker now determines which CPUs are available, and a new role-based parameter, `ContainerCpusetCpus`, lets you override that behavior. For more information, see https://access.redhat.com/solutions/4917021.
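A minimal sketch of that override, assuming the `ComputePPC64LE` role name used in this report; the parameter name comes from the linked fix and the CPU list from the comments below:

# Hypothetical environment file, passed with -e on the next deploy/update prepare:
cat > ppc64le-cpuset.yaml <<'EOF'
parameter_defaults:
  ComputePPC64LEParameters:
    ContainerCpusetCpus: "0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152"
EOF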
Clone Of:
Environment:
Last Closed: 2020-04-02 10:05:44 UTC
Target Upstream Version:
Embargoed:
emacchi: needinfo+




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 713120 0 None MERGED Don't set cpuset_cpus if empty 2020-12-17 14:14:24 UTC
OpenStack gerrit 713121 0 None MERGED [TRAIN and before] Introduce ContainerCpusetCpus 2020-12-17 14:14:24 UTC
OpenStack gerrit 713932 0 None MERGED Do not set cpuset-cpus if cconfig['cpuset_cpus'] == 'all' 2020-12-17 14:14:54 UTC
OpenStack gerrit 713991 0 None MERGED Do not set cpuset-cpus if cconfig['cpuset_cpus'] == 'all' 2020-12-17 14:14:24 UTC
Red Hat Knowledge Base (Solution) 4917021 0 None None None 2020-03-19 22:53:34 UTC
Red Hat Product Errata RHBA-2020:1297 0 None None None 2020-04-02 10:05:50 UTC

Description Chris Smart 2020-03-12 22:52:45 UTC
Description of problem:

When upgrading RHOSP13 cloud to z11 release, the following step fails:

`openstack overcloud update run --nodes ComputePPC64LE`

This appears to be related to the nova_libvirt container, which is throwing an error related to CPUs:

         "stderr: /usr/bin/docker-current: Error response from daemon:
Requested CPUs are not available - requested 0-19, available:
0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152.


Version-Release number of selected component (if applicable):

openstack-tripleo-heat-templates-8.4.1-42.el7ost

How reproducible:
Always

Steps to Reproduce:
1. Update existing PPC64LE compute node to z11

Actual results:

Update fails with the error above and nova_libvirt container is not running.

Expected results:

The nova_libvirt container should start and the update should complete successfully.

Additional info:

docker/services/nova-libvirt.yaml in openstack-tripleo-heat-templates-8.4.1-42.el7ost appears to have added a new cpuset_cpus setting for nova_libvirt with the z11 update:

          nova_libvirt:
            start_order: 1
            image: {get_param: DockerNovaLibvirtImage}
            ulimit: {get_param: ContainerNovaLibvirtUlimit}
            net: host
            pid: host
            privileged: true
            restart: always
            cpuset_cpus: all
            ...

I assume this is picked up by an Ansible module and applied as cpuset settings when starting the nova_libvirt container.

On x86 compute nodes this works fine. On a PPC compute node with 20 online CPUs, however, the tooling expands this to (0-19), while the actual online CPU IDs are (0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152).

Thus, nova_libvirt cannot start because CPUs 0-19 are not available.
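For reference, the Docker-side failure can be reproduced outside of paunch; a sketch, assuming a host where only every eighth CPU is online (the image name is a placeholder):

# Requesting a sequential cpuset fails the same way the nova_libvirt
# container does on these nodes.
sudo docker run --rm --cpuset-cpus 0-19 <any-image> true
# /usr/bin/docker-current: Error response from daemon: Requested CPUs are
# not available - requested 0-19, available: 0,8,...,152.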

Comment 4 Brendan Shephard 2020-03-13 05:16:16 UTC
The issue was worked around using this:

while true ; do sudo sed -i '/.*cpuset_cpus.*/d' /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json ; done

Moving forward, would we say that this can be removed from nova-libvirt.yaml entirely? Or is it going to cause issues for CPU pinning as per this commit?
https://review.opendev.org/#/c/686027/

Comment 12 Brendan Shephard 2020-03-17 00:44:21 UTC
Summarising the private comments here into a Public comment:

The removal of cpuset_cpus results in a failure from libvirt during live-migrations in the following form:

/var/log/containers/nova/nova-compute.log:2020-03-16 14:54:24.936 8 ERROR nova.compute.manager [instance: 5a721258-1f47-491b-ba25-a8de2f48b6ca] libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d45\x2dinstance\x2d00026e5c.scope/emulator/cpuset.cpus': Permission denied

To fix this, it was necessary to re-add cpuset_cpus to the nova-libvirt container configuration and reboot the node. Restarting Docker and rebuilding the container were not sufficient; the whole node had to be rebooted. After that, VM migrations were possible again.

At this stage, I would expect that the initial removal of this parameter from the PowerPC nodes has broken live migration between those nodes as well. So far this testing has been conducted on x86_64 nodes, since it is not possible to create the nova_libvirt container on the PPC nodes once cpuset_cpus has been added.
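A sketch of that recovery sequence, using the paunch debug syntax shown later in this report (the cpuset value here is illustrative):

# Recreate nova_libvirt with cpuset_cpus restored, then reboot the node --
# restarting Docker or rebuilding the container alone was not enough.
docker stop nova_libvirt && docker rm nova_libvirt
paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --overrides '{"cpuset_cpus": "0-19"}' --container nova_libvirt --action run
sudo reboot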

Comment 17 Emilien Macchi 2020-03-18 16:07:40 UTC
What I tested today:

1) Deploy OSP13 trunk, 3 controllers and 2 computes, with defaults
   Note: computes have 2 vcpu

2) On the computes, observe that nova_libvirt container is configured with
   "CpusetCpus": "0-1" (docker inspect) and "cpuset_cpus": "all" in Paunch
   config.

3) On the computes, run:
   paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run

   Observe that "docker inspect nova_libvirt" reports "CpusetCpus": "0-1",

4) Reboot
   Observe that "docker inspect nova_libvirt" still reports "CpusetCpus": "0-1",

5) Remove nova_libvirt container

6) Run:
   paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run
   Observe that "docker inspect nova_libvirt" now reports "CpusetCpus": "0",

7) Remove nova_libvirt container and run:
   paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --overrides '{"cpuset_cpus": "all"}' --container nova_libvirt --action run

   Observe that "docker inspect nova_libvirt" now reports "CpusetCpus": "0-1",

8) Conclusion: the container has to be removed before CpusetCpus can be reconfigured.


Note: I haven't tested live migration yet (next step).

Comment 18 Emilien Macchi 2020-03-18 16:30:56 UTC
So I tried migration (not live; I don't have shared storage), and it worked fine when I reconfigured the nova_libvirt container, without a reboot.

Here is what I did:

1) On the 2 compute nodes, run:

docker stop nova_libvirt && docker rm nova_libvirt && paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run

Verify docker container config: "CpusetCpus": "0"

2) Run VM migrate:

openstack server migrate test-vm2

3) Result:

VM was migrated to the other node and is active.


I didn't reboot or restart any service, just removed the container and applied a new config. Did I miss something besides the fact that I'm not using PPC or shared storage?

Comment 22 Brendan Shephard 2020-03-18 22:05:34 UTC
(In reply to Emilien Macchi from comment #18)
> so I tried migration (not live, I don't have shared storage) and it worked
> fine when I reconfigured the nova_libvirt container; without a reboot.
> 
> Here is what I did:
> 
> 1) On the 2 compute nodes, run:
> 
> docker stop nova_libvirt && docker rm nova_libvirt && paunch debug --file
> /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json
> --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run
> 
> Verify docker container config: "CpusetCpus": "0"
> 
> 2) Run VM migrate:
> 
> openstack server migrate test-vm2
> 
> 3) Result:
> 
> VM was migrated to the other node and is active.
> 
> 
> I didn't reboot or restart any service. Just removed the container and
> applied a new config. Did I miss something beside the fact I'm not using ppc
> and shared storage?


In our case, cpuset_cpus was removed entirely from the JSON file:
paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --container nova_libvirt --action dump-json > nova_libvirt.json

Then we just got rid of the line entirely before rebuilding the container from that json file.


So I guess it's possible, as per Sean's comment, that completely removing it was the main issue we were hitting on x86_64 nodes?
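Spelled out, the removal procedure above looks roughly like this (the sed expression is illustrative; the dump-json command is the one quoted above):

# Dump the container config, drop the cpuset_cpus line, then rebuild the
# container from the edited file.
paunch debug --file /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json --container nova_libvirt --action dump-json > nova_libvirt.json
sed -i '/cpuset_cpus/d' nova_libvirt.json
docker stop nova_libvirt && docker rm nova_libvirt
paunch debug --file nova_libvirt.json --container nova_libvirt --action run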

Comment 23 Chris Smart 2020-03-18 23:34:45 UTC
On my PPC nodes, when cpuset_cpus is set to "all" it fails.

(In reply to Emilien Macchi from comment #17)
> What I tested today:
> 
> 1) Deploy OSP13 trunk, 3 controllers and 2 computes, with defaults
>    Note: computes have 2 vcpu
> 
> 2) On the computes, observe that nova_libvirt container is configured with
>    "CpusetCpus": "0-1" (docker inspect) and "cpuset_cpus": "all" in Paunch
>    config.
> 
> 3) On the computes, run:
>    paunch debug --file
> /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json
> --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run
> 
>    Observe that "docker inspect nova_libvirt" reports "CpusetCpus": "0-1",
> 
> 4) Reboot
>    Observe that "docker inspect nova_libvirt" still reports "CpusetCpus":
> "0-1",
> 
> 5) Remove nova_livirt container
> 
> 6) Run:
>    paunch debug --file
> /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json
> --overrides '{"cpuset_cpus": "0"}' --container nova_libvirt --action run
>    Observe that "docker inspect nova_libvirt" now reports "CpusetCpus": "0",
> 
> 7) Remove nova_livirt container and run:
>    paunch debug --file
> /var/lib/tripleo-config/hashed-docker-container-startup-config-step_3.json
> --overrides '{"cpuset_cpus": "all"}' --container nova_libvirt --action run
> 
>    Observe that "docker inspect nova_libvirt" now reports "CpusetCpus":
> "0-1",
> 
> 8) Conclusion: container has to be removed before reconfiguring CpusetCpus.
> 

Thanks Emilien. I was stopping and deleting the existing nova_libvirt container and then re-creating it with paunch from a modified json file. Doing that worked, as you've found.

However, the problem isn't so much that the change didn't stick; the problem is that when cpuset_cpus is set to "all" the container will not start (the error in the bz above).

My PPC nodes do have 20 *online* CPUs. When cpuset_cpus is set to "all", nova_libvirt tries to use the 20 cores but hardcodes them to a sequential "0-19". This causes nova_libvirt to fail, because the online CPUs on my PPC node aren't sequential: PPC exposes 8 threads per core and runs with SMT off, so only every eighth CPU ID is online.

On a PPC compute node with 20 CPUs, nova_libvirt tries to use (0-19), while the actual online set is more like (0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152).

To be clear, if I set this, it works:
"cpuset_cpus": "0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152"

I note that in your testing the container *is* starting when you have this set to "all", which is interesting and means you can't replicate the problem yet. I assume your ppc node is set to SMT off?

I.e.:
  $ sudo ppc64_cpu --smt
  SMT is off

Also, do you mind posting the results of lscpu?

Here's mine:

$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152
Off-line CPU(s) list:  1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127,129-135,137-143,145-151,153-159
Thread(s) per core:    1
Core(s) per socket:    5
Socket(s):             4
NUMA node(s):          4
Model:                 2.1 (pvr 004b 0201)
Model name:            POWER8E (raw), altivec supported
CPU max MHz:           3690.0000
CPU min MHz:           2061.0000
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0,8,16,24,32
NUMA node1 CPU(s):     40,48,56,64,72
NUMA node16 CPU(s):    80,88,96,104,112
NUMA node17 CPU(s):    120,128,136,144,152

Thanks!

Comment 25 Emilien Macchi 2020-03-19 12:15:27 UTC
(In reply to Chris Smart from comment #23)
(...) 
> I note that in your testing, the container *is* starting when you have this
> set to "all", which is interesting and means you can't replicate the problem
> yet. I assume your ppc node is set to to SMT off?

I didn't test on PPC but on x86_64. I don't have access to such hardware unfortunately.

Comment 26 Emilien Macchi 2020-03-19 12:51:56 UTC
Chris, please run "lscpu -p=cpu" and report back the result.

Comment 37 Chris Smart 2020-03-20 03:34:02 UTC
(In reply to Emilien Macchi from comment #26)
> Chris, please run "lscpu -p=cpu" and report back the result.

[heat-admin@compute-dev-822l-0 ~]$ lscpu -p=cpu
# The following is the parsable format, which can be fed to other
# programs. Each different item in every column has an unique ID
# starting from zero.
# CPU
0
8
16
24
32
40
48
56
64
72
80
88
96
104
112
120
128
136
144
152

Comment 38 Chris Smart 2020-03-20 03:34:31 UTC
(In reply to Emilien Macchi from comment #25)
> (In reply to Chris Smart from comment #23)
> (...) 
> > I note that in your testing, the container *is* starting when you have this
> > set to "all", which is interesting and means you can't replicate the problem
> > yet. I assume your ppc node is set to to SMT off?
> 
> I didn't test on PPC but on x86_64. I don't have access to such hardware
> unfortunately.

OK yeah, this is only a problem on PPC :-S

Comment 55 David Rosenfeld 2020-03-31 14:03:08 UTC
Thanks to Tony Breed for helping the DF team verify:

Hey David,
     I built a small cloud with 1 director (VM), 1 controller (VM) [both x86_64] and one ppc64le baremetal node. I deployed RHOS-13 from the z11 compose. The deploy failed because the ppc64le node had its secondary threads disabled and docker refused to start the container (I attached the failure to the bug a few days ago). Then yesterday I rebuilt the whole setup with the latest compose and the deploy ran to completion.


I know that the customer had also seen related issues running instances on the deployed cloud, so I launched one and verified that it booted and was able to talk to the 'net.

I did not run tempest against the deployed overcloud as the primary bug was related to deploy/upgrade.

Tony.

Comment 65 errata-xmlrpc 2020-04-02 10:05:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1297

