Bug 1174125 - Unable to write cpuset.cpus: Permission denied
Summary: Unable to write cpuset.cpus: Permission denied
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Assignee: Martin Kletzander
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-12-15 08:30 UTC by Jincheng Miao
Modified: 2015-08-07 10:25 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-07 10:25:33 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd debug log (537.22 KB, text/plain)
2014-12-15 08:31 UTC, Jincheng Miao
success for vcpu auto placement and vcpupin (422.50 KB, text/plain)
2014-12-17 09:49 UTC, Jincheng Miao
failure for vcpu auto placement and vcpupin (348.19 KB, text/plain)
2014-12-17 09:49 UTC, Jincheng Miao
libvirtd log on ppc64le (297.19 KB, text/plain)
2015-07-01 02:21 UTC, Wayne Sun

Description Jincheng Miao 2014-12-15 08:30:40 UTC
Description of problem:
When a guest is configured with vcpupin and memory placement='auto', starting the guest fails:
error: Failed to start domain r6
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dr6.scope/vcpu0/cpuset.cpus': Permission denied

Version-Release number of selected component (if applicable):
libvirt-1.2.8-9.el7.x86_64
qemu-kvm-rhev-2.1.2-6.el7.x86_64
kernel-3.10.0-210.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. configure guest CPU NUMA and vcpupin
  <vcpu placement='auto'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-3'/>
    <vcpupin vcpu='1' cpuset='0-3'/>
    <vcpupin vcpu='2' cpuset='0-3'/>
    <vcpupin vcpu='3' cpuset='0-3'/>
  </cputune>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>


2. start guest
# virsh start r6
error: Failed to start domain r6
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dr6.scope/vcpu0/cpuset.cpus': Permission denied


Actual results:
The guest fails to start.

Expected results:
The guest starts successfully.

Additional info:
The attached file is the libvirtd debug log.

Comment 1 Jincheng Miao 2014-12-15 08:31:11 UTC
Created attachment 968824 [details]
libvirtd debug log

Comment 2 Jincheng Miao 2014-12-17 09:48:13 UTC
Because the vcpu placement is 'auto', the cpuset is set according to numad's recommendation. But vcpupin is also specified, and if the pinned CPUs are not included in the result numad returned, "cpuset.cpus: Permission denied" will happen.

So should we add a clearer error message, or some documentation, to explain this to confused users?

I will post two logs: one for the success case and one for the failure case.
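
For reference, the underlying cgroup rule can be demonstrated directly (a minimal sketch, assuming a cgroup v1 cpuset mounted at /sys/fs/cgroup/cpuset, a host with at least 8 CPUs, and a throwaway group named "demo"; the group name and CPU ranges are only illustrative):

# mkdir /sys/fs/cgroup/cpuset/demo
# echo 4-7 > /sys/fs/cgroup/cpuset/demo/cpuset.cpus
# mkdir /sys/fs/cgroup/cpuset/demo/child
# echo 0-3 > /sys/fs/cgroup/cpuset/demo/child/cpuset.cpus

The last write should be rejected with "Permission denied": a child group's cpuset.cpus must be a subset of its parent's, which is exactly what libvirt runs into when the <vcpupin> CPUs fall outside the set numad returned for the machine scope.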

Comment 3 Jincheng Miao 2014-12-17 09:49:12 UTC
Created attachment 970002 [details]
success for vcpu auto placement and vcpupin

Comment 4 Jincheng Miao 2014-12-17 09:49:44 UTC
Created attachment 970003 [details]
failure for vcpu auto placement and vcpupin

Comment 8 Wayne Sun 2015-06-30 08:33:17 UTC
I also met this problem on ppc64le.

pkgs:
libvirt-1.2.16-1.el7.ppc64le
qemu-kvm-rhev-2.3.0-5.el7.ppc64le
kernel-3.10.0-282.el7.ppc64le

steps: 
1.
# virsh dumpxml virt-tests-debug-clone
...
  <vcpu placement='auto'>5</vcpu>
  <cputune>
    <shares>262144</shares>
  </cputune>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...

2. start domain
# virsh start virt-tests-debug-clone
error: Failed to start domain virt-tests-debug-clone
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvirt\x2dtests\x2ddebug\x2dclone.scope/emulator/cpuset.cpus': Permission denied

It's not 100% reproducible, though.

Comment 9 Martin Kletzander 2015-06-30 09:56:29 UTC
(In reply to Wayne Sun from comment #8)
Could you post the debug logs for that, too?  It does look like a slightly different issue.  It would be nice to attach gdb to libvirtd, break right after this error, and run 'head /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvirt\x2dtests\x2ddebug\x2dclone.scope/cpuset.cpus /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvirt\x2dtests\x2ddebug\x2dclone.scope/*/cpuset.cpus' so we know what's happening, and also what value libvirt was trying to write there.  But debug logs will be fine if you don't want to debug the process.

Comment 10 Wayne Sun 2015-07-01 02:21:47 UTC
Created attachment 1044853 [details]
libvirtd log on ppc64le

Attached the libvirtd log

Comment 11 Martin Kletzander 2015-07-01 06:32:44 UTC
The problem is that you are using automatic placement and the node numad offers (0) is incompatible with the values of the parent cpuset.cpus.  You have a weird setting there anyway; could you post the output of the following command, please?

head /sys/fs/cgroup/cpuset/{,machine.slice/}cpuset.{mems,cpus}
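
A quick way to check the incompatibility itself (a sketch; "node0" stands for whatever node numad recommended, and the paths assume the standard cgroup v1 cpuset mount):

# cat /sys/devices/system/node/node0/cpulist
# cat /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus

If the node's cpulist is not a subset of machine.slice's cpuset.cpus, writes to the per-vcpu groups below it will be rejected with "Permission denied".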

Comment 12 Wayne Sun 2015-07-01 06:59:15 UTC
I failed to break in gdb exactly at the error, so it took a little time; here is the value that fails to be set in the emulator's cpuset.cpus:

Breakpoint 1, virCgroupSetValueStr (group=<optimized out>, controller=<optimized out>, key=<optimized out>, value=0x3fff5c005730 "0,8,16,24,32") at util/vircgroup.c:739
739	    ret = 0;
(gdb) 
Continuing.

"0,8,16,24,32" is from node 0, it failed at here.


The values under the scope, and inherited by the directories beneath it:
# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dvirt\\x2dtests\\x2ddebug\\x2dclone.scope/cpuset.cpus /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dvirt\\x2dtests\\x2ddebug\\x2dclone.scope/*/cpuset.cpus
24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152
24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152


I don't know why "0,8,16" are missing.

Checking on the host:
# lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152
Off-line CPU(s) list:  1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127,129-135,137-143,145-151,153-159
Thread(s) per core:    1
Core(s) per socket:    5
Socket(s):             4
NUMA node(s):          4
Model:                 8247-22L
L1d cache:             64K
L1i cache:             32K
L2 cache:              512K
L3 cache:              8192K
NUMA node0 CPU(s):     0,8,16,24,32
NUMA node1 CPU(s):     40,48,56,64,72
NUMA node16 CPU(s):    80,88,96,104,112
NUMA node17 CPU(s):    120,128,136,144,152

Comment 13 Wayne Sun 2015-07-01 07:08:07 UTC
So after figuring out that the problem might be with node 0, I changed the domain XML to:
# virsh dumpxml virt-tests-debug-clone
...
  <vcpu placement='static' cpuset='0,8,16,24,32'>2</vcpu>
...

# virsh start virt-tests-debug-clone
error: Failed to start domain virt-tests-debug-clone
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvirt\x2dtests\x2ddebug\x2dclone.scope/emulator/cpuset.cpus': Permission denied

Breakpoint 1, virCgroupSetValueStr (group=<optimized out>, controller=<optimized out>, key=<optimized out>, value=0x3fff4c00aa80 "0,8,16,24,32") at util/vircgroup.c:739
739	    ret = 0;
(gdb) c
Continuing.

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dvirt\\x2dtests\\x2ddebug\\x2dclone.scope/cpuset.cpus /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dvirt\\x2dtests\\x2ddebug\\x2dclone.scope/*/cpuset.cpus
24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152
24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152

So this confirms that it is indeed caused by node 0.

Comment 14 Wayne Sun 2015-07-01 07:10:00 UTC
# head /sys/fs/cgroup/cpuset/{,machine.slice/}cpuset.{mems,cpus}
==> /sys/fs/cgroup/cpuset/cpuset.mems <==
0-1,16-17

==> /sys/fs/cgroup/cpuset/cpuset.cpus <==
0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.mems <==
0-1,16-17

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus <==
24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152

Comment 15 Martin Kletzander 2015-08-06 18:25:14 UTC
I was searching through the code and found out that the first problem here is most probably fixed by v1.2.14-2-gc9f9fa25d3a2 and v1.2.14-4-gf0fa9080d47b:

commit c9f9fa25d3a26629e0b3c7ce3139ee2f7c47f127
Author: Peter Krempa <pkrempa>
Date:   Fri Mar 27 10:11:00 2015 +0100

    qemu: cgroup: Store auto cpuset instead of re-creating it on demand

commit f0fa9080d47b7aedad6f4884b8879d88688752a6
Author: Peter Krempa <pkrempa>
Date:   Fri Mar 27 10:23:19 2015 +0100

    qemu: cgroup: Properly set up vcpu pinning

Still have to look at the issue Wayne is having in comment #12.  Could you make sure this is reproducible with current libvirt (1.2.17-3, I guess)?  Also please check the issue you had in comment #12, not the test in comment #13.  If you need further assistance, let me know.  I cannot reproduce the issue.

Comment 16 Wayne Sun 2015-08-07 08:04:10 UTC
(In reply to Martin Kletzander from comment #15)

> 
> Still have to look at the issue Wayne is having in comment #12.  Could you
> make sure this is reproducible with current libvirt (1.2.17-3, I guess)? 
> Also please check the issue you had in comment #12, not the test in comment
> #13.  If you need further assistance, let me know.  I cannot reproduce the
> issue.

I think the issue in comment #12 is what comment #11 described: a value incompatible with the parent cpuset.cpus, not anything numad-related, as the test in comment #13 confirmed.

The incompatible value can be produced simply as follows:

1. check on a host without vm started
# rpm -q libvirt kernel
libvirt-1.2.17-3.el7.x86_64
kernel-3.10.0-302.el7.x86_64

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 65514 MB
node 0 free: 60859 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 65536 MB
node 1 free: 61649 MB
node distances:
node   0   1 
  0:  10  11 
  1:  11  10 


# head /sys/fs/cgroup/cpuset/{,machine.slice/}cpuset.{mems,cpus}
==> /sys/fs/cgroup/cpuset/cpuset.mems <==
0-1

==> /sys/fs/cgroup/cpuset/cpuset.cpus <==
0-15
head: cannot open ‘/sys/fs/cgroup/cpuset/machine.slice/cpuset.mems’ for reading: No such file or directory
head: cannot open ‘/sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus’ for reading: No such file or directory

2. start a vm
# virsh start virt-tests-vm1
Domain virt-tests-vm1 started

# head /sys/fs/cgroup/cpuset/{,machine.slice/}cpuset.{mems,cpus}
==> /sys/fs/cgroup/cpuset/cpuset.mems <==
0-1

==> /sys/fs/cgroup/cpuset/cpuset.cpus <==
0-15

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.mems <==
0-1

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus <==
0-15

3. destroy the vm
# virsh destroy virt-tests-vm1
Domain virt-tests-vm1 destroyed

4. set small set of cpu in cpuset.cpus
# echo 3-15 > /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus

# head /sys/fs/cgroup/cpuset/{,machine.slice/}cpuset.{mems,cpus}
==> /sys/fs/cgroup/cpuset/cpuset.mems <==
0-1

==> /sys/fs/cgroup/cpuset/cpuset.cpus <==
0-15

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.mems <==
0-1

==> /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus <==
3-15

5. start vm with cpuset config

# virsh dumpxml virt-tests-vm1
...
  <vcpu placement='static' cpuset='0-7'>2</vcpu>
...

# virsh start virt-tests-vm1
error: Failed to start domain virt-tests-vm1
error: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dvirt\x2dtests\x2dvm1.scope/emulator/cpuset.cpus': Permission denied

So an incompatible value caused the failure.

After resetting the cpuset.cpus value with:
# echo 0-15 > /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus

then start the vm:
# virsh start virt-tests-vm1
Domain virt-tests-vm1 started

So I don't think there is a problem here.
Still, it would be better if libvirt could tweak the value itself to resolve the incompatibility.

Martin, what do you think?

Comment 17 Martin Kletzander 2015-08-07 08:12:45 UTC
The difference is that if you have only CPUs 3-15 available and you explicitly request CPUs 0-7 to be used, then we error out.  That makes sense.  The problem here is that when libvirt recalculates NUMA nodes into CPUs, it has to omit offlined CPUs (if you offline a CPU, that is how it disappears from cpuset.cpus).  So I tested it like this:

I moved /usr/bin/numad to /usr/bin/numad_, created wrapper script in /usr/bin/numad that looks like this for example:

#!/usr/bin/env python

import os
import sys

# If this is libvirt's placement query ("numad -w <ncpus>:<memory>"),
# always recommend node 1 so the advice is predictable.
if len(sys.argv) == 3 and sys.argv[1] == '-w':
    print(1)
    sys.exit(0)

# Pass any other invocation through to the real numad binary.
os.execv('/usr/bin/numad_', sys.argv)
sys.exit(1)


And then offlined one cpu in numa node 1, and tried to start a domain with vcpu placement='auto'.
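
(For completeness, a sketch of installing that wrapper as described above; the numad_ name comes from the comment, and the wrapper has to be executable for libvirt to run it:)

# mv /usr/bin/numad /usr/bin/numad_
# vi /usr/bin/numad      (paste the Python script above)
# chmod +x /usr/bin/numad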

Comment 18 Wayne Sun 2015-08-07 08:28:56 UTC
(In reply to Martin Kletzander from comment #17)

OK, I used the script for numad and offlined cpu8 in node 1:

# cat /sys/devices/system/cpu/cpu8/online 
1

# echo 0 > /sys/devices/system/cpu/cpu8/online 

# virsh dumpxml virt-tests-vm1
...
  <vcpu placement='auto'>2</vcpu>
...

# virsh start virt-tests-vm1
error: Failed to start domain virt-tests-vm1
error: Invalid value '8-15' for 'cpuset.cpus': Invalid argument

So libvirt did not exclude the offline CPU, and it is supposed to, right?

Comment 19 Martin Kletzander 2015-08-07 08:32:57 UTC
Well, yes, but now the problem is 'Invalid argument' and not 'Permission denied'.

Comment 20 Wayne Sun 2015-08-07 10:07:27 UTC
Filed a new bug for tracking the problem in comment #18
Bug 1251445 - Fail to start vm with placement auto after offline certain cpu without restart libvirtd 
https://bugzilla.redhat.com/show_bug.cgi?id=1251445

Comment 21 Martin Kletzander 2015-08-07 10:25:33 UTC
So I went through the problem once again from the start.  The original problem reported is not a big deal and, more importantly, is not a bug.  The thing is that we properly error out because the whole specified cpuset (0-3) is not usable, as some of those CPUs are not available (not in the system, offline, or simply removed from the cgroup's cpuset.cpus).  Thus I'm closing this as NOTABUG.

All the other issues reported in this BZ have various root causes, but they are not related to this BZ's original problem. Those issues are described in Bug 1251445.

If you encounter a similar problem, do not report it in this BZ; instead, search for an existing bug or create a new one. Thanks.

