Red Hat Bugzilla – Bug 1284087
ps showing the wrong cgroup/cpuset for threads
Last modified: 2016-11-04 02:36:46 EDT
Description of problem:
ps shows all threads of a process in the same cgroup/cpuset, even when they are in different cgroups/cpusets.

Version-Release number of selected component (if applicable):
procps-ng-3.3.10-3.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
On a RHEL7 minimal install (I am using auditd just as an example of a multi-threaded app).

1. Create a cpuset:
# cd /sys/fs/cgroup/cpuset
# mkdir app
# cd app/
# echo 0 > cpuset.cpus
# echo 0 > cpuset.mems

3. Select a multi-threaded process (e.g. auditd):
# ps -eLo pid,lwp,comm,cgroup | grep audit
  108   108 kauditd  -
  596   596 auditd   1:name=systemd:/system.slice/auditd.service
  596   602 auditd   1:name=systemd:/system.slice/auditd.service

4. Move the thread whose lwp equals the pid into the new cgroup/cpuset:
# echo 596 > tasks
# cat tasks
596

5. List the threads along with their cgroup/cpuset:
# ps -eLo pid,lwp,comm,cgroup | grep audit
  108   108 kauditd  -
  596   596 auditd   2:cpuset:/app,1:name=systemd:/system.slice/auditd.service
  596   602 auditd   2:cpuset:/app,1:name=systemd:/system.slice/auditd.service

Actual results:
ps shows all threads in the same cgroup/cpuset, although only thread 596 was moved to /app.

Expected results:
ps shows the correct cgroup/cpuset for each thread.

Additional info:
Stracing ps: it opens the /proc/PID/ directory and checks the status of the process.
stat("/proc/596", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/596/stat", O_RDONLY) = 6
read(6, "596 (auditd) S 1 596 596 0 -1 42"..., 2048) = 359
close(6) = 0
open("/proc/596/status", O_RDONLY) = 6
read(6, "Name:\tauditd\nState:\tS (sleeping)"..., 2048) = 1039
close(6) = 0

It reads the process's cgroup file (but never the per-thread cgroup files):

open("/proc/596/cgroup", O_RDONLY) = 6
read(6, "10:hugetlb:/\n9:perf_event:/\n8:bl"..., 131072) = 159
read(6, "", 130913) = 0
close(6) = 0
openat(AT_FDCWD, "/proc/596/task", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
getdents(6, /* 4 entries */, 32768) = 96
stat("/proc/596/task/596", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/596/task/596/stat", O_RDONLY) = 7
read(7, "596 (auditd) S 1 596 596 0 -1 42"..., 1024) = 359
close(7) = 0
open("/proc/596/task/596/status", O_RDONLY) = 7
read(7, "Name:\tauditd\nState:\tS (sleeping)"..., 1024) = 1024
read(7, "t_switches:\t15\n", 1024) = 15
close(7) = 0
write(1, " 596 596 auditd 2:cp"..., 86) = 86
stat("/proc/596/task/602", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/596/task/602/stat", O_RDONLY) = 7
read(7, "602 (auditd) S 1 596 596 0 -1 10"..., 2048) = 364
close(7) = 0
open("/proc/596/task/602/status", O_RDONLY) = 7
read(7, "Name:\tauditd\nState:\tS (sleeping)"..., 2048) = 1040
close(7) = 0
write(1, " 596 602 auditd 2:cp"..., 86) = 86
getdents(6, /* 0 entries */, 32768) = 0

Additional info 2:
I opened a similar bug report for RHEL6 (BZ1284076), but ps behaves differently there.
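The trace shows ps reading only /proc/596/cgroup. The kernel also exposes a cgroup file per thread at /proc/PID/task/TID/cgroup, which is where the per-thread membership can be checked by hand. Below is a minimal Python sketch of that check. It is not the procps-ng code; the function name thread_cgroups and the proc_root parameter are hypothetical and exist only for illustration.

```python
import os


def thread_cgroups(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to the lines of that thread's own cgroup file.

    `proc_root` is an illustrative parameter (not part of any real tool)
    that makes it possible to point the function at a fake tree.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    result = {}
    for tid in sorted(os.listdir(task_dir), key=int):
        path = os.path.join(task_dir, tid, "cgroup")
        try:
            with open(path) as f:
                result[int(tid)] = f.read().splitlines()
        except OSError:
            # The thread may have exited between listdir() and open().
            continue
    return result
```

On the reproducer system, calling thread_cgroups(596) and printing the result should list the cpuset path /app for thread 596 only, because writing a thread ID to a cgroup v1 tasks file moves just that one thread.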
This bug also reproduces on Fedora. Fedora BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1284091
This needs to be forward-ported from RHEL6. The problem here is that procps-ng-3.3.10 has cgroup support in the library and drops '/' cgroups. We need to avoid touching the library, to keep the interface compatible, and re-implement the filtering logic in ps so that it stays consistent with the recent code.
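The filtering mentioned here can be illustrated roughly as follows. This is a sketch in Python rather than the C of procps-ng, and it assumes only the behavior visible in the ps output from the description: entries whose path is '/' are omitted, the remaining entries are joined with commas, and '-' is printed when nothing remains. format_cgroups is a hypothetical name, not a procps-ng function.

```python
def format_cgroups(cgroup_text):
    """Join non-root cgroup entries with commas; return '-' if none remain.

    Assumed behavior, inferred from the ps output in this report.
    """
    kept = []
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:path".
        # Split at most twice so colons inside the path are preserved.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed or empty lines
        if parts[2] == "/":
            continue  # drop root ('/') cgroups
        kept.append(line)
    return ",".join(kept) if kept else "-"
```

Applying the same function to the contents of each /proc/PID/task/TID/cgroup file, instead of /proc/PID/cgroup, would give a per-thread column in the same format as the existing process-level one.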
devel_ack for 7.3
Created attachment 1177704
New "thcgr" option enhancement

https://bugzilla.redhat.com/attachment.cgi?id=969096 needs to be applied first, because it changes the context this patch is based on. Without it, this patch could not be added to the RPM package.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2447.html