Bug 1284076

Summary: [RFE] Support for thread cgroups
Product: Red Hat Enterprise Linux 6 Reporter: Daniel Bristot de Oliveira <daolivei>
Component: procpsAssignee: Ondrej Vasik <ovasik>
Status: CLOSED ERRATA QA Contact: Branislav Náter <bnater>
Severity: high Docs Contact: Petr Bokoc <pbokoc>
Priority: unspecified    
Version: 6.8CC: albert, bnater, daolivei, pbokoc
Target Milestone: rcKeywords: FutureFeature, Patch, RFE
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: procps-3.2.8-36.el6 Doc Type: Enhancement
Doc Text:
*ps* can now display thread cgroups This update introduces a new format specifier `thcgr`, which can be used to display the cgroup of each listed thread.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-05-11 00:29:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
procps-3.2.8-ps-thread-cgroups.patch none

Description Daniel Bristot de Oliveira 2015-11-20 17:51:30 UTC
Description of problem:

  ps is showing all threads in the same cgroup/cpuset when they are on different cgroup/cpuset.

Version-Release number of selected component (if applicable):
  procps-3.2.8-34.el6_7.x86_64

How reproducible:
  Always

Steps to Reproduce:
  On a RHEL6, with auditd installed (just to show an example):

1 Mount cgroup interface
  # mkdir /cgroup/
  # mount -t tmpfs cgroup_root /cgroup/
  # mkdir /cgroup/cpuset
  # mount -t cgroup -ocpuset cpuset /cgroup/cpuset/

2. Create a cpuset:
  # cd /cgroup/cpuset/
  # mkdir app
  # cd app/
  # echo 0 > cpuset.cpus 
  # echo 0 > cpuset.mems 

3. select a multi-threaded proc (e.g auditd):

  # ps -eLo pid,lwp,comm,cgroup | grep audit
   1025  1025 kauditd         cpuset:/
   1364  1364 auditd          cpuset:/
   1364  1365 auditd          cpuset:/

4. Insert the thread with the same pid/lwp in this new cgroup/cpuset:
  # echo 1364 > tasks 
  # cat tasks 
  1364

5. List the threads showing its cgroup/cpuset:
  # ps -eLo pid,lwp,comm,cgroup | grep audit
   1025  1025 kauditd         cpuset:/
   1364  1364 auditd          cpuset:/app
   1364  1365 auditd          cpuset:/app

Actual results:
ps shows all threads in the same cgroup/cpuset, but they are not in the same cgroup/cpuset.

Expected results:
ps showing the correct cgroup/cpuset for threads.

Additional info:

Stracing the ps, it the following:
  It opens the proc/PID/ dir and checks the status of the proc.
	stat("/proc/1364", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
	open("/proc/1364/stat", O_RDONLY)       = 5
	read(5, "1364 (auditd) S 1 1364 1364 0 -1"..., 1023) = 249
	close(5)                                = 0
	open("/proc/1364/status", O_RDONLY)     = 5
	read(5, "Name:\tauditd\nState:\tS (sleeping)"..., 1023) = 870
	close(5)                                = 0

  It checks the threads of the proc:
	open("/proc/1364/task", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
	getdents(5, /* 4 entries */, 32768)     = 96

  It checks first thread:
	stat("/proc/1364/task/1364", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
	open("/proc/1364/task/1364/stat", O_RDONLY) = 6
	read(6, "1364 (auditd) S 1 1364 1364 0 -1"..., 1023) = 249
	close(6)                                = 0
	open("/proc/1364/task/1364/status", O_RDONLY) = 6
	read(6, "Name:\tauditd\nState:\tS (sleeping)"..., 1023) = 870
	close(6)                                = 0

  Then it checks the cgroup file of the process (not the thread)
	open("/proc/1364/cgroup", O_RDONLY)     = 6
	fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
	mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5a8567f000
	read(6, "1:cpuset:/\n", 1024)           = 11
	read(6, "", 1024)                       = 0
	close(6)                                = 0
	munmap(0x7f5a8567f000, 4096)            = 0
	write(1, " 1364  1364 auditd          cpus"..., 37) = 37

  It checks first thread:
	stat("/proc/1364/task/1365", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
	open("/proc/1364/task/1365/stat", O_RDONLY) = 6
	read(6, "1365 (auditd) S 1 1364 1364 0 -1"..., 1023) = 252
	close(6)                                = 0
	open("/proc/1364/task/1365/status", O_RDONLY) = 6
	read(6, "Name:\tauditd\nState:\tS (sleeping)"..., 1023) = 868
	close(6)                                = 0

  Then it checks the cgroup file of the process (not the thread) AGAIN???
	open("/proc/1364/cgroup", O_RDONLY)     = 6
	fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
	mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5a8567f000
	read(6, "1:cpuset:/\n", 1024)           = 11
	read(6, "", 1024)                       = 0
	close(6)                                = 0
	munmap(0x7f5a8567f000, 4096)            = 0
	write(1, " 1364  1365 auditd          cpus"..., 37) = 37
	getdents(5, /* 0 entries */, 32768)     = 0

  Than, it goes to the next thread....
	stat("/proc/1398", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0


IMHO, the "open("/proc/1364/cgroup", O_RDONLY)" twice in the same file is the
problem, I think that it was inteded to open the "/proc/PID/tasks/TID/cgroup"
file, and not the "/proc/PID/cgroup" file.

Comment 2 Jaromír Cápík 2015-12-16 16:20:20 UTC
Hello Daniel.

The manual only mentions cgroups in relation with processes, not threads. That means it only considers the cgroup of the parent.
It isn't difficult to change the behaviour according to your needs, but the question is whether it is wise to do so in already released products like RHEL6 and RHEL7 as some of the procps(-ng) users might expect the current behaviour or even use custom scripts requiring the current behaviour and such change might cause a lot of troubles in the future ... more than we both think. There would have to be a new switch introduced that would change the behaviour conditionally and that also means it needs to go upstream first and be discussed in the upstream mailing list.

Comment 3 Jaromír Cápík 2015-12-17 07:52:32 UTC
I was thinking about that and we could introduce a new header for thread cgroups and export as a new field in the format specifier. That way the old behaviour would be kept intact and co-exist with the new one without conflicts.

Comment 4 Jaromír Cápík 2015-12-17 15:43:14 UTC
Created attachment 1106759 [details]
procps-3.2.8-ps-thread-cgroups.patch

This patch adds the 'thcgr' format specifier bound to the 'THCGR' header.

Results:

# ps -eLo pid,lwp,comm,cgroup | grep audit
  846   846 kauditd         cpuset:/
 1121  1121 auditd          cpuset:/app
 1121  1122 auditd          cpuset:/app

# ps -eLo pid,lwp,comm,thcgr | grep audit
  846   846 kauditd         cpuset:/
 1121  1121 auditd          cpuset:/app
 1121  1122 auditd          cpuset:/

Comment 5 Jaromír Cápík 2015-12-17 15:46:08 UTC
Hello Daniel.

Could you please test the above patch and let me know whether it suits your needs?

Thanks,
Jaromir.

Comment 6 Daniel Bristot de Oliveira 2015-12-17 18:04:29 UTC
Hi Jaromir

I agree on your opinion about not to change cgroup option, and I also agree on the proposed solution.

Patch tested and approved.

Thank you very much.

-- Daniel

Comment 13 errata-xmlrpc 2016-05-11 00:29:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0904.html