Red Hat Bugzilla – Bug 221351
Kill orphan processes
Last modified: 2013-01-09 23:09:54 EST
Description of problem:
Currently an rpm build under mock can hang if stale processes remain
running. mock(1) tries to read all the input and these orphans have their fds
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. mock --debug -r fedora-6-i386-core --no-clean rebuild
DEBUG: Executing /usr/sbin/mock-helper chroot
/var/lib/mock/fedora-6-i386-core/root /sbin/runuser - root -c "cd
/;/sbin/runuser -c 'rpmbuild --rebuild --target i386 --nodeps
With ps(1) showing:
3044 pts/7 S+ 0:00 \_ /usr/bin/python -tt /usr/bin/mock
--debug -r fedora-6-i386-core --no-clean rebuild
3415 pts/7 Z+ 0:00 \_ [sh] <defunct>
3516 ? S 0:00 sleep 24h
l-wx------ 1 jkratoch jkratoch 64 Jan 4 01:03 /proc/3516/fd/1 -> pipe:
l-wx------ 1 jkratoch jkratoch 64 Jan 4 01:03 /proc/3516/fd/2 -> pipe:
lr-x------ 1 jkratoch jkratoch 64 Jan 4 01:03 /proc/3044/fd/5 -> pipe:
and "strace -s200 -f -q -p 3044":
With the patch the output contains the informational debug line(s):
+ cd /builddir/build/BUILD
+ exit 0
mock-helper: warning: Killed -9 orphan PID 14127: sleep 24h
Created attachment 144756 [details]
Fix implementing "mock-helper orphanskill <chrootdir>".
Created attachment 144757 [details]
Trivial .src.rpm for reproducing the bug.
The relevant content is the .spec part:
sleep 24h &
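To illustrate why a backgrounded `sleep` like the one above can stall a reader, here is a minimal Python sketch. It is not mock's code: the function name `read_until_eof` and the short sleep duration are made up for the example. It only assumes a POSIX `sh` is available. The shell exits at once, but the read on its stdout pipe sees EOF only after the backgrounded process, which inherited the pipe, exits too.

```python
import subprocess
import time


def read_until_eof(background_seconds):
    """Run a shell that backgrounds a sleep and exits, then read its stdout to EOF.

    Returns (seconds until the shell exited, seconds until EOF was seen).
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        ["sh", "-c", "sleep %d & exit 0" % background_seconds],
        stdout=subprocess.PIPE,
    )
    proc.wait()            # the direct child (the shell) exits almost immediately
    shell_done = time.monotonic() - start
    proc.stdout.read()     # blocks: the orphaned sleep still holds the pipe open
    eof_seen = time.monotonic() - start
    return shell_done, eof_seen


if __name__ == "__main__":
    shell_done, eof_seen = read_until_eof(2)
    print("shell exited after %.1fs, EOF after %.1fs" % (shell_done, eof_seen))
```

With `sleep 24h` in place of the short sleep, the read would block for a day, which matches the hang described in this report.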
Created attachment 144758 [details]
Fix implementing "mock-helper orphanskill <chrootdir>". (update)
As a workaround, is the test disabled?
It affects various testcases, not sure about their count:
29126 ? T 0:10
6968 ? T 0:00
And in many cases I can't even figure out which testcases caused it:
28906 ? T 0:00
28907 ? Z 0:00 \_ [gdb] <defunct>
29194 ? T 0:00
29195 ? Z 0:00 \_ [gdb] <defunct>
That `bt-clone-stop' testcase is mine and the testcase source is perfectly valid
so there must be a bug in the testsuite framework.
Still the testcases just spawn various asynchronous processes by:
set testpid [eval exec $binfile &]
And the TCL/expect/testsuite has no clue which processes were forked asynchronously.
Extending mock-helper is something we've been resisting fairly strenuously,
since it's a setuid root program and has the potential to be a cracker's attack
vector. I have mixed feelings about adding it, since I see its utility in the
GDB testsuite case, but I'm not sure that it's a generally useful command. I'll
take it up with my co-maintainers and see what they think.
Just so I understand the intent of the patch, the orphanskill command to
mock-helper processes all the task entries in /proc, finds any task with a
"root" link that matches the current chroot, and sends a kill(pid, SIGKILL) to
that task. Is this correct?
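The behaviour described in this question can be sketched in Python as follows. This is only an illustration of the idea, not the attached patch (which extends the mock-helper program): the function names `find_chroot_pids` and `kill_orphans`, the `proc_dir` and `kill` parameters, and the refusal to treat / as a chroot are all assumptions added for the example. Reading another user's /proc/<pid>/root link normally requires root privileges.

```python
import os
import signal


def find_chroot_pids(chroot_dir, proc_dir="/proc"):
    """Return the PIDs whose /proc/<pid>/root link points at chroot_dir."""
    target = os.path.realpath(chroot_dir)
    pids = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            root = os.readlink(os.path.join(proc_dir, entry, "root"))
        except OSError:
            continue  # process vanished, or we lack permission to inspect it
        if root == target:
            pids.append(int(entry))
    return sorted(pids)


def kill_orphans(chroot_dir, proc_dir="/proc", kill=os.kill):
    """Send SIGKILL to every process rooted in chroot_dir; return the PIDs."""
    if os.path.realpath(chroot_dir) == "/":
        # Safety guard (an assumption of this sketch): "/" matches nearly everything.
        raise ValueError("refusing to treat / as a chroot")
    killed = []
    for pid in find_chroot_pids(chroot_dir, proc_dir):
        if pid == os.getpid():
            continue  # never kill ourselves
        kill(pid, signal.SIGKILL)
        killed.append(pid)
    return killed
```

The `kill` parameter exists only so the sketch can be exercised without sending real signals, for example by passing a function that records its arguments.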
You are right regarding the `orphanskill' command functionality.
I agree it is a testsuite bug; no build should leave stale processes behind.
In the GDB case the testsuite is just too big to fix in general: one would
have to review all the 400 testcases / 4MB of sources there. Due to the cost of
that effort, Red Hat + upstream decided not to review+fix the testsuite.
Still I believe it is a clearly detectable build failure if the direct child
process dies while stale processes still exist. A separate decision is whether
those processes should be silently killed or the build aborted as a failed one.
It is not easy to write such an `orphanskill' command outside of mock, as the
spawned processes can and do change everything that would identify them, making
them undetectable without root privileges (they call setsid(), they create new
ptys, and dying parents leave their children reparented to init(8)). Killing an
unrelated user's process would be a pity.
Fortunately it is NO LONGER A BLOCKER FOR ME as I wrote a workaround today - it
cannot kill all the orphan processes but it kills those causing the mock hang
(using open mock fd).
It helps builds outside of mock but it still may not (does not? I am not sure
right now) kill all the stale processes.
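The workaround itself is not shown in this report. As a rough sketch of the idea only (finding processes that still hold a given pipe open), something like the following Python could be used. The function name `pids_holding`, the `proc_dir` parameter and the pipe inode number in the usage note are hypothetical. On Linux a pipe appears in /proc/<pid>/fd as a link whose target reads `pipe:[<inode>]`.

```python
import os


def pids_holding(link_target, proc_dir="/proc"):
    """Return the PIDs that have at least one fd whose link reads link_target."""
    holders = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_dir, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process vanished, or its fds are not readable by us
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == link_target:
                    holders.append(int(entry))
                    break  # one matching fd is enough for this PID
            except OSError:
                continue  # fd was closed while we were looking
    return sorted(holders)
```

For example, `pids_holding("pipe:[4242]")` (an invented inode number) would list every visible process with that pipe open, including the reader itself, so a caller would have to exclude its own PID before sending any signal.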
I'm glad you have a workaround and I apologize for taking so long to address it.
I've sent a message to the fedora-buildsys-list asking if anyone else could make
use of the orphanskill functionality.
We'll have a big argument and then decide if it's general purpose enough for
mock or not. Film at 11. :)
orphanskill logic was added in the Great Mock Rewrite done by Michael. Closing