Red Hat Bugzilla – Bug 417521
bad getpid(): race from fork-like clone() to updating %gs:PID cache
Last modified: 2008-07-29 02:29:01 EDT
Description of problem: The value returned by getpid() is wrong if called from a
signal handler when the signal is delivered to the child immediately after the
"int $0x80" for __NR_clone and the flags parameter to clone() does not contain
CLONE_VM. This is a fork()-like clone(), which gets a new pid. However, glibc
forgot to poison the %gs:PID cache before the __NR_clone, and does not update
the cache in the child until several instructions later. If the signal is
delivered in the meantime, then the code at getpid does not realize that the
cached value is incorrect.
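The stale-cache window can be illustrated with a small model. This is only a sketch: `struct pid_cache`, `model_getpid`, and `child_handler_sees` are hypothetical names, the value 0 is assumed to mean "poisoned", and none of this is glibc's actual code. It only mimics the behaviour described above, where a fork-like clone copies the parent's cached value into the child.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical model of a per-thread pid cache; NOT glibc's real layout. */
struct pid_cache {
    pid_t cached;   /* assumed convention: 0 means "poisoned, ask the kernel" */
};

/* Model of a cached getpid(). kernel_pid stands in for the syscall result. */
static pid_t model_getpid(struct pid_cache *c, pid_t kernel_pid)
{
    assert(kernel_pid > 0);
    if (c->cached != 0)
        return c->cached;       /* trusts the cache, even when it is stale */
    c->cached = kernel_pid;     /* poisoned: re-validate from the "kernel" */
    return c->cached;
}

/* What a signal handler in the child would see if it ran right after a
 * fork-like clone, before the child has refreshed its copy of the cache. */
static pid_t child_handler_sees(pid_t parent_pid, pid_t child_pid,
                                int poison_first)
{
    struct pid_cache parent = { parent_pid };
    if (poison_first)
        parent.cached = 0;             /* poison before the clone */
    struct pid_cache child = parent;   /* no CLONE_VM: memory is copied */
    return model_getpid(&child, child_pid);
}
```

In this model, `child_handler_sees(100, 200, 0)` yields the parent's pid (the reported bug), while `child_handler_sees(100, 200, 1)` yields the child's pid.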
Section 2.4.3 of the Single UNIX Specification says that getpid is
async-signal-safe, and therefore a signal handler may call getpid() legally.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. establish a signal handler
2. call clone() with flags that omit CLONE_VM [thus, a fork-like clone() whose
child gets a new pid]
3. deliver the signal to the child immediately after __NR_clone and before
%gs:PID gets updated
4. call getpid() from the signal handler
Actual results: The call to getpid() from the signal handler returns the pid of
the parent.
Expected results: The call to getpid() from the signal handler returns the pid
of the child.
Additional info: One fix is to poison the cache before calling "int $0x80" for
a fork-like __NR_clone.
The suggested fix in the original report also has a race between the poisoning
and the __NR_clone. If a signal handler in the parent calls getpid() in that
interval, then the %gs:PID cache will be re-validated, leaving the child
vulnerable to the original race. So if poisoning is used then it must be done
with all signals blocked, and both the parent and child must restore the
previous signal state after __NR_clone (and the child must re-cache %gs:PID).
It would be safest if the pid cache were maintained by the same code that
changes the pid: namely, the operating system kernel itself. Put the pid cache
on a data page as part of the VDSO, and let the kernel maintain it.
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
The VDSO is the same for all processes, if I understand it right. It can contain
changing data, e.g. the current time, but this data should be the same for all
processes. It seems that the PID can't be there.
Why does glibc cache the PID anyway? The only reason can be some applications
which call getpid() thousands of times per second.
> It would be safest if the pid cache were maintained by the same code that
> changes the pid: namely, the operating system kernel itself.
The kernel does maintain the PID. You access it with the getpid() syscall. I
distinctly remember Linus saying that a libc-level PID cache is, eh, let's say
"not so clever" (he was more direct) and "useful mostly for benchmark cheating".
I personally don't feel strongly either way, just pointing out that implementing
a kernel-side cache is unlikely to be welcomed by the kernel guys.
If you call clone() you're on your own. There are far too many problems with
clone() to justify an attempt to work around just one issue.