From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/125.4 (KHTML, like Gecko) Safari/125.9

Description of problem:
The code in kernel/pid.c function next_free_map() has a bug that results in every other map entry being skipped when allocating pids beyond the first map (i.e. >32767). If /proc/sys/kernel/pid_max is increased to 100000, say, and then lots of forks are done to take last_pid over 32767, then instead of allocating the next map to hold pids 32768 to 65535, that map is skipped and the next pid allocated is 65536.

The problem happens due to the pre-increment of 'map' in next_free_map() when the calling function (alloc_pidmap) has already pointed the map it's considering to the next map (computed from last_pid + 1). The pre-increment then takes 'map' to the next-next-free map instead of the next-free map.

Here's my suggested fix:

$ diff -Naur kernel/pid.c kernel/pid.c.new
--- kernel/pid.c        2004-08-20 05:20:48.000000000 -0500
+++ kernel/pid.c.new    2004-08-30 10:40:59.000000000 -0500
@@ -73,7 +73,7 @@
 static inline pidmap_t *next_free_map(pidmap_t *map, int *max_steps)
 {
         while (--*max_steps) {
-                if (++map == map_limit)
+                if (map == map_limit)
                         map = pidmap_array;
                 if (unlikely(!map->page)) {
                         unsigned long page = get_zeroed_page(GFP_KERNEL);
@@ -93,6 +93,7 @@
                 }
                 if (atomic_read(&map->nr_free))
                         return map;
+                map++;
         }
         return NULL;
 }

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. echo 100000 > /proc/sys/kernel/pid_max
2. while :; do /bin/true; done &
3. Watch the pids allocated. When the pid should go to 32768, it uses 65536 instead.

Additional info:
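
To make the skipped map easier to see, here is a small user-space sketch of the two loop shapes from the report. It is not the kernel code: the maps are reduced to an array of free counts, page allocation is left out, and the names NUM_MAPS, PIDS_PER_MAP, nr_free[], next_free_map_buggy(), next_free_map_fixed() and first_pid_after() are made up for illustration. It only models the case described above, where last_pid + 1 lands exactly on the first pid of a new map.

```c
/* Illustrative model only; sizes and names are assumptions, not kernel code. */
#define PIDS_PER_MAP 32768  /* pids covered by one map, per the report */
#define NUM_MAPS     4      /* hypothetical: enough maps for pid_max = 100000 */

/* nr_free[i] > 0 means map i still has unallocated pids.
 * Map 0 is modelled as full, the others as untouched. */
static int nr_free[NUM_MAPS] = { 0, PIDS_PER_MAP, PIDS_PER_MAP, PIDS_PER_MAP };

/* Shape of the original loop: step to the next map BEFORE inspecting it,
 * so the map the caller passed in is never considered. */
static int next_free_map_buggy(int map, int max_steps)
{
    while (--max_steps) {
        if (++map == NUM_MAPS)
            map = 0;
        if (nr_free[map])
            return map;
    }
    return -1;
}

/* Shape of the suggested fix: inspect the caller's map first,
 * and only step forward when it has no free pids. */
static int next_free_map_fixed(int map, int max_steps)
{
    while (--max_steps) {
        if (map == NUM_MAPS)
            map = 0;
        if (nr_free[map])
            return map;
        map++;
    }
    return -1;
}

/* Mimics the caller: it already points at the map that holds last_pid + 1,
 * then asks for the next free map. Returns the first pid of the chosen map,
 * or -1 if no map has free pids. */
static int first_pid_after(int last_pid, int (*next_map)(int, int))
{
    int pid = last_pid + 1;
    int map = pid / PIDS_PER_MAP;

    map = next_map(map, NUM_MAPS + 1);
    return map < 0 ? -1 : map * PIDS_PER_MAP;
}
```

Calling first_pid_after(32767, next_free_map_buggy) steps past map 1 and lands on the first pid of map 2, while first_pid_after(32767, next_free_map_fixed) stays on map 1, which matches the behaviour described in the steps to reproduce.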
Ouch, this is a nasty bug. Since you really shouldn't go above 65536, this bug effectively halves the pid space.
A fix for this problem was committed to the RHEL3 U4 patch pool on 29-Sep-2004 (in kernel version 2.4.21-20.14.EL). This bug is closely related to bug 120889, which reported a different symptom caused by the same underlying problem.

The fix was first released in Update 4, which was announced with the following Errata System notice:

"An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html"

Obviously, at this point it would be preferable to upgrade to the latest post-U5 security erratum, which is advisory RHSA-2005:472.