A task that uses a lot of memory and is compiled for gprof profiling may never be able to complete a fork(), because the periodic (10 ms) profiling signal keeps causing the fork to be restarted. This behavior appears to have been introduced by commit 4a2c7a7837da1b91468e50426066d988050e4d56 "make fork() atomic wrt pgrp/session signals". Our customer considers this a kernel bug.
Partial strace from reproducer:
11160 10:17:46.930243 --- SIGPROF (Profiling timer expired) @ 0 (0) ---
11160 10:17:46.930334 rt_sigreturn(0x1b) = 56 <0.000014>
11160 10:17:46.930406 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b7acd3225a0) = ? ERESTARTNOINTR (To be restarted) <0.462230>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
    /* Allocate and touch about 4.1 GiB so that fork() takes a long time. */
    long sz = 4.1*1024*1024*1024;
    char *p = (char*)malloc(sz);
    memset(p, 0, sz);
    fork();
    return 0;
}

If you save it to a file fork.c and compile with /usr/bin/g++ -pg fork.c
Well. The problem is quite obvious. Everything is clear and works "as expected". What is not clear to me is what we can do ;) And given that SIGPROF can happen at any time, -pg can lead to spurious (and perhaps unexpected) -EINTR from other syscalls... OTOH, I understand why the customer dislikes this. I'll try to think more...
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Patch(es) available in kernel-2.6.18-265.el5.
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.
Verified with -267.el5.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-1065.html
Hi, is it possible for anyone to attach the patch that contains the fix for this issue? Thanks in advance, Murali
Created attachment 1023457: Reconstructed patch
From looking at the closest released versions before and after this patch was added, I would expect it to be this.