Bug 1323317

Summary: RFC: SIGPROF keeps a large task from ever completing a fork()
Product: Red Hat Enterprise Linux 6 Reporter: Paulo Andrade <pandrade>
Component: glibc Assignee: Carlos O'Donell <codonell>
Status: CLOSED WONTFIX QA Contact: qe-baseos-tools-bugs
Severity: medium Docs Contact:
Priority: medium    
Version: 6.7 CC: ashankar, fweimer, mnewsome, pfrankli
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-02 00:44:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Paulo Andrade 2016-04-01 21:03:06 UTC
A large program (one that has allocated a lot of memory) may
enter an infinite loop when compiled with -pg: the clone
syscall keeps being restarted and never completes.

Previously, RHEL 5 carried a kernel patch to work around
this and correct gprof issues:
"""
commit 122c17ac54c9b3f53e80bc6f0786cc5f2a8dc486
Author: Stefan Ring <str>
Date:   Fri May 8 13:19:55 2015 +0200

    the patch (v2.6.18)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 34ed0d9..808f79d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1478,6 +1478,7 @@ static inline int lock_need_resched(spinlock_t *lock)
 
 extern FASTCALL(void recalc_sigpending_tsk(struct task_struct *t));
 extern void recalc_sigpending(void);
+extern int  fork_recalc_sigpending(void);
 
 extern void signal_wake_up(struct task_struct *t, int resume_stopped);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index f9b014e..21f9a0d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1193,8 +1193,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	 * A fatal signal pending means that current will exit, so the new
 	 * thread can't slip out of an OOM kill (or normal SIGKILL).
  	 */
- 	recalc_sigpending();
-	if (signal_pending(current)) {
+	if (fork_recalc_sigpending()) {
 		spin_unlock(&current->sighand->siglock);
 		write_unlock_irq(&tasklist_lock);
 		retval = -ERESTARTNOINTR;
diff --git a/kernel/signal.c b/kernel/signal.c
index bfdb568..bd7e794 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -227,6 +227,31 @@ void recalc_sigpending(void)
 	recalc_sigpending_tsk(current);
 }
 
+int fork_recalc_sigpending(void)
+{
+	struct task_struct *tsk = current;
+	int pending;
+
+	recalc_sigpending();
+	if (likely(!signal_pending(tsk)))
+		return 0;
+
+	pending = 1;
+	/*
+	 * HACK. If SIGPROF is the sole reason for TIF_SIGPENDING
+	 * we assume it was sent by ITIMER_PROF and return false,
+	 * otherwise fork() can never succeed if it takes more than
+	 * it_prof_incr. bz645528.
+	 */
+	if (!sigismember(&tsk->blocked, SIGPROF)) {
+		sigaddset(&tsk->blocked, SIGPROF);
+		pending = recalc_sigpending_tsk(tsk);
+		sigdelset(&tsk->blocked, SIGPROF);
+	}
+
+	return pending;
+}
+
 /* Given the mask, find the first available signal that should be serviced. */
 
 static int
"""
But newer RHEL releases do not carry the above patch.

After some discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1309789,
and with the affected user preferring to use perf for profiling,
it was suggested that SIGPROF could be blocked in glibc during
the clone syscall.

Gprof is still a useful tool, and just telling users that this
is a known failure and that they should use perf may not be a
good solution, or even a viable one, on other architectures.

Comment 2 Carlos O'Donell 2016-04-02 00:44:18 UTC
Use perf to profile your application. Signal based profiling has limited uses.

Assume any signal occurring with period Tp. Assume any restartable syscall taking time Ts. When Tp < Ts you always have an infinite loop. There is nothing that userspace or the kernel can do in general. There are no forward-progress guarantees for this scenario. Hardware transactional memory suffers similar problems.
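
The Tp/Ts argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a measurement of any kernel: the function name and parameters are hypothetical, the signal handler is assumed to take zero time, and the syscall is assumed to restart immediately after each signal.

```c
#include <stdio.h>

/*
 * Toy model (hypothetical, for illustration only).
 *   tp    - period of the signal (e.g. the ITIMER_PROF interval)
 *   ts    - uninterrupted time the restartable syscall needs
 *   phase - time already elapsed in the current period when the
 *           syscall is first issued (0 <= phase < tp)
 * Returns the attempt on which the syscall completes, or -1 if it
 * is still being restarted after max_attempts tries.
 */
static long attempts_until_done(double tp, double ts, double phase,
                                long max_attempts)
{
    double window = tp - phase;   /* quiet time before the next signal */
    long attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (ts < window)
            return attempt;       /* finished before the next signal */
        /* Interrupted and restarted: a full period is now available. */
        window = tp;
    }
    return -1;                    /* never completed: Tp <= Ts */
}
```

In this model a syscall with Ts < Tp completes on the first or second attempt, while one with Ts >= Tp is restarted on every attempt, whatever the starting phase.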

Given that no general solution exists, any changes in userspace or the kernel would penalize the vast majority of programs which probably have Tp > Ts and don't suffer from infinite restart loops. Any solution to block SIGPROF would increase signal latency and degrade SIGPROF results, not to mention adding latency costs to clone.

If you *must* use -pg and can't use perf, then my only suggestion is that the application block SIGPROF (sigprocmask, pthread_sigmask) before calling syscalls that it expects might take a long time, or, on detecting that it has made no forward progress, block SIGPROF and then re-enable it at a later progress checkpoint (this will skew -pg results, which are based on statistical profiling). If other libraries use clone, and I believe ASAN might, you will need to talk to the authors of those libraries to determine how they want to handle the general problem noted above without impacting all of userspace.

Again, neither glibc nor the kernel can fix this problem. And adding latency to clone for the sake of -pg is not acceptable.

Please use perf to profile your application.

Comment 3 Paulo Andrade 2016-04-04 12:48:19 UTC
Thanks for the comments.

I will report the issue upstream. It should probably be
handled only when building with -pg, by having gcc generate
stubs, and may require some option to select the syscalls
for which SIGPROF is blocked.

Comment 4 Florian Weimer 2016-04-04 12:51:19 UTC
(In reply to Paulo Andrade from comment #3)
> Thanks for the comments.
> 
> I will report the issue to upstream.

Please report it as a kernel bug.  There is little glibc can do to work around this without distorting profiling.

Comment 5 Paulo Andrade 2016-04-04 13:07:08 UTC
I just reported it at
https://sourceware.org/bugzilla/show_bug.cgi?id=19904