From Bugzilla Helper: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) Programs which worked with redhat 7.0 and previous versions now have issues with threaded processes get stuck in rt_sigsuspend/fillonedir/exit_notify. This raises serious issues for the product I am working on. The bug doesn't happen 100% of the time, in fact, it happens frequently, but not each and every time. Reproducible: Sometimes Steps to Reproduce: a description of what the program does: the program runs, creates a pipe and then forks. The parent - immediately starts reading on the pipe, waiting for the child to say "OK" - When it gets the OK, the parent exits cleanly. The child - calls setsid() to become a process group leader - loads a shared library (ascf.so) - calls pthread_sigmask(SIG_BLOCK, ...) for SIGINT, SIGTERM, SIGHUP - starts a monitor thread which * Proceeds to fork and exec the other server processes * After it does this successfully, it writes "OK" to the pipe * Proceeds to call wait() waiting to any of its children to die The main thread continues on to: - Starts up two more worker threads - calls sigaction(..) to establish a signal handler function for SIGINT, SIGTERM & SIGHUP - calls pthread_sigmask(SIG_UNBLOCK, ...) for SIGINT, SIGTERM, SIGHUP - calls pause() to wait for any incoming signals Actual Results: Now, the problem is that the child threads/processes do not seem to be returning the OK, instead, they seem to simply hang, here is some sample strace output [snipped for brevity]: main process: read(4, "1860", 4096) = 4 read(4, "", 4096) = 0 close(4) = 0 munmap(0x4012e000, 4096) = 0 kill(1860, SIG_0) = -1 ESRCH (No such process) pipe([4, 6]) = 0 fork() = 2048 close(6) = 0 read(4, [it seems to be waiting for the response] then: first process: old_mmap(NULL, 89056, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40314000 mprotect(0x40326000, 15328, PROT_NONE) = 0 old_mmap(0x40326000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x11000) = 0x40326000 close(4) = 0 mprotect(0x40314000, 73728, PROT_READ|PROT_WRITE) = 0 mprotect(0x40314000, 73728, PROT_READ|PROT_EXEC) = 0 rt_sigprocmask(SIG_BLOCK, [HUP INT TERM], NULL, 8) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=2147483647}) = 0 setrlimit(RLIMIT_STACK, {rlim_cur=2044*1024, rlim_max=2147483647}) = 0 pipe([4, 7]) = 0 clone(child_stack=0x81a9234, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_PTRACE) = 2049 write(7, "\364\336\0@\5\0\0\0\0\0\0\0\0\0\0\0\0\336\0@\0\0\0\0\220"..., 148) = 148 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 write(7, "`\177\24@\0\0\0\0\0\0\0\0\330&\6\10\0\0\0\0\3@\1\200\20"..., 148) = 148 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 rt_sigsuspend([HUP INT TERM CHLD RT_5] Expected Results: The main process should have exited after getting the OK Check the linux-kernel mailing list archives, checking for the following string: [BUG] threaded processes get stuck in rt_sigsuspend/fillonedir/exit_notify
Minor question: You don't seem to write to fd 6 but to 7 ! and if you are using stdio, are you calling fflush() to flush the (g)libc buffers to the kernel ?
Here is the pipe code from cfexec... I included the main() code and the MonitorThread() code in the hops that this will help you track down what Red Hat or we are doing wrong. The Start() calls do the fork()/exec() to start up the other CF processes. To answer their questions directly: - The write(7, ...) is not done by our program (see code below), the bytes written do not contain the string "OK" which is literally what we write (see MonitorThread() below). Maybe pthreads does this? - We do in fact use fflush() to flush verbose message, but in the normal mode of operation (that still hangs) we do not use it. Let me know if you need any more info or have further questions about the code. -- jnoller static int filedes[2]; main() { ....... // At this point, we put ourselves into the background // but the parent waits till the child says everything is OK if (!debugging) { int ret = pipe(filedes); if (fork()) { close(filedes[1]); // wait for child to give us the go ahead char buf[10]; int n = read(filedes[0], buf, sizeof(buf)); // parent exits exit(0); } close(filedes[0]); // become process group leader setsid(); // Change to log directory for core dumps const char *p = getenv("CFHOME"); if (!p) { p = DEFAULT_ROOT; } CString strDir(p); strDir += "/log"; chdir(strDir); } // Indicate that I am running ExecServiceRunning(true); #if defined(sun) // For Solaris, set the concurency level // Threads: // Main thread, Monitor thread, Scheduler, Client timeout // Plus DataAccess Garbage collector, and any requests thr_setconcurrency(5); #endif // Create the server objects CFserver = new CProcess(CProcess::cfserver); IDEserver = new CProcess(CProcess::cfide); #if !defined(hpux) JRunserver = new CProcess(CProcess::jrun); #endif #if defined(sun) CFProxy = new CSecurityProxyProcess(); #endif // sun // Initialize the executive // Since we are not an NT service, we fake it CCFExecutiveService service ; DWORD dwError; char lpszError[1024]; // Call service start routine if (service.OnStart(dwError, lpszError) != true) { fprintf(stderr, "Error: Unable to start cfexec. %s\n", lpszError); exit(1); } // Block the termination signals in all subthreads sigset_t set; sigemptyset(&set); sigaddset(&set, SIGINT); sigaddset(&set, SIGTERM); sigaddset(&set, SIGHUP); #if defined(hpux10) if (sigprocmask(SIG_BLOCK, &set, NULL) != 0) #else if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0) #endif { perror("pthread_sigmask"); exit(1); } // Start servers in Monitor thread if (pthread_create(&monitorThread, DEFAULT_ATTR, MonitorThread, NULL) != 0) { ReportError(strerror(errno), "attempting to create monitor thread") ; } // Run the Main thread which will return to us service.MainThread((LPVOID)s_hTerminationEvent); // Wait for signals #if defined(linux) // For Linux, use a signal handler routine because the incredibly // broken sigwait will block all signals (include SEGV!) struct sigaction act; memset(&act, 0, sizeof(act)); act.sa_handler = SignalHandler; sigaction(SIGINT, &act, NULL); sigaction(SIGTERM, &act, NULL); sigaction(SIGHUP, &act, NULL); // Unblock signals if (pthread_sigmask(SIG_UNBLOCK, &set, NULL) != 0) { perror("pthread_sigmask"); exit(1); } // This will never return, as the signal handler will exit process pause(); #else WaitForSignal(); #endif // Shut servers down Shutdown(); // Indicate we are no longer running ExecServiceRunning(false); // Finish up exit(0); } // // Monitor thread - watches for the termination of IDE or CFserver // void *MonitorThread(void* arg) { pid_t pid; int status; bool started; // Start the ColdFusion server bool CFstarted = CFserver->Start(); if ( !CFstarted) { fprintf(stderr, "Error: ColdFusion failed to start.\n" "See %s/log for details\n", (const char *)CFserver->GetServerRoot()); exit(1); } // success! if (verbose) printf("ColdFusion server started (pid %u)\n", CFserver- >m_dwPid); // Start the IDE server if (! no_rds) { bool IDEstarted = IDEserver->Start(); if ( !IDEstarted) { fprintf(stderr, "Error: ColdFusion RDS server failed to start.\n" "See %s/log for details\n", (const char *)IDEserver->GetServerRoot()); // Shutdown other server too - give it a few seconds to start up sleep(5); kill(CFserver->m_dwPid, SIGTERM); // Stop the Advanced Security Proxy // Both Stop and Start of the proxy are never failing // no-ops if advanced security is not enabled #if defined(sun) CFProxy->Stop(); #endif exit(1); } // success! if (verbose) printf("ColdFusion RDS server started (pid %u)\n", IDEserver->m_dwPid); } #if !defined(hpux) // Start the JRun server if (! no_jrun) { bool JRunstarted = JRunserver->Start(); if ( !JRunstarted) { fprintf(stderr, "Error: JRun server failed to start.\n" "See %s/log for details\n", (const char *)IDEserver->GetServerRoot()); exit(1); } // success! if (verbose) printf("JRun server started (pid %u)\n", JRunserver- >m_dwPid); } #endif // !hpux // Give parent to OK to exit if (!debugging) { write(filedes[1], "OK", sizeof("OK")); close(filedes[1]); } ....... }
Just checking for an Update, or any ideas.
I just returned from a short vacation, and will try to make a smaller testcase asap. I'm still not conviced this is a kernel problem; could just as well be a glibc issue or an applicationbug. (And yes your code has several, unrelated to this problem, bugs)
Thanks! On the glibc issues, note, that this is all C++, and we depend on Libstdc++, on 7.0 and 7.1, notably, the compat-libstdc++ libs MUSt be installed, as we have coded this for 6.2 natively. once again, thanks, and throw any bugs you see at me, I can take it! =]
Any updates?
I've made your example into a self-running app, and it seems to work fine.... I've attached it to this bug.
Created attachment 18731 [details] example program
Created attachment 18790 [details] Strace output
Here is some more information regarding this issue. I am at a loss as to what is causing this, so any ideas are appreciated. Begin Paste: C- ok. So let's pipe all this stuff through STRACE to see what in the devil is going on! I am editing the Start script with the following strace flags: -f Trace child processes as they are created by currently traced processes as a result of the fork(2) system call. The new process is attached to as soon as its pid is known (through the return value of fork(2) in the parent process). This means that such children may run uncontrolled for a while (especially in the case of a vfork(2)), until the parent is scheduled again to complete its (v)fork(2) call. If the parent process decides to wait(2) for a child that is currently being traced, it is suspended until an appropriate child process either terminates or incurs a signal that would cause it to terminate (as determined from the child's current signal disposition). -o filename Write the trace output to the file filename rather than to stderr. Use filename.pid if -ff is used. If the argument begins with `|' or with `!' then the rest of the argument is treated as a command and all output is piped to it. This is convenient for piping the debugging output to a program without affecting the redirections of executed programs. So, the edited runline looks like: # # Run as user specified with only our environment variables set. # env -i $ENV_LIST strace -f -o /tmp/try1 $CFHOME/bin/cfexec On try 1, it starts fine. [root@skyfire bin]# ./start Process 9442 attached Process 9443 attached Process 9442 detached Process 9443 detached On try 2, it starts fine. [root@skyfire bin]# ./start Process 9670 attached Process 9671 attached Process 9670 detached Process 9671 detached AHA! On try three, it did the following: [root@skyfire bin]# ./start Process 9812 attached Process 9813 attached Process 9812 detached ---HANG--- ps -ae shows: 9796 pts/1 00:00:00 start 9810 pts/1 00:00:00 strace 9811 pts/1 00:00:00 cfexec 9812 ? 00:00:00 cfexec <defunct> 9817 ? 00:00:00 cfserver 9818 ? 00:00:00 cfserver 9819 ? 00:00:00 cfserver 9821 ? 00:00:00 cfserver 9822 ? 00:00:00 cfserver 9823 ? 00:00:00 cfserver 9824 ? 00:00:00 cfserver Here is a snip of the STRACE output file: 9812 munmap(0x402d8000, 17067) = 0 9812 open("/usr/lib/coldfusion/ascf.so", O_RDONLY) = 4 9812 read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\240\213"..., 1024) = 1024 9812 fstat(4, {st_mode=S_IFREG|0555, st_size=116338, ...}) = 0 9812 old_mmap(NULL, 89056, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40315000 9812 mprotect(0x40327000, 15328, PROT_NONE) = 0 9812 old_mmap(0x40327000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x11000) = 0x40327000 9812 close(4) = 0 9812 munmap(0x82b1000, 61440) = 0 9812 munmap(0x82c1000, 61440) = 0 9812 mprotect(0x40315000, 73728, PROT_READ|PROT_WRITE) = 0 9812 mprotect(0x40315000, 73728, PROT_READ|PROT_EXEC) = 0 9812 rt_sigprocmask(SIG_BLOCK, [HUP INT TERM], NULL, 8) = 0 9812 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=2147483647}) = 0 9812 setrlimit(RLIMIT_STACK, {rlim_cur=2044*1024, rlim_max=2147483647}) = 0 9812 pipe([4, 7]) = 0 9812 clone(child_stack=0x8189808, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_PTRACE) = 9813 9812 write(7, "\364\336\0@\5\0\0\0\0\0\0\0\0\0\0\0\0\336\0@\0\0\0\0\220"..., 148) = 148 9812 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 9812 write(7, "`\217\24@\0\0\0\0\0\0\0\0\230+\6\10\0\0\0\0\3@\1\200\20"..., 148) = 148 9812 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 9812 rt_sigsuspend([HUP INT TERM CHLD RT_5] What's interesting to note, is the following paste from the output file: 9811 open("/etc/ld.so.cache", O_RDONLY) = 4 9811 fstat(4, {st_mode=S_IFREG|0644, st_size=17067, ...}) = 0 9811 old_mmap(NULL, 17067, PROT_READ, MAP_PRIVATE, 4, 0) = 0x402d8000 9811 close(4) = 0 9811 open("/lib/libnss_nis.so.2", O_RDONLY) = 4 9811 read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0 \0\000"..., 1024) = 1024 9811 fstat(4, {st_mode=S_IFREG|0755, st_size=294539, ...}) = 0 9811 old_mmap(NULL, 43464, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4030a000 9811 mprotect(0x40314000, 2504, PROT_NONE) = 0 9811 old_mmap(0x40314000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x9000) = 0x40314000 9811 close(4) = 0 9811 munmap(0x402d8000, 17067) = 0 9811 uname({sys="Linux", node="skyfire", ...}) = 0 9811 setgroups(1, [99]) = 0 9811 setgid(99) = 0 9811 setuid(99) = 0 9811 semop(512, 0x4012d170, 1) = 0 9811 getpid() = 9811 9811 semop(512, 0x4012d176, 1) = 0 9811 open("/home/coldfusion/log/pids/executive.pid", O_RDONLY) = 4 9811 fstat(4, {st_mode=S_IFREG|0644, st_size=4, ...}) = 0 9811 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4012f000 9811 read(4, "9670", 4096) = 4 9811 read(4, "", 4096) = 0 9811 close(4) = 0 9811 munmap(0x4012f000, 4096) = 0 9811 kill(9670, SIG_0) = -1 ESRCH (No such process) 9811 pipe([4, 6]) = 0 9811 fork() = 9812 9812 getpid() = 9812 9811 close(6) = 0 9812 getrlimit(0x3, 0xbffff82c, 0xbffff81c, 0xbffff82c, 0xbffff82c <unfinished ...> 9811 read(4, <unfinished ...> 9812 <... getrlimit resumed> ) = -1 ENOSYS (Function not implemented) 9812 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=2147483647}) = 0 9812 setrlimit(0x200000 /* RLIMIT_??? */, {rlim_cur=2147483647, rlim_max=2147483647}) = -1 EINVAL (Invalid argument) 9812 close(4) = 0 9812 setsid() = 9812 9812 chdir("/home/coldfusion/log") = 0 9812 semop(512, 0x4012d170, 1) = 0 9812 getpid() = 9812 9812 semop(512, 0x4012d176, 1) = 0 9812 open("/home/coldfusion/log/pids/executive.pid", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4 9812 getpid() = 9812 9812 fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 9812 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4012f000 So, basically, the original 1st CFEXEC process goes fine, and then it actually hangs on the second one in 'rt_sigsuspend' if we do a ps -ae on looking for the 'hung' process (Number 9812) we see: 9811 pts/1 00:00:00 cfexec 9812 ? 00:00:00 cfexec <defunct> [Quick side note, Tom, when running through the Strace output, i found:] 9811 socket(PF_UNIX, SOCK_STREAM, 0) = 4 9811 connect(4, {sin_family=AF_UNIX, path="/var/run/.nscd_socket"}, 110) = -1 ECONNREFUSED (Connection refused) [Good, bad?] So, after getting it to hang while in Strace, I ^C the hung exec And the following is the last dump to the Strace output file: 9812 pipe([4, 7]) = 0 9812 clone(child_stack=0x8189808, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_PTRACE) = 9813 9812 write(7, "\364\336\0@\5\0\0\0\0\0\0\0\0\0\0\0\0\336\0@\0\0\0\0\220"..., 148) = 148 9812 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 9812 write(7, "`\217\24@\0\0\0\0\0\0\0\0\230+\6\10\0\0\0\0\3@\1\200\20"..., 148) = 148 9812 rt_sigprocmask(SIG_SETMASK, NULL, [HUP INT TERM CHLD RT_0 RT_5], 8) = 0 9812 rt_sigsuspend([HUP INT TERM CHLD RT_5] <unfinished ...> 9812 --- SIGRT_0 (Real-time signal 0) --- 9811 <... read resumed> 0xbffffde4, 10) = ? ERESTARTSYS (To be restarted) 9811 --- SIGINT (Interrupt) --- 9811 +++ killed by SIGINT +++ I hope this helps, I am at a loss. However, I am going to put all of this in the Red Hat bug report. I have attached the full STRACE output to this report.
1) Does the example program work for you? 2) Does it resemble what you are doing ?
Hey again- A few more notes, number 1, we think this might be a dual cpu vs single cpu issue, possibly a problem with timing on dual procs. Also, yes, your test case works, however, the code I gave you is C++, everything is C++ in our program. Your test case is not quite as complicated as our actual program is, so, that also might be a factor.
I am the C++ engineer looking at this problem on our end. As far as I can tell the problem lies with fork(). We start a thread (the monitor thread in the code above) and it fork/exec a new process (the "Start()" calls). Sometimes the parent thread just never comes back after the fork. What it does do it seems, is sit and spin - racking up CPU time. It just never returns control back to our code. The child seems to run along fine. We are calling pthread_kill_other_threads_np() before the fork, but removing this call seems to have no effect on behavior (good or bad). I have tried to duplicate with with your test program and so far I haven't been able to. Also note that we are building on 6.2 with the default egcs compiler, and running this on a 7.0 system. Tom Jordahl
Based on this information, provided through email: > building on 6.2 > It works fine on 6.2, both uni and smp > It works fine on 7.0 and 7.1 uni only > It only fails on 7.0 and 7.1 smp This problem is not related to the kernel. I am re-assigning this bug to glibc, since the bug is either in the application or in glibc, and therefore glibc makes more sense than kernel. I guess it more likely to be an application bug but Jakub will be in a better position to judge the likelihood of that.
Try your code again on RHL9. signals, fork, and libstdc++ all had problems on older releases, especially the 7.x series. The new C++ library and thread code in RHL9 should fix all these problems *iff* they were problems in the runtime.
No reply in almost 6 months. Closing it now as NOTABUG since we couldn't reproduce it.