Bug 734180
| Summary: | Ruby hangs when making certain uses of fork | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Casey Dahlin <cdahlin> |
| Component: | ruby | Assignee: | Vít Ondruch <vondruch> |
| Status: | CLOSED WONTFIX | QA Contact: | BaseOS QE - Apps <qe-baseos-apps> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.1 | CC: | james.brown, jduncan, lwang, rprice, vanhoof |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-10-07 22:34:56 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 782183 | | |
Description
Casey Dahlin
2011-08-29 16:10:05 UTC
I tested the reproducer with the latest Ruby in Fedora and I can reproduce the issue. Let's see what upstream is going to say about it [1].

[1] http://redmine.ruby-lang.org/issues/3100

Hi,

I handled this issue upstream a year ago, and I can't reproduce it on either the ruby_1_8_7 branch or the latest RHEL 6.2 and internal RC. forktest.rb works completely.

Can anyone provide detailed reproduction instructions?

(In reply to comment #6)
> Hi
>
> I handled this issue upstream a year ago, and I can't reproduce it
> on either the ruby_1_8_7 branch or the latest RHEL 6.2 and internal RC.
> forktest.rb works completely.
>
> Can anyone provide detailed reproduction instructions?

Hi, good to see you in Red Hat :) I hope I am not too late to notice that.

I can reproduce it with:

1) Fedora 16

    $ rpm -q ruby
    ruby-1.8.7.352-1.fc16.x86_64
    $ ruby forktest.rb 5

It wrote about 5 lines of dots before it froze.

2) Mock for RHEL-6.2 on F16, prepared using the following brew repository:

    baseurl=http://download.englab.brq.redhat.com/brewroot/repos/RHEL-6.2-build/latest/x86_64

    # ruby -v
    ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
    # ruby forktest.rb 5

Something like a whole page of dots until lockup.

3) RHEL 6.1 VM with Ruby 1.8.7 from RHEL-6.2, from the following build: https://brewweb.devel.redhat.com/buildinfo?buildID=175740

    $ rpm -q ruby
    ruby-1.8.7.352-3.el6.x86_64
    $ ruby forktest.rb 10000

I can run forktest even with 10000 and it locks up immediately, i.e. it spawns the 4 children and runs the "info.each do |r,w,*e|" loop once, but hangs on the r.gets line the second time. It was able to run further just once in a few attempts.

Hmmm.. I still have no luck. Does this issue have a hardware configuration dependency?

I have reserved one virtual machine in beaker: sgi-xe500-01.rhts.eng.bos.redhat.com

I ran the test several times and observed two scenarios.
1) The test fails with the following error: http://pastebin.test.redhat.com/69235
2) Hang, sluggish response

The hang is of a strange nature. Once it happened, I tried to open another ssh connection to see what was going on, and I had to wait about two minutes before the prompt appeared. In the end, before I was able to do anything else, the test suddenly continued. So this might be something completely unrelated to Ruby, though similar to what the original reporter observes.

However, you might want to try it yourself by running "$ ssh root.eng.bos.redhat.com 'ruby forktest.rb 2'". The machine should be available for at least another 95 hours, and the period can be extended.

Hi,

I played with sgi-xe500-01 for some time and could only reproduce (1). Thank you.

Unfortunately, I have no time for a while; I have another serious issue right now. I plan to resume this investigation in 1 or 2 weeks.

My current guess is that this is a tty-related issue, because my KVM guest couldn't reproduce it, but I don't have any evidence yet.

My memo:
sgi-xe500-01
RHEL6.2-20111117.0 x86_64

(In reply to comment #8)
> Hmmm.. I still have no luck. Does this issue have hardware configuration
> dependency?

I was able to reproduce this on a KVM virtual machine running RHEL 6.2. I can provide specs if desired.

    # rpm -qa ruby
    ruby-1.8.7.352-4.el6_2.x86_64
    # uname -a
    Linux test.duncan.net 2.6.32-220.4.2.el6.x86_64 #1 SMP Mon Feb 6 16:39:28 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

From /usr/share/doc/ruby-1.8.7.352/ChangeLog:

Tue Jun 8 12:37:56 2010 NAKAMURA Usaku <usa>

* eval.c (thread_timer, rb_thread_stop_timer): check the timing of stopping timer. patch from KOSAKI Motohiro <kosaki.motohiro _AT_ jp.fujitsu.com> via IRC.
* eval.c (rb_thread_start_timer): NetBSD5 seems to be hung when calling pthread_create() from pthread_atfork()'s parent handler.
* io.c (pipe_open): workaround for NetBSD5. stop timer thread before fork(), and restart it after fork() on parent, and on child if needed.
* process.c (rb_f_fork, rb_f_system): ditto. these changes are tested by naruse. fixed [ruby-dev:40074]

From http://bugs.ruby-lang.org/projects/ruby-187/repository/revisions/28203/:

merge revision(s) 26371,26373,26374,26972:

* eval.c (thread_timer, rb_thread_stop_timer): check the timing of stopping timer. patch from KOSAKI Motohiro <kosaki.motohiro _AT_ jp.fujitsu.com>
* eval.c (rb_thread_start_timer): NetBSD5 seems to be hung when calling pthread_create() from pthread_atfork()'s parent handler.
* io.c (pipe_open): workaround for NetBSD5. stop timer thread before fork(), and start it if needed.
* process.c (rb_f_fork, rb_f_system): ditto. fixed [ruby-dev:40074]
* io.c, eval.c, process.c: add linux to r26371's condition. patched by Motohiro KOSAKI [ruby-core:28151]

So it SEEMS to have been addressed in the current RHEL 6 Ruby release.
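For readers unfamiliar with the pattern the ChangeLog describes (stop the timer thread before fork(), restart it afterwards), here is a rough Ruby-level analogue. It is only a sketch: the actual change is in the interpreter's C code (eval.c, io.c, process.c), and the `Ticker` class, `fork_with_paused_ticker` and `run_demo` names are hypothetical, invented for this illustration.

```ruby
# Illustrative sketch only. The real fix is in the interpreter's C sources;
# Ticker, fork_with_paused_ticker and run_demo are hypothetical names.

# A background helper thread that can be stopped and restarted on demand.
class Ticker
  def initialize(interval = 0.01)
    @interval = interval
    @thread = nil
    @stop = false
  end

  def start
    return if running?
    @stop = false
    @thread = Thread.new { sleep @interval until @stop }
  end

  def stop
    return unless @thread
    @stop = true
    @thread.join
    @thread = nil
  end

  def running?
    !@thread.nil? && @thread.alive?
  end
end

# Pause the helper thread, fork, then restart the helper in the parent.
# The child runs the block and exits, so it never reaches the restart line.
def fork_with_paused_ticker(ticker, &block)
  was_running = ticker.running?
  ticker.stop
  pid = fork(&block)
  ticker.start if was_running
  pid
end

# Fork one child that reports back over a pipe, as a quick sanity check.
def run_demo
  ticker = Ticker.new
  ticker.start
  reader, writer = IO.pipe
  pid = fork_with_paused_ticker(ticker) do
    reader.close
    writer.puts "child ok"
    writer.close
  end
  writer.close
  line = reader.gets
  reader.close
  Process.wait(pid)
  success = $?.success?
  running = ticker.running?
  ticker.stop
  { :line => line && line.chomp, :success => success, :ticker_running => running }
end
```

Calling `run_demo` forks a single child with the helper thread paused and returns what the child wrote, whether it exited cleanly, and whether the helper thread was running again in the parent.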
Tried to reproduce the issue with http://redmine.ruby-lang.org/attachments/929/forktest.rb (actually http://bugs.ruby-lang.org/attachments/download/929/forktest.rb)

    root 2459 0.0 0.0 103300 816 pts/5 S+ 09:21 0:00 grep ruby
    [root@rhev-m ~]# ps aux |grep ruby
    root 2416 0.1 0.0 40356 2720 pts/0 Sl+ 09:19 0:00 ruby forktest.rb 1
    root 2420 0.0 0.0 40200 1580 pts/2 Ss+ 09:19 0:00 ruby forktest.rb 1
    root 2461 0.0 0.0 103300 820 pts/5 S+ 09:21 0:00 grep ruby
    [root@rhev-m ~]# strace 2420
    strace: 2420: command not found
    [root@rhev-m ~]# strace -p 2420
    Process 2420 attached - interrupt to quit
    futex(0x7f3bb3e4aa20, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>
    Process 2420 detached
    [root@rhev-m ~]# strace -p 2416
    Process 2416 attached - interrupt to quit
    select(6, [5], [], [], {0, 479205}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2054, 723746600}) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    wait4(2420, 0x7fff5e21923c, WNOHANG, NULL) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2054, 723916265}) = 0
    clock_gettime(CLOCK_MONOTONIC, {2054, 723969007}) = 0
    select(6, [5], [], [], {0, 999947}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2055, 725152353}) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    wait4(2420, 0x7fff5e21923c, WNOHANG, NULL) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2055, 725469671}) = 0
    clock_gettime(CLOCK_MONOTONIC, {2055, 725501775}) = 0
    select(6, [5], [], [], {0, 999967}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2056, 726757643}) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    wait4(2420, 0x7fff5e21923c, WNOHANG, NULL) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2056, 726927175}) = 0
    clock_gettime(CLOCK_MONOTONIC, {2056, 726960758}) = 0
    select(6, [5], [], [], {0, 999966}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2057, 728139687}) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    wait4(2420, 0x7fff5e21923c, WNOHANG, NULL) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2057, 728415349}) = 0
    clock_gettime(CLOCK_MONOTONIC, {2057, 728449740}) = 0
    select(6, [5], [], [], {0, 999965}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2058, 729603479}) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    wait4(2420, 0x7fff5e21923c, WNOHANG, NULL) = 0
    select(6, [5], [], [], {0, 0}) = 0 (Timeout)
    clock_gettime(CLOCK_MONOTONIC, {2058, 729772152}) = 0
    clock_gettime(CLOCK_MONOTONIC, {2058, 729805878}) = 0
    select(6, [5], [], [], {0, 999966}) = 0 (Timeout)

This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development. This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate, in the next release of Red Hat Enterprise Linux.
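Since the bug was closed without a fix, callers that fork may want to guard against a child that never exits. In the strace output in this report, the parent repeatedly calls wait4() with WNOHANG while the child sits in a futex wait. A minimal sketch of the same non-blocking wait with a deadline added might look like the following. The `wait_with_timeout` helper is hypothetical and not part of forktest.rb or Ruby itself, and `Process.clock_gettime` requires a much newer Ruby than the 1.8.7 discussed here.

```ruby
# Hypothetical helper: wait for a child with a deadline instead of blocking
# forever. Returns the child's Process::Status, or nil if it had to be killed.
def wait_with_timeout(pid, timeout, poll_interval = 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    # With WNOHANG, Process.wait returns nil while the child is still running.
    return $? if Process.wait(pid, Process::WNOHANG)

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill("KILL", pid) # give up on the stuck child
      Process.wait(pid)         # reap it so no zombie is left behind
      return nil
    end
    sleep poll_interval
  end
end
```

For example, `wait_with_timeout(fork { sleep 60 }, 0.2)` gives up after roughly 0.2 seconds and returns nil, while a child that exits promptly yields its normal exit status.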