Description of problem:
/tset/LSB.os/mfiles/mmap_P/T.mmap_P hangs during an LSB test run on x86 platform.

Version-Release number of selected component (if applicable):
HP Proliant DL 560 4CPU Xeon 2GHz/3GB memory
  $ rpm -q redhat-lsb lsb-runtime-test
  redhat-lsb-1.3-3
  lsb-runtime-test-1.3.6-3
Installed tree: RHEL3-U1-re1211.nightly/i386/i386-as Everything.
kernel: Linux pro4.lab.boston.redhat.com 2.4.21-6.ELsmp #1 SMP Tue Dec 9 14:53:23 EST 2003 i686 i686 i386 GNU/Linux

How reproducible:
I am re-running a full LSB run now, but using a reduced scen.exec file I was able to reproduce this every time.

Steps to Reproduce:
I ran this through the Test Grid but it can be run by hand as follows:
  # rpm -ivh lsb-runtime-test-1.3.6-3.i386.rpm
o Set the password for the vsx0 user (typically set to vsx0).
  # passwd vsx0
o Start the LSB tests. Takes about 6-8 hours to run depending on your system. Login as the vsx0 user.
  IMPORTANT: Actually log in as user vsx0 ---- DO NOT ---- "su - vsx0"
  $ run_tests
  Take defaults for all except where input is required.

Actual results:
/tset/LSB.os/mfiles/mmap_P/T.mmap_P hangs

Expected results:
/tset/LSB.os/mfiles/mmap_P/T.mmap_P completes allowing harness to move onto further tests.

Additional info:
This regression is still in the latest build tree.
Installed tree: RHEL3-U1-re0107.0/i386/i386-as Everything
  # rpm -q redhat-lsb lsb-runtime-test
  redhat-lsb-1.3-3
  lsb-runtime-test-1.3.6-3
This regression is still present with U2.
  kernel-2.4.21-9.10.EL
  redhat-lsb-1.3-3
  lsb-runtime-test-1.3.6-3
Donald - Can you add this to the U2 Blocker list?
The two T.mmap_P processes are not actually hung but rather are chewing up CPU like crazy in the signal path. They both have signal handlers registered for SIGSEGV and are apparently faulting on a bad address in user space. The signal handler fires and returns, and the bad dereference is then re-executed, causing the cycle to continue indefinitely. From the evidence, I cannot determine whether this is a test design problem or a latent issue elicited by a kernel regression, but I lean toward the former given the failure mode. Bill, can you help me obtain the sources for this test?
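For anyone following along, the fault/handler/re-fault cycle described above can be reproduced in isolation. The following is only a minimal sketch, not the LSB test source (which was not available at this point in the report). It assumes Linux behavior, where returning normally from a SIGSEGV handler restarts the faulting instruction (POSIX leaves that case undefined). The names count_restarted_faults, on_segv, fault_limit and escape are hypothetical, and the siglongjmp escape hatch exists only so the sketch terminates.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;   /* handler invocations so far */
static volatile sig_atomic_t fault_limit;   /* when to break out of the loop */
static sigjmp_buf escape;

/* Hypothetical handler: a plain return lets the kernel restart the
 * faulting store, which faults again and re-enters this handler. */
static void on_segv(int sig)
{
    (void)sig;
    fault_count++;
    if (fault_count >= fault_limit)
        siglongjmp(escape, 1);   /* escape hatch so the sketch terminates */
}

/* Returns how many times the handler ran for one bad store,
 * or -1 if the setup failed. */
int count_restarted_faults(int limit)
{
    struct sigaction sa, old;
    long pagesize = sysconf(_SC_PAGESIZE);
    volatile char *page;

    assert(limit > 0);
    fault_count = 0;
    fault_limit = limit;

    /* A page with no access rights: any store to it raises SIGSEGV. */
    page = mmap(NULL, (size_t)pagesize, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap((void *)page, (size_t)pagesize);
        return -1;
    }

    /* Save the signal mask too, so SIGSEGV is unblocked after the jump. */
    if (sigsetjmp(escape, 1) == 0)
        page[0] = 1;             /* the single bad dereference */

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)page, (size_t)pagesize);
    return (int)fault_count;
}
```

On Linux, count_restarted_faults(5) should report five handler invocations for a single store statement. Without the limit check in the handler the loop never ends and the process spins in the signal path, which would be consistent with the CPU usage seen in the T.mmap_P processes.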
A copy of the tests can be obtained via:
  # wget -N \
    http://tg1.boston.redhat.com/tg/tests/lsb-runtime-test-1.3.6-3.i386.rpm
Install the downloaded rpm:
  # rpm -ivh lsb-runtime-test-1.3.6-3.i386.rpm
The sources are installed from the rpm in the vsx0 user account:
  # su - vsx0
This bug report is ancient. Is it still a relevant issue?
Ping?
Haven't yet had time to investigate this.
Martin, could you please set up this test for me on a test machine with a RHEL3 U5 environment? I would like to see if the problem is still reproducible and (if so) evaluate the test source code. Please send me the name of the machine and access instructions, and hopefully I'll be able to investigate this within the next few business days. Thanks in advance. -ernie
Also, is it okay if I make this bug public?
Ernie, this is still reproducible; I am not sure about making this public. I have set up system edge1.lab to reproduce the hang:
  $ ssh vsx0    (NOTE: login directly as the user, do not su - vsx0)
  $ run_tests
The hang will reproduce within 2 minutes.
FYI: I created a trimmed-down LSB-1.3 test input scen.exec file (to speed up reproduction of this bug) as follows:
  $ ssh vsx0
  $ mv scen.exec scen.exec.save
  $ echo all > scen.exec
  $ grep mfiles scen.exec.save >> scen.exec
If you login as root in another window and kill the hanging T.mmap_P processes, the remaining tests will finish in about 2 minutes.
It seems that edge1.lab.boston.redhat.com is being identified as 192.168.76.147, which is not the IP address currently being used by the machine. What gives?
Martin, I've worked around the DNS issue by putting this in my /etc/hosts:
  192.168.78.50 edge1.lab.boston.redhat.com edge1
I've been able to reproduce the test hang by using the scen.exec version that you created (thanks!). So now I need to locate the source code for the test (and be able to rebuild the test executable with various debugging code). Where can I find the source? The executable resides in /home/tet/test_sets/TESTROOT/tset/LSB.os/mfiles/mmap_P.
The source can be found at: http://ftp.freestandards.org/pub/lsb/test_suites/released-1.3.0/source/runtime/
Please see bug 106330 comment #3 for an explanation of the bug in the LSB test source code. Closing as dup of a NOTABUG. *** This bug has been marked as a duplicate of 106330 ***