Description of problem: When the CORE and MEMORY tests are run from HTS on a fully virtualized, 32-bit RHEL 5 guest, the tests fail. When the tests are run manually, they pass. This has usually been seen on systems with relatively little memory (1-2 GB) assigned to the guest. Increasing the amount of memory sometimes allows the tests to pass and sometimes does not. When the tests fail, no OOM messages are seen.
output.log from core test:

Running ./CORE2:
Checking clock jitter ...
Single CPU detected. No clock jitter testing necessary.
clock direction test: start time 1207927725, stop time 1207927785, sleeptime 60, delta 0
PASSED
stress: FAIL: [29038] (416) <-- worker 29065 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29059 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29056 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29047 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29050 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29053 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29041 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29062 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29071 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29068 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29074 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (416) <-- worker 29044 got signal 11
stress: WARN: [29038] (418) now reaping child worker processes
stress: FAIL: [29038] (422) kill error: No such process
stress: FAIL: [29038] (452) failed run completed in 600s
stress: info: [29038] dispatching hogs: 12 cpu, 12 io, 12 vm, 0 hdd
Error: stress returned failure (12)
...finished running ./CORE2, exit code=12
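When triaging output like the above, it can help to tally which workers died and on which signal. The following is a minimal sketch, not part of HTS or stress: the function names `summarize_stress_log` and `signal_name` are hypothetical, and the regular expression assumes the "worker N got signal M" wording seen in this log.

```python
import re
import signal
from collections import defaultdict

# Assumes the message wording seen in the log above:
#   "stress: FAIL: [29038] (416) <-- worker 29065 got signal 11"
WORKER_RE = re.compile(r"worker (\d+) got signal (\d+)")


def summarize_stress_log(text):
    """Group worker PIDs by the signal number reported for them."""
    by_signal = defaultdict(list)
    for pid, signum in WORKER_RE.findall(text):
        by_signal[int(signum)].append(int(pid))
    return dict(by_signal)


def signal_name(signum):
    """Return the symbolic name for a signal number, if the platform knows it."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "signal %d" % signum


# Three lines copied from the log above, used as sample input.
sample = (
    "stress: FAIL: [29038] (416) <-- worker 29065 got signal 11\n"
    "stress: WARN: [29038] (418) now reaping child worker processes\n"
    "stress: FAIL: [29038] (416) <-- worker 29059 got signal 11\n"
)

summary = summarize_stress_log(sample)
for signum, pids in sorted(summary.items()):
    print(signal_name(signum), len(pids), pids)
```

Run against the full output.log, this would show whether every failed worker died on the same signal or whether the failures are mixed.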
Created attachment 304975: System log from when the test ran and failed
Random comments... Signal 11 (SIGSEGV) is a memory failure of some sort, be it exhaustion or otherwise. Looking at the test system, it appears that without an adequate amount of swap (1.2 GB RAM with 300 MB swap looks less than ideal), the memory test will get OOM-killed. The core test, however, looks a bit trickier, and adding swap does not resolve it.
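To compare swap against RAM on a guest before running the tests, the figures can be read from /proc/meminfo. This is a rough sketch under stated assumptions: `parse_meminfo` and `swap_to_ram_ratio` are hypothetical helper names, and the sample values only approximate the 1.2 GB / 300 MB figures mentioned in the comment above; they were not taken from the test system. No pass/fail threshold is implied.

```python
import re

# Matches /proc/meminfo lines such as "MemTotal:   1258291 kB".
MEMINFO_RE = re.compile(r"^(\w+):\s+(\d+)", re.MULTILINE)


def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    return {key: int(value) for key, value in MEMINFO_RE.findall(text)}


def swap_to_ram_ratio(info):
    """Return SwapTotal divided by MemTotal (both reported in kB)."""
    return info["SwapTotal"] / info["MemTotal"]


# Illustrative values only, approximating 1.2 GB RAM and 300 MB swap.
sample = (
    "MemTotal:      1258291 kB\n"
    "MemFree:        204800 kB\n"
    "SwapTotal:      307200 kB\n"
    "SwapFree:       307200 kB\n"
)

info = parse_meminfo(sample)
print("RAM kB:", info["MemTotal"])
print("Swap kB:", info["SwapTotal"])
print("swap/RAM ratio: %.2f" % swap_to_ram_ratio(info))
```

On a live guest, the same functions could be fed the contents of /proc/meminfo, for example `parse_meminfo(open("/proc/meminfo").read())`.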
Closing as system configuration issue.