Description of problem:
While running the KernelTier1 test /kernel/misc/ktst_msg, we see failures.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install the RHEL5.1 tree RHEL5.1-Server-20070725.0 and create a xen guest. Then run
the following RHTS test: rh-tests-kernel-misc-ktst_msg
This test _ONLY_ fails on a xen guest. It works as expected on a Dom0.
I have done the investigation.
This is not specific to ia64 xen guests; it is generic to ia64. I think there
are three possible ways to resolve this issue:
1) Increase the memory size
2) Create a swap device whose size is more than 2GB
3) Run 'ulimit -Hs unlimited' before running the program
The failure occurred only on the ia64 xen guest because the guest met all three
of the following conditions:
1) the memory size for the guest was 512MB (Per my quick test, about 700MB or
less is required.)
2) the swap size for the guest was less than 2GB
3) the default "Hard limit" for processes on the guest was 2097152k (not
(If the three conditions are met, the failures occur on other arches.)
The first two can be applicable to all arches, but the third one is ia64
specific (applies to linux, dom0 and guest.) The default value of "Hard limit"
for processes is somehow 2097152k on ia64. ('unlimited' is set for other
arches.) So, in that sense, this can be considered as an ia64 generic thing.
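To check which case a given machine falls into, the stack limits can be read directly. The following is only a minimal sketch using Python's standard `resource` module; the helper names `format_limit` and `stack_limits` are made up for illustration, and the "k" suffix just mirrors the notation used in this report.

```python
import resource


def format_limit(value_bytes):
    """Render a limit as KiB with a 'k' suffix, or 'unlimited'."""
    if value_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value_bytes // 1024}k"


def stack_limits():
    """Return the (soft, hard) stack limits of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    soft, hard = stack_limits()
    print(f"stack soft limit: {soft}")
    print(f"stack hard limit: {hard}")
```

On an affected ia64 system the hard limit would be expected to show up as 2097152k, while other arches would be expected to report 'unlimited'.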
When a 'make' command is invoked, it raises its "Soft limit" to the value of the
"Hard limit" (maybe that is the specified behavior of the make command), and that
causes the failures. pthread_create() tries to allocate (mmap) memory equal in
size to the "Soft limit", and there is not enough memory, so the call fails.
However, it seems that the allocation completes successfully if there is more
than 2GB of swap, which is larger than the "Hard limit" size. I do not yet have
any clues as to why the swap size matters.
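The make and pthread_create() behavior described above can be modeled roughly as follows. This is a sketch under stated assumptions, not the actual make or glibc code: the function names are hypothetical, the 8 MiB starting soft limit and the 8 MiB fallback used when the limit is unlimited are assumed values chosen only for illustration, and the model ignores any rounding the thread library may apply.

```python
import resource

# Assumed fallback stack size when the soft limit is unlimited.
# The real per-arch default may differ; this value is illustrative only.
ASSUMED_ARCH_DEFAULT = 8 * 1024 * 1024


def limits_after_make(soft, hard):
    """Model make raising the soft stack limit to the hard limit."""
    return hard, hard


def thread_stack_request(soft, arch_default=ASSUMED_ARCH_DEFAULT):
    """Model the stack size pthread_create() asks mmap for by default."""
    if soft == resource.RLIM_INFINITY:
        return arch_default
    return soft


if __name__ == "__main__":
    initial_soft = 8 * 1024 * 1024  # assumed starting soft limit

    # ia64 default from this report: hard limit of 2097152k.
    soft, hard = limits_after_make(initial_soft, 2097152 * 1024)
    print("ia64 default hard limit:", thread_stack_request(soft), "bytes per thread")

    # Hard limit set to unlimited, as with 'ulimit -Hs unlimited'.
    soft, hard = limits_after_make(initial_soft, resource.RLIM_INFINITY)
    print("unlimited hard limit:", thread_stack_request(soft), "bytes per thread")
```

Under this model a 2097152k hard limit turns into a 2 GiB mmap request per thread, whereas an unlimited hard limit falls back to a much smaller default, which is consistent with 'ulimit -Hs unlimited' avoiding the failure.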
Thanks to Kei for the comprehensive investigation. Looking at kernel.org, it
appears that the default stack hard limit was removed earlier this year:
This patch brings ia64 into alignment with x86 and x86_64, neither of which has
a hard stack limit by default. Also, Debian stable runs 2.6.18 with this patch,
so it appears to work as expected on the 2.6.18 kernel (we're using Debian
stable on our lab server at HP).
I'm building kernels tonight with this modification and will make them available
for testing in the morning.
Test kernels available at http://free.linux.hp.com/~agriffis/rhel5/bz251043/
I posted the patch for review on rhkernel-list. In that message I also requested
help getting this submitted to RHTS, since I haven't done that before. When that's
done, I'll change the status to POST.
With help from Kei, I was able to run the RHTS test on dom0 on my
2.6.18-45.el5xen (unpatched) -- FAIL
2.6.18-47.el5.bz251043.agriffis.1xen (patched) -- PASS
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
fv_* tests all failed on 2.6.18-227.el5xen.
#virsh console v7ia64
Output for fv_core testing:
+----------ia64 CPU info end----------+
Checking clock jitter ...
Single CPU detected. No clock jitter testing necessary.
clock direction test: start time 1288605280, stop time 1288605340, sleeptime 60, delta 0
audispd invoked oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
Swap cache: add 45947, delete 45947, find 158/264, race 0+0
Free swap = 0kB
Total swap = 720864kB
Out of memory: Killed process 2278 (stress).
stress: page allocation failure. order:0, mode:0x280d2
Output for fv_memory testing:
Starting Threaded Memory Test
running for more than free memory at 195 MB for 60 sec.
automount invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap = 0kB
Total swap = 0kB
Out of memory: Killed process 2270 (threaded_memtes).