From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020606

Description of problem:
The IBM 1.3.1 JDK that ships with Advanced Server freezes on 7.2 and Advanced Server SMP machines. It runs fine on 7.2 and AS single-processor machines. Additionally, the JDK runs fine on 7.3 SMP and single-processor machines.

I have a dual-CPU box with 7.2. This box has glibc-2.2.4-24 and the 2.4.9-31smp kernel on it. When I run a Java process on the box, the process freezes.

I found a discussion at http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=acad4i%241coq%241%40news.boulder.ibm.com that suggested setting the LD_ASSUME_KERNEL=2.2.5 environment variable to address this issue. When I set this variable, the Java process runs and does not freeze. However, JDK performance is extremely slow with it set. Additionally, I see sporadic defunct Java processes remaining after I stop running Java. I do not see this behaviour with the 1.3.1 JDK on 7.3 SMP.

One other thing I have tried is to replace the 2.4.9-31smp kernel on the 7.2 box with a 2.4.18-5smp kernel from a 7.3 box. With this kernel, Java runs without LD_ASSUME_KERNEL=2.2.5 set. However, it is still slow.

So, there seems to be some glibc bug in 7.2/AS that prevents the JDK from running on SMP boxes. The 2.4.9-based SMP kernels with this glibc cause Java to freeze unless LD_ASSUME_KERNEL=2.2.5 is set. The 2.4.18-based kernels somehow do not freeze with this glibc, but performance is still extremely slow.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Run a Java program with the IBM 1.3.1 JDK

Actual Results:
The Java process freezes

Expected Results:
It should run

Additional info:
Created attachment 66025: This software will reproduce the JDK error using ant
The Read Me for the IBM Java SDK 1.3.1 states firmly that the environment variable LD_ASSUME_KERNEL must be set for SMP operation. I have included the relevant section below. This explains the "freeze" behaviour described in this bugzilla.

The technical justification for this restriction is as follows:

On Intel, there is a kernel bug such that registers can be corrupted when we switch context on an SMP. This is fixed in the 2.4.10 kernel. This affects the JIT such that we have to use an alternate method to store a pointer to our ExecEnv, but this alternate method does not work if Linux is using floating stacks. Thus, if you are on an IA32 SMP with floating stacks (e.g. Red Hat) with a 2.4.0-2.4.9 kernel, you have to turn off floating stacks using an environment variable.

The extract from the Read Me is:

Working with floating stacks

Certain Linux distributions - Red Hat, for example - have enabled a GLIBC feature called 'floating stacks'. Because of Linux kernel limitations, the JVM will not run on SMP hardware with floating stacks enabled if the kernel level is less than 2.4.10. In this environment, floating stacks must be disabled before the JVM, or any application that starts the JVM, is started. On Red Hat, you disable floating stacks by exporting an environment variable, thus:

    export LD_ASSUME_KERNEL=2.2.5

On a non-floating stack Linux system, regardless of what is set for -Xss, a minimum native stack size of 256KB for each thread is provided. On a floating stack Linux system, the -Xss values are honored. Thus, if you are migrating from a non-floating stack Linux system, you must ensure that any -Xss values are large enough and are not relying on a minimum of 256KB.
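For anyone who wants to apply the Read Me's rule only on affected machines, here is a minimal sketch in Python of the check it describes: SMP hardware plus a kernel older than the fixed level means LD_ASSUME_KERNEL=2.2.5 should be set before the JVM starts. The function names and the fixed_in parameter are hypothetical, introduced only for illustration. The default of 2.4.10 is taken from the Read Me; it is kept as a parameter because the exact kernel level of the fix is disputed in this report.

```python
import re


def parse_kernel_release(release):
    """Return (major, minor, patch) from a release string such as '2.4.9-31smp'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def needs_floating_stack_workaround(release, cpu_count, fixed_in=(2, 4, 10)):
    """True when the Read Me's condition holds: SMP and kernel below fixed_in.

    fixed_in is an assumption taken from the Read Me text, not a verified fact.
    """
    return cpu_count > 1 and parse_kernel_release(release) < fixed_in


def jvm_environment(base_env, release, cpu_count):
    """Return a copy of base_env, adding LD_ASSUME_KERNEL=2.2.5 when needed."""
    env = dict(base_env)
    if needs_floating_stack_workaround(release, cpu_count):
        # Disables glibc floating stacks, as the Read Me instructs for Red Hat.
        env["LD_ASSUME_KERNEL"] = "2.2.5"
    return env
```

A launcher could build the environment with `jvm_environment(os.environ, os.uname().release, os.cpu_count())` and pass the result as `env=` to `subprocess.run` when starting `java`. Note that with the variable set, the Read Me's warning about -Xss and the 256KB minimum native stack size applies.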
This readme is inaccurate, since it was not 2.4.10 where the LDT SMP problems were fixed, but 2.4.8 AFAIK.
This has not been updated in over a year. Is this still a problem since the most recent Quarterly Update (QU2)?
Someone (Jakub?) fixed the glibc bug and it was pushed out at least six months ago. We're not seeing this bug on our AS2.1 SMP machines any more.
As per comment #5, closing as resolved in CURRENTRELEASE.