Red Hat Bugzilla – Bug 432601
ia32el: 32-bit application (had) causes system freeze on ia32el-1.6-14.EL4
Last modified: 2015-05-04 21:33:43 EDT
A 32-bit application (the High Availability daemon (had) of Veritas Cluster
Server) runs fine with an earlier version of the ia32el package [ia32el-1.2.4].
However, the version of ia32el shipped with rhel4u4/u5 [ia32el-1.6-14.EL4]
causes a system hang when the application is run.
The problem is seen only when the number of CPUs is 1. Also, when the
application is run under strace or gdb, the application starts without any
Some locks in IA-32 EL are implemented with an atomic cmpxchg and sched_yield().
HAD is set as a real-time thread and spins on an internal lock used by IA-32 EL
(because IA-32 EL executes code on behalf of HAD), while the lock is held by
another thread with low priority (the so-called Translation Thread created by
IA-32 EL). sched_yield() only gives way to threads of equal or higher priority,
so on a single CPU the low-priority holder never gets to run. As long as the
Translation Thread does not release the lock, the real-time thread keeps
running endlessly and the system appears to freeze.
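To illustrate the kind of lock described above, here is a minimal sketch in C. It is not IA-32 EL source code: the names (yield_lock, yield_lock_acquire, run_demo and so on) are hypothetical, and C11 atomics stand in for the raw cmpxchg instruction. The demo runs two default-priority threads, which is safe; the hang described in this bug would need the waiter to be a real-time thread on a single CPU, which the sketch deliberately does not set up.

```c
/* Sketch only: a cmpxchg + sched_yield() spin lock. All names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

typedef struct { atomic_int held; } yield_lock;   /* 0 = free, 1 = held */

static void yield_lock_acquire(yield_lock *l)
{
    int expected = 0;
    /* Atomic compare-and-exchange: succeeds only if the lock is free. */
    while (!atomic_compare_exchange_weak(&l->held, &expected, 1)) {
        expected = 0;   /* a failed exchange stores the current value here */
        /* A real-time caller only yields to threads of equal or higher
         * priority, so a lower-priority lock holder on the same CPU
         * would never run and this loop would never end. */
        sched_yield();
    }
}

static void yield_lock_release(yield_lock *l)
{
    assert(atomic_load(&l->held) == 1);   /* must be held when released */
    atomic_store(&l->held, 0);
}

enum { ITERATIONS = 100000 };
static yield_lock demo_lock;      /* static storage: starts out free (0) */
static long demo_counter;         /* protected by demo_lock */

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        yield_lock_acquire(&demo_lock);
        demo_counter++;
        yield_lock_release(&demo_lock);
    }
    return NULL;
}

/* Runs two default-priority threads and returns the final counter value. */
static long run_demo(void)
{
    pthread_t a, b;
    demo_counter = 0;
    pthread_create(&a, NULL, demo_worker, NULL);
    pthread_create(&b, NULL, demo_worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return demo_counter;
}
```

Calling run_demo() from a main() should return 2 * ITERATIONS when both threads run at the default scheduling policy. A lock that blocks in the kernel, such as a pthread mutex with priority inheritance, would be one way to avoid the starvation, but the thread does not say which fix Intel chose.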
For this specific application (HAD), the Translation Thread is the feature that
exposes the issue: because it turns a single-threaded program into a
multi-threaded one, the spin lock used internally by IA32EL becomes a problem.
For genuinely multi-threaded applications, however, this kind of lock can be a
problem even if there is no Translation Thread within IA32EL, so we plan to
provide a definitive fix for this problem in the version of IA32EL currently
in development.
We have disabled the Translation Thread in the IA32EL shipped with RHEL5.1. As
a temporary workaround, we recommend that customers use IA-32 EL on RHEL5.1.
The way I read this is that one can work around this problem by using the ia32el
package that is shipped with RHEL5. Assuming so, then perhaps the easiest way
to resolve this bug is to document this in a knowledge base article. I am also
bearing in mind that Intel is shipping dual cores across the entire product
line, so the case of cpus=1 is rather rare.
One correction for you: the workaround should be to use the ia32el package from
RHEL 5 U1. And yes, we'd like to document it in a knowledge base article, any process
Product Management has reviewed and declined this request. You may appeal this
decision by reopening this request.
The issue is waiting to be verified by customers
the RHEL4.7 release notes deadline is on June 17, 2008 (Tuesday). they will
undergo a final proofread before being dropped to translation, at which point no
further additions or revisions will be entertained.
a mockup of the RHEL4.7 release notes can be viewed here:
please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
Can you provide us with an update?
we have already got the workaround: using ia32-el shipped in RHEL5 instead in
Gary Case (firstname.lastname@example.org) is currently verifying the workaround with the
we could close this bug now.
Don (and others), do you think we need to document the workaround in Comment #14
in the release notes before closing this out?
Yes, please. Please note that users need this workaround only if threads of
their application use real-time priority.
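For anyone unsure whether their application falls into this category, a check along the following lines may help. This is a minimal sketch, assuming that "real time priority" here means the POSIX SCHED_FIFO and SCHED_RR policies; the function names are hypothetical and it only inspects the calling thread, so it would have to be called from each thread of interest.

```c
/* Sketch only: report whether the calling thread uses a real-time policy. */
#include <pthread.h>
#include <sched.h>

/* Returns 1 for the POSIX real-time policies, 0 for anything else. */
static int is_realtime_policy(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Returns 1 if the calling thread is real-time, 0 if not, -1 on error. */
static int calling_thread_is_realtime(void)
{
    int policy;
    struct sched_param param;

    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return -1;
    return is_realtime_policy(policy);
}
```

A thread for which calling_thread_is_realtime() returns 1 is the kind that could spin forever on the lock described earlier in this bug.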
Can you please make a specific suggestion of how we should word the release note?
How about the following?
An x86 application with one or more SCHED_PR threads may hang due to a bug in
IA-32 EL V6 shipped with this OS release. The workaround is to use IA-32 EL V6
Update 1 shipped with RHEL 5.
Partners, I would like to thank you all for your participation in assuring the
quality of this RHEL 4.7 Update Release. My hat's off to you all. Thanks.
Intel will fix this regression in the latest IA-32EL release,