Bug 160121 - hyperthreading causes BUG() in exec.c
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Ingo Molnar
QA Contact: Brian Brock
Reported: 2005-06-10 17:57 EDT by sampath gajawada
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2005-10-25 03:48:09 EDT


Attachments
windows text file (2.75 KB, text/plain)
2005-06-10 17:57 EDT, sampath gajawada
kernel error message in dmesg (3.11 KB, text/plain)
2005-06-22 14:28 EDT, Kenny Yu

Description sampath gajawada 2005-06-10 17:57:07 EDT
Description of problem:

We are running the 64-bit version of Fedora Core 3 on an Intel Xeon processor.
An application executed remotely from another node via rexec failed when it
spawned another process with an exec call. The attached file contains the error
log collected from the messages file.

Version-Release number of selected component (if applicable):
Fedora Core release 3 (Heidelberg)
kernel version: 2.6.11-1.27_FC3smp #1 SMP Tue May 17 20:38:05 EDT 2005 x86_64 
x86_64 x86_64 GNU/Linux


How reproducible:

Very often, but not every time.

Steps to Reproduce:
1. Launch a program on the working node remotely using rexec.
2. Launch another program from this program using exec.

Actual results:
The exec call fails intermittently, and an error is logged in
/var/log/messages (see the attached file).

Expected results:
The program should start the second program via exec without failing. It works
fine on Red Hat 9.


Additional info: Please check the attached file for the output
from the /var/log/messages file.
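
The reproduction steps in this description can be approximated with a small loop that repeatedly starts a second program and records every failure. This is only a sketch under assumptions: the reporter's actual programs are not shown in this report, so the child command and the function name `count_exec_failures` are hypothetical stand-ins, and the remote rexec launch is left out.

```python
import subprocess
import sys


def count_exec_failures(argv, attempts):
    """Run argv `attempts` times; return (attempt, returncode) for each failure."""
    failures = []
    for attempt in range(attempts):
        # subprocess.run starts argv in a child process and waits for it to exit.
        result = subprocess.run(argv)
        if result.returncode != 0:
            failures.append((attempt, result.returncode))
    return failures


if __name__ == "__main__":
    # Hypothetical stand-in for the second program: a child that exits at once.
    child = [sys.executable, "-c", "pass"]
    print(count_exec_failures(child, 20))
```

On POSIX systems a negative return code means the child was terminated by a signal, so any such entry is worth comparing against the kernel log taken at the same time.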
Comment 1 sampath gajawada 2005-06-10 17:57:08 EDT
Created attachment 115314
windows text file
Comment 2 Kenny Yu 2005-06-22 14:28:14 EDT
Created attachment 115828
kernel error message in dmesg

Kernel BUG at "fs/exec.c":776
invalid operand: 0000 [1] SMP
Comment 3 Kenny Yu 2005-06-22 14:34:08 EDT
The same random program crashes happen on FC3/2.6.11-1.14smp and
FC4/2.6.11-1.1369_FC4smp. Sometimes re-executing the same program works twice
but fails the third time. The problem seems to happen only on
Intel/EM64T systems. The only common error message is the complaint about a kernel
bug at "fs/exec.c":776.
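
Since that message is the only common symptom, a quick way to confirm that a machine is hitting the same problem is to scan saved dmesg or /var/log/messages text for it. The sketch below assumes the message format quoted in comment 2; the function name `find_exec_bug_lines` is hypothetical.

```python
import re

# Matches lines such as: Kernel BUG at "fs/exec.c":776
# The quotes are optional in case another kernel prints the path without them.
BUG_PATTERN = re.compile(r'Kernel BUG at "?fs/exec\.c"?:(\d+)')


def find_exec_bug_lines(log_text):
    """Return the fs/exec.c source line number of every BUG report in log_text."""
    return [int(match.group(1)) for match in BUG_PATTERN.finditer(log_text)]


if __name__ == "__main__":
    sample = 'Kernel BUG at "fs/exec.c":776\ninvalid operand: 0000 [1] SMP\n'
    print(find_exec_bug_lines(sample))
```

The source line number may differ between kernel builds, so the sketch returns whatever number appears instead of matching 776 only.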
Comment 4 Dave Jones 2005-07-15 14:34:17 EDT
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.
Comment 5 Kenny Yu 2005-07-21 13:52:15 EDT
Thank you, Dave.
I recently tried the updated kernels for FC4, including 2.6.12-1.1387_FC4smp,
2.6.12-1.1390_FC4smp and 2.6.12-1.1398_FC4smp; the problem remains
no matter which kernel version is used.
However, it seems that I have found the solution to this problem. Just DISABLE
Hyperthreading in the BIOS before booting the system! Since I switched off the
Hyperthreading option, this random 'Kernel BUG at "fs/exec.c"' has never appeared
again, and all the formerly failing tasks finish normally, no matter which
kernel version is used. The solution has been verified on motherboards
including the Intel SE7525GP2, Asus NCCH-DL and IWILL DH800. As long as HT
is off in the BIOS, everything works normally. It is weird that this problem only
happens on Intel/EM64T systems.
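
To confirm from the running system that the BIOS setting took effect, one option is to compare the `siblings` and `cpu cores` fields of /proc/cpuinfo. This is a sketch under assumptions: both fields may be missing on older kernels, in which case the function returns None, and the function name `hyperthreading_active` is hypothetical.

```python
def hyperthreading_active(cpuinfo_text):
    """Return True if siblings > cpu cores, False if not, None if fields are missing."""
    siblings = None
    cores = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key = key.strip()
        value = value.strip()
        if key == "siblings" and value.isdigit():
            siblings = int(value)
        elif key == "cpu cores" and value.isdigit():
            cores = int(value)
    if siblings is None or cores is None:
        return None  # assumption: this kernel does not expose both fields
    # More logical CPUs per package than physical cores suggests HT is on.
    return siblings > cores


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        print(hyperthreading_active(handle.read()))
```

A result of False after the BIOS change is consistent with hyperthreading being off; None means the check is inconclusive on that kernel.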

Comment 6 Ingo Molnar 2005-09-15 15:38:58 EDT
this particular BUG_ON() in exec.c is buggy, and has been removed in Linus' tree
as of today. Please check whether it still occurs in the next update kernel.
Comment 7 Dave Jones 2005-09-24 01:39:25 EDT
backported to 2.6.12 FC3 kernel. Will be in next update.
Comment 8 Kenny Yu 2005-10-17 08:23:12 EDT
(In reply to comment #6)
> this particular BUG_ON() in exec.c is buggy, and has been removed in Linus' tree
> as of today. Please check whether it still occurs in the next update kernel.

Using vanilla kernel 2.6.13.4 fixes this bug on both kinds of 64-bit systems,
namely "EM64T with hyperthreading enabled in BIOS" and "Athlon64x2 Dual Core
CPUs". The former crashes from fs/exec.c no longer appear. Looking
forward to FC4's official kernel update. Thank you.
Comment 9 Fedora Update System 2005-10-20 10:29:26 EDT
kernel-2.6.12-1.1380_FC3 has been pushed for FC3, which should resolve this issue.  If these problems are still present in this version, then please make note of it in this bug report.
