Bug 464019 - Fedora 8 64 bit and Fedora 9 32 Bit freezes by using boinc with 4 parallel seti@home
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-09-25 17:41 EDT by Kai Neumann
Modified: 2008-10-14 03:19 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-10-14 03:19:23 EDT

Attachments
cpuinfo (2.62 KB, text/plain)
2008-09-25 17:49 EDT, Kai Neumann
dmidecode (11.18 KB, text/plain)
2008-09-25 17:49 EDT, Kai Neumann

Description Kai Neumann 2008-09-25 17:41:10 EDT
Description of problem:
Fedora 8 (64-bit) and Fedora 9 (32-bit) freeze completely when I run 4 SETI@home tasks in parallel under BOINC on my Core 2 Quad.
All kernels newer than 2.6.24.7-92.fc8 have this bug.
On Fedora 8 (64-bit) I stayed on this kernel for 3 months instead of moving to the newer kernels, because this kernel does not freeze the system.
Tested under KDE and GNOME. sshd freezes too.

Hardware:
CPU: model name	: Intel(R) Core(TM)2 Quad  CPU   Q9450  @ 2.66GHz
Mainboard: Gigabyte P35-DS4 Firmware 
Ram: 4 GB

I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
I have used 4 parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
0. I don't know whether a Core 2 Duo freezes too. I only have a Core 2 Quad.
1. Download BOINC from http://boinc.berkeley.edu/download.php, add SETI@home, and run 4 SETI@home tasks in parallel. Fedora 9 freezes completely somewhere between 0 and 16 hours later.
2.
3.
  
Actual results:


Expected results:


Additional info:
Kernel Panic info: 

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:PANIC: double fault, gdt at c3027000 [255 bytes]

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:double fault tss at c302a680

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:eax=00000000, ebx=fff7c000, ecx=00000000, edx=00000000

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:eip=f6564f61, esp=000032f4

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:esi=f6564ec8, edi=c048b152

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:journal commit I/O error
Comment 1 Kai Neumann 2008-09-25 17:49:15 EDT
Created attachment 317738 [details]
cpuinfo
Comment 2 Kai Neumann 2008-09-25 17:49:55 EDT
Created attachment 317739 [details]
dmidecode
Comment 3 Kai Neumann 2008-09-25 17:51:05 EDT
Mainboard firmware: Version F12, Release Date 02/27/2008
Comment 4 Kai Neumann 2008-09-25 17:57:47 EDT
My Kernel:
uname -a
Linux localhost.localdomain 2.6.26.3-29.fc9.i686 #1 SMP Wed Sep 3 03:42:27 EDT 2008 i686 i686 i386 GNU/Linux
Comment 5 Dave Jones 2008-09-25 17:59:45 EDT
Crashes like this under high CPU load are nearly always hardware related.
Insufficient power and/or cooling are the usual suspects, though bad RAM has
also triggered such crashes.

memtest86 may pick up on something, but I'm not sure if it stresses all cores
simultaneously as BOINC does, so it may not be as effective a load test.
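
For illustration only, a minimal Python sketch of what "stressing all cores simultaneously" could look like: it starts one busy floating-point worker process per core and waits for all of them. The names `WORKER` and `load_all_cores` are hypothetical, and the arithmetic is an arbitrary stand-in. It does not reproduce the memory-access pattern of SETI@home or cpuburn, so a clean run proves little about the hardware.

```python
import os
import subprocess
import sys

# Busy loop executed by each worker process; argv[1] is the duration in seconds.
WORKER = """
import sys, time
deadline = time.monotonic() + float(sys.argv[1])
x = 1.0001
while time.monotonic() < deadline:
    for _ in range(10000):
        x = (x * x) % 1.7 + 1.0001
"""


def load_all_cores(seconds, workers=None):
    """Start one busy worker per core, wait for all, return their exit codes."""
    workers = workers or os.cpu_count() or 1
    procs = [
        subprocess.Popen([sys.executable, "-c", WORKER, str(seconds)])
        for _ in range(workers)
    ]
    return [p.wait() for p in procs]
```

Calling `load_all_cores(3600)` would keep every core busy for about an hour. A worker that dies early shows up as a non-zero exit code in the returned list.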
Comment 6 Kai Neumann 2008-09-25 18:08:27 EDT
Please read the description:
I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
I have used 4 parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.
Comment 7 Chuck Ebbert 2008-09-29 23:37:19 EDT
(In reply to comment #6)
> Please read the description:
> I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
> I have used 4 parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.

Do you always get that same double fault panic when it freezes?
Comment 8 Kai Neumann 2008-09-30 03:25:00 EDT
I don't know. This was the first time I had a terminal open to see the kernel panic. The panic info is not in the log.
I wrote it down on paper, then copied it to pastebin and here.

I am now using a special version for my processor with SSSE3 support. It has been calculating for over 30 hours without a freeze.

The special version can be downloaded from http://calbe.dw70.de/linux32.html
It is the one called "AK V8 Linux x32 SSSE3 Intel only".

I will test this version for the whole week.
If I get no freezes, the problem is in the SETI binary, not the kernel.
Comment 9 Kai Neumann 2008-10-02 12:24:24 EDT
The special version ran for over 3 days without freezes. Today I had one freeze during the night and another an hour ago.

During the first one the screen was black and I was connected remotely via ssh.
I could not see any error.
There is no error in /var/log/messages either.
Comment 10 Chuck Ebbert 2008-10-09 08:38:45 EDT
If you are not getting the same kernel error every time it fails then it's probably a hardware problem.

Monitor the CPU temperature when running your cpuburn program and compare it to the temperature when running SETI. Some of the cpuburn programs don't really stress the system as much as a real application like SETI.
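
A rough sketch of how such a comparison could be scripted in Python. It assumes a Linux system that exposes readings as millidegrees Celsius in `/sys/class/thermal/thermal_zone*/temp`. That layout is an assumption about the machine: many systems, especially with older kernels, report CPU temperature only through hwmon drivers and the lm_sensors `sensors` tool. The function names `read_temps` and `peak` are hypothetical.

```python
from pathlib import Path


def read_temps(root="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under root."""
    temps = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone without a usable reading
        temps[zone.name] = millideg / 1000.0
    return temps


def peak(samples):
    """Highest temperature seen per zone across a list of read_temps() results."""
    result = {}
    for sample in samples:
        for name, value in sample.items():
            result[name] = max(value, result.get(name, value))
    return result
```

Collecting `read_temps()` every few seconds during a cpuburn run and again during a SETI run, then comparing the two `peak(...)` results, would show whether SETI drives the CPU hotter.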
Comment 11 Kai Neumann 2008-10-09 09:05:14 EDT
Hi,
I think it was a RAM timing problem.
I use Corsair XMS2 with 4-4-4-12 timings at 2.1 V.
The modules are certified for these timings at this voltage.

I am now using the standard timings 5-5-5-16 at 1.8 V and the system runs:
 uptime
 15:04:11 up 4 days, 10:53,  3 users,  load average: 4.51, 4.64, 4.77

I think this was the problem.
But I don't know why nothing works any more with 4-4-4-12 at 2.1 V. Memtest checked the RAM with these timings and did not find a single error. All 16 passes were OK.
