Bug 464019 - Fedora 8 64 bit and Fedora 9 32 Bit freezes by using boinc with 4 parallel seti@home
Summary: Fedora 8 64 bit and Fedora 9 32 Bit freezes by using boinc with 4 parallel se...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-09-25 21:41 UTC by Kai Neumann
Modified: 2008-10-14 07:19 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-10-14 07:19:23 UTC
Type: ---
Embargoed:


Attachments
cpuinfo (2.62 KB, text/plain)
2008-09-25 21:49 UTC, Kai Neumann
dmidecode (11.18 KB, text/plain)
2008-09-25 21:49 UTC, Kai Neumann

Description Kai Neumann 2008-09-25 21:41:10 UTC
Description of problem:
Fedora 8 (64-bit) and Fedora 9 (32-bit) freeze completely when I run four SETI@home tasks in parallel under BOINC on my Core 2 Quad.
All kernels newer than 2.6.24.7-92.fc8 have this bug.
On Fedora 8 (64-bit) I stayed on that kernel for three months instead of moving to the newer kernels, because it does not freeze the system.
Tested under KDE and GNOME. sshd freezes too.

Hardware:
CPU: model name	: Intel(R) Core(TM)2 Quad  CPU   Q9450  @ 2.66GHz
Mainboard: Gigabyte P35-DS4 Firmware 
Ram: 4 GB

I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
I have used four parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
0. I don't know whether a Core 2 Duo freezes too. I only have a Core 2 Quad.
1. Download BOINC from http://boinc.berkeley.edu/download.php, add SETI@home, and run four SETI@home tasks in parallel. Fedora 9 freezes completely after somewhere between 0 and 16 hours.
2.
3.
  
Actual results:


Expected results:


Additional info:
Kernel Panic info: 

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:PANIC: double fault, gdt at c3027000 [255 bytes]

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:double fault tss at c302a680

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:eax=00000000, ebx=fff7c000, ecx=00000000, edx=00000000

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:eip=f6564f61, esp=000032f4

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:esi=f6564ec8, edi=c048b152

Message from syslog@localhost at Sep 15 13:38:36 ...
kernel:journal commit I/O error

Comment 1 Kai Neumann 2008-09-25 21:49:15 UTC
Created attachment 317738
cpuinfo

Comment 2 Kai Neumann 2008-09-25 21:49:55 UTC
Created attachment 317739
dmidecode

Comment 3 Kai Neumann 2008-09-25 21:51:05 UTC
Mainboard firmware: version F12, release date 02/27/2008

Comment 4 Kai Neumann 2008-09-25 21:57:47 UTC
My Kernel:
uname -a
Linux localhost.localdomain 2.6.26.3-29.fc9.i686 #1 SMP Wed Sep 3 03:42:27 EDT 2008 i686 i686 i386 GNU/Linux

Comment 5 Dave Jones 2008-09-25 21:59:45 UTC
Crashes like this under high CPU load are nearly always hardware related.
Insufficient power and/or cooling are the usual suspects, though bad RAM has
also triggered such crashes.

memtest86 may pick up on something, but I'm not sure if it stresses all cores
simultaneously as boinc does, so it may not be as effective a load test.

Comment 6 Kai Neumann 2008-09-25 22:08:27 UTC
Please read the description:
I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
I have used four parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.

Comment 7 Chuck Ebbert 2008-09-30 03:37:19 UTC
(In reply to comment #6)
> Please read the description:
> I have tested the RAM with Memtest86+ v2.01: 16 passes without errors.
> I have used four parallel cpuburn-in instances to test the CPU. They have now been running for over 24 hours.

Do you always get that same doublefault panic when it freezes?
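
One way to answer that question across several freezes is to collect the console output of each one and compare the panic headlines. The following is only a rough sketch: it assumes the messages have been saved to a text file in the same "kernel:..." form as in the description (for example via a serial console or by copying them by hand), and the function names panic_headlines and summarize are made up for this illustration.

```python
import re
from collections import Counter

# Matches the "kernel:" payload lines as they appear in the description.
KERNEL_LINE = re.compile(r"^kernel:\s*(.*)$")


def panic_headlines(log_text):
    """Return the text of every kernel line that announces a PANIC."""
    headlines = []
    for raw in log_text.splitlines():
        match = KERNEL_LINE.match(raw.strip())
        if match and "PANIC" in match.group(1):
            headlines.append(match.group(1))
    return headlines


def summarize(log_text):
    """Count panic types, ignoring the 8-digit hex addresses in each headline."""
    normalized = [
        re.sub(r"\b[0-9a-f]{8}\b", "ADDR", headline)
        for headline in panic_headlines(log_text)
    ]
    return Counter(normalized)
```

If summarize() returns a single entry after several freezes, the failures look alike; several different entries would point toward the hardware explanation.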

Comment 8 Kai Neumann 2008-09-30 07:25:00 UTC
I don't know. It was the first time I had a terminal open to see the kernel panic. The panic info isn't in the log...
I wrote it down on paper, then copied it to pastebin and here...

I am now using a special build for my processor with SSSE3 support. It has been computing for over 30 hours without a freeze...

The special build can be downloaded from http://calbe.dw70.de/linux32.html
It is the "AK V8 Linux x32 SSSE3 Intel only" build.

I will test this version for the whole week.
If I get no freezes, the problem is in the SETI binary, not the kernel.

Comment 9 Kai Neumann 2008-10-02 16:24:24 UTC
The special build ran for over 3 days without freezes. Today I had one freeze during the night and another an hour ago.

The first one came with a black screen, and I was connected remotely via ssh.
I could not see a single error.
There is no error in /var/log/messages.

Comment 10 Chuck Ebbert 2008-10-09 12:38:45 UTC
If you are not getting the same kernel error every time it fails then it's probably a hardware problem.

Monitor the CPU temperature when running your cpuburn program and compare it to the temperature when running SETI. Some of the cpuburn programs don't really stress the system as much as a real application like SETI.
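
A small script can record the peak temperature during each run so the two loads can be compared. This is a minimal sketch, assuming the kernel exposes readings as millidegrees Celsius in /sys/class/thermal/thermal_zone*/temp; on some systems the sensors appear elsewhere (for example under /sys/class/hwmon), so the base path is a parameter. The names read_temps and log_peak are hypothetical.

```python
import glob
import os
import time


def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as handle:
                # Assumption: the file holds an integer in millidegrees Celsius.
                temps[zone] = int(handle.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return temps


def log_peak(duration_s, interval_s=5.0, base="/sys/class/thermal"):
    """Sample for duration_s seconds and return the highest reading per zone."""
    peak = {}
    deadline = time.monotonic() + duration_s
    while True:
        for zone, temp in read_temps(base).items():
            peak[zone] = max(temp, peak.get(zone, temp))
        if time.monotonic() >= deadline:
            break
        time.sleep(interval_s)
    return peak
```

Running log_peak(3600) once while cpuburn is active and once while SETI is active gives two sets of peak values to compare.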

Comment 11 Kai Neumann 2008-10-09 13:05:14 UTC
Hi,
I think it was a RAM timing problem.
I use Corsair XMS2 with 4-4-4-12 timings at 2.1 V.
The modules are certified for these timings at this voltage.

I am now using the standard timings, 5-5-5-16 at 1.8 V, and the system runs:
 uptime
 15:04:11 up 4 days, 10:53,  3 users,  load average: 4.51, 4.64, 4.77

I think this was the problem.
But I don't know why it no longer works with 4-4-4-12 at 2.1 V. Memtest checked the RAM with these timings and did not find a single error; all 16 passes were OK.

