Bug 730509

Summary: kernel-2.6.40-4.fc15.i686 causes the PC to freeze
Product: [Fedora] Fedora
Reporter: bugfinder <blackcode>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 15
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-07-11 17:49:15 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description    Flags
oops.txt       none
Oops           none

Description bugfinder 2011-08-13 23:40:27 UTC
This happens at least once a day while transferring files over the network (FTP, HTTP, etc.). The system randomly freezes completely. The previous kernel does not have this problem.

Comment 1 bugfinder 2011-08-18 18:29:56 UTC
Upgraded to kernel-2.6.40.3-0.fc15.i686. It still freezes.

Is there any workaround until this bug is fixed?

Comment 2 Josh Boyer 2011-08-18 18:53:01 UTC
Given the original report didn't include any information other than some kernel (which we didn't know the version of) randomly freezes, and the previous kernel (which we also didn't know the version of) doesn't, the workaround is to use the kernel that doesn't freeze.

If we're going to make any progress from here, we need a bit more information.  It's good to know that 2.6.40.3-0.fc15 is one of the versions that freeze.  Which version doesn't?

What kind of network hardware is being used?

Can you be a bit more specific about the workload running?

Can you try the kernel-debug-2.6.40.3-0.fc15 build and see if we can capture some more debug information?

Assuming this is network related, ssh'ing into the machine probably won't be possible, but it is worth a try.  If that for some reason works, it would be good to know whether dmesg or /var/log/messages contains any oops data.

Comment 3 bugfinder 2011-08-18 19:05:40 UTC
The other bug I filed is https://bugzilla.redhat.com/show_bug.cgi?id=708202. It describes a different problem, but that one eventually ends in a freeze as well.

SSH doesn't work. The PC is completely frozen.

The Ethernet card is a Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13).

Isn't the dmesg output reset when I boot? So I don't see how to get that when the PC is frozen and the only thing possible is a hard reset.

The output of cat /var/log/messages | grep -i oops is attached, but there's nothing in there other than SELinux errors.

I'll try the debug build now.

Comment 4 bugfinder 2011-08-18 19:06:35 UTC
Created attachment 518923 [details]
oops.txt

Comment 5 bugfinder 2011-08-18 19:10:58 UTC
Sorry, I forgot to mention that all kernels cause the system to freeze, though this one makes it freeze faster than the others (within an hour or two after boot).

I've tried kernel-2.6.38.6-26.rc1.fc15.i686 and kernel-2.6.40-4.fc15.i686 before this.

kernel-2.6.38.6-26.rc1.fc15.i686 runs fine for a day or two but once I start getting bug #708202, it could freeze anytime.

Comment 6 bugfinder 2011-08-22 07:59:41 UTC
Looks like a Heisenbug. It stopped freezing when I use kernel-debug.

I've had the system on for the past 3 days and it hasn't frozen yet. I've put the system to sleep several times in between. I've played a few videos and music. I've transferred around 50 GB across the LAN and there seems to be no issue.

I'll leave it running on kernel-debug. Hopefully it'll freeze and I'll get that log we're looking for.

Comment 7 Dave Jones 2011-08-22 21:53:04 UTC
This part of your logs is interesting:

Aug 15 22:02:25 workstation kernel: [144533.030810] Oops: 0000 [#1] SMP 
Aug 15 22:02:26 workstation kernel: [144533.059001]  [<c07d776b>] oops_end+0xa2/0xa8
Aug 16 00:57:09 workstation kernel: [155017.424011] Oops: 0000 [#2] SMP 

There should be stack traces around there.  We'll need to see the full dump to diagnose further.
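
One way to pull the surrounding lines out of a saved copy of /var/log/messages is sketched below. This is only an illustration, not a tool anyone in this thread used: the function name `extract_oops_blocks` and the window sizes are made up for the example, and the sample input is just the three lines quoted above.

```python
from typing import Iterable, List


def extract_oops_blocks(lines: Iterable[str], before: int = 5, after: int = 40) -> List[List[str]]:
    """Return one window of log lines around every line that contains 'Oops:'."""
    buffered = [line.rstrip("\n") for line in lines]
    blocks = []
    for index, line in enumerate(buffered):
        if "Oops:" in line:
            # Keep a few lines of lead-in and enough trailing lines for the stack trace.
            start = max(0, index - before)
            blocks.append(buffered[start:index + after + 1])
    return blocks


# Sample input: the three lines quoted in Comment 7.
sample = [
    "Aug 15 22:02:25 workstation kernel: [144533.030810] Oops: 0000 [#1] SMP ",
    "Aug 15 22:02:26 workstation kernel: [144533.059001]  [<c07d776b>] oops_end+0xa2/0xa8",
    "Aug 16 00:57:09 workstation kernel: [155017.424011] Oops: 0000 [#2] SMP ",
]

for block in extract_oops_blocks(sample, before=0, after=1):
    print("\n".join(block))
    print("-" * 40)
```

On a real system the same function can be fed an open file handle for /var/log/messages (reading it usually requires root). The default window of 40 trailing lines is a guess at how long a trace might be and may need to be widened.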

Comment 8 bugfinder 2011-08-22 23:40:50 UTC
Created attachment 519370 [details]
Oops

Comment 9 Dave Jones 2011-08-23 02:30:20 UTC
Ah, that's from the .38 kernel. That particular bug was widely reported, and is fixed in the .40 kernel. So it doesn't give any clues as to what's causing your .40 lockups.

Hopefully the -debug build will pick up something eventually.

Comment 10 bugfinder 2011-08-28 15:45:47 UTC
The system froze twice today while I was using kernel-debug, but there's no stack trace, nor any mention of a kernel oops or bug, in /var/log/messages.

Comment 11 bugfinder 2011-08-30 04:59:23 UTC
I'm no longer sure the freezing bug is related to the kernel. I spend around 15 hours in front of the computer every day and it never freezes when I'm using it. The freezes always occur when I'm not using it. This makes me suspect that it has something to do with the display turning off when the computer is left idle (it was set to 1 hour). So I decided to do some tests.

I set it to 1 minute and waited. It got up to the point where the display slowly turns off. I hit Caps Lock a few times to check whether the system was frozen, but instead the computer rebooted for some reason.

I then repeated this test several times and it didn't freeze. After 5 or 6 repetitions, it froze as soon as the display went off and I had to do a hard reset. I repeated the test several more times and it didn't freeze. Then it froze once again.

So this might be related to the display turning off, because I can make it freeze more often than before, but I can't reproduce it consistently.
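
A rough way to automate the repeated test described above is sketched here. It rests on assumptions: it uses `xset dpms force off` to blank the display, which may not exercise the same code path as the desktop's idle timeout, and the helper names, cycle count, and timings are invented for the example. Each step is written to a log file and synced to disk, so that after a hard reset the last line shows which cycle the machine got to.

```python
import os
import subprocess
import time
from datetime import datetime


def log_heartbeat(path: str, message: str) -> None:
    """Append a timestamped line and ask the OS to flush it to disk before returning."""
    with open(path, "a") as handle:
        handle.write(f"{datetime.now().isoformat()} {message}\n")
        handle.flush()
        os.fsync(handle.fileno())


def run_blank_test(log_path: str, cycles: int = 20, off_seconds: int = 30) -> None:
    """Blank the display repeatedly, logging before and after each cycle."""
    for cycle in range(1, cycles + 1):
        log_heartbeat(log_path, f"cycle {cycle}: forcing display off")
        # Requires a running X session with DISPLAY set in the environment.
        subprocess.run(["xset", "dpms", "force", "off"], check=True)
        time.sleep(off_seconds)
        subprocess.run(["xset", "dpms", "force", "on"], check=True)
        log_heartbeat(log_path, f"cycle {cycle}: display back on")
        time.sleep(5)
```

Calling `run_blank_test("/path/to/blank-test.log")` from a terminal inside the X session would start the loop. If the last log entry after a freeze is a "forcing display off" line with no matching "display back on", that points at the blanking step.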

Comment 12 bugfinder 2011-09-05 13:37:48 UTC
A temporary workaround I'm using is to add the following to crontab:

*/1 * * * * env DISPLAY=:0.0 xte 'mousermove 1 1'

Can this bug be reassigned to the appropriate package or do I need to file it again?
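
For anyone who prefers not to use cron, the same workaround can be sketched as a small script run inside the user session. This is an illustrative equivalent, not something tested in this report: the function names are hypothetical, and it assumes `xte` is installed and that the display is `:0.0`, as in the crontab line above.

```python
import os
import subprocess
import time


def build_nudge(display: str = ":0.0", dx: int = 1, dy: int = 1):
    """Return the argv and environment matching the crontab entry."""
    env = dict(os.environ, DISPLAY=display)
    # xte takes each command as a single argument, e.g. 'mousermove 1 1'.
    argv = ["xte", f"mousermove {dx} {dy}"]
    return argv, env


def keep_awake(interval_seconds: int = 60) -> None:
    """Nudge the pointer once per interval so the idle timer never expires."""
    while True:
        argv, env = build_nudge()
        subprocess.run(argv, env=env, check=False)
        time.sleep(interval_seconds)
```

Calling `keep_awake()` reproduces the once-a-minute schedule of the cron entry. Like the cron entry, it only hides the freeze by keeping the display from blanking.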

Comment 13 Josh Boyer 2012-06-06 17:28:36 UTC
Are you still seeing this with 2.6.43/3.3?

Comment 14 Josh Boyer 2012-07-11 17:49:15 UTC
Fedora 15 has reached its end of life as of June 26, 2012.  As a result, we will not be fixing any remaining bugs found in Fedora 15.

In the event that you have upgraded to a newer release and the bug you reported is still present, please reopen the bug and set the version field to the newest release you have encountered the issue with.  Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered.

Thank you for taking the time to file a report.  We hope newer versions of Fedora suit your needs.