Bug 825004

Summary: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u:3:78]
Product: [Fedora] Fedora
Reporter: Rich Peiffer <rich.peiffer>
Component: xorg-x11-drv-ati
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 16
CC: collura, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: first=3.3.6
Doc Type: Bug Fix
Last Closed: 2013-02-26 15:17:23 UTC
Type: Bug
Attachments:
Cross Section of Syslog Messages When Lockup Occurred (flags: none)

Description Rich Peiffer 2012-05-24 19:34:26 UTC
Created attachment 586693
Cross Section of Syslog Messages When Lockup Occurred

Description of problem:

Happens periodically, but always freezes the system within 48 hours of boot.  Most times the machine locks up while idle (from a user standpoint) overnight and I find it frozen in the morning, but at times (like today) I start getting the messages described in the summary.  Once I start seeing these in the message center (using KDE), the system locks up tight within 10-15 seconds.  By lock up I mean the screen is still displayed, but the keyboard and mouse stop responding: I cannot switch to a virtual console, the caps lock / num lock lights do not respond, etc.  At times I found that I could still ssh in from another box, but many times it wasn't accessible at all.  This has actually been happening for quite some time under Fedora 16.

Had it happen today (giving me the messages about soft lockup).  I dug them out of the syslog files to report this.

I am a programmer but not a kernel developer.  I would be happy to test and get more information if possible / if it would help.


Version-Release number of selected component (if applicable):

Various, but the current kernel release: 3.3.6-3.fc16.x86_64

How reproducible:

Cannot reproduce on demand, but will always occur within 48 hours of uptime (usually less).

Steps to Reproduce:
1. Boot system up.
2. Use for 24 - 48 hours.
3. System usually locks up overnight, but at times it locks up right while I am using it, with the "soft lockup" messages.
  
Actual results:


Expected results:


Additional info:  From /var/log/messages I have attached what I believe to be the kernel messages related to the issue.
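
For anyone gathering the same evidence, here is a minimal Python sketch of pulling the soft lockup lines out of a log. The regular expression is modeled only on the single message quoted in the summary of this bug, so treating the trailing number in brackets as a PID and assuming other kernels print the same layout are both assumptions. The function name find_soft_lockups is made up for illustration.

```python
import re

# Modeled on the one message quoted in this bug's summary:
#   BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u:3:78]
# Assumption: the last colon-separated number in the brackets is the PID.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>.+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Return one dict per log line that contains a soft lockup message."""
    hits = []
    for line in lines:
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "task": match.group("task"),
                "pid": int(match.group("pid")),
                "line": line.rstrip("\n"),
            })
    return hits
```

The function takes any iterable of lines, so an open file handle on /var/log/messages (read as root) can be passed in directly. Counting hits per CPU or per task may show whether the same worker is stuck each time.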

Comment 1 Rich Peiffer 2012-06-14 12:27:17 UTC
Just an update, through several kernel updates this continues to occur.  I'm pretty much guaranteed at least 1 lockup every 24 hours.

Comment 2 Josh Boyer 2012-06-14 14:17:35 UTC
Do you see this without the vmware modules loaded?

Comment 3 Rich Peiffer 2012-06-14 15:08:49 UTC
Hey Josh, not sure, as I use VMWare constantly.  I can try unloading them over the weekend and see if it stabilizes the system.  If it does, I've definitely got another dilemma (what to do about that one), but that's not your issue.

At some point, too, I disabled the "virtualization" support settings in the BIOS.  I can't remember which kernel version it started with, but after that update, booting a VM would cause a kernel panic.  I tested the next few updated kernels with the same result.  Disabling the support allowed things to run (or so I thought).

I'll update the task with my findings on Monday.  Any other suggestions you might have are definitely appreciated and thanks for responding!

Comment 4 Rich Peiffer 2012-07-03 17:33:44 UTC
OK - sorry for the delay.  I was gone for a couple of days and decided it would be the time to try unloading the vmware stuff: "systemctl stop vmware.service".  I checked and the service was dead.  The stop process actually rmmods the vmware modules.
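
One way to double-check that the modules are really gone is to look at /proc/modules, where the first field of each line is a module name. The sketch below is only an illustration: the helper names are made up, and which module names to look for (vmmon and vmnet are typical of VMware installs) is an assumption that should be checked against what the machine actually loads.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def modules_still_loaded(proc_modules_text, wanted):
    """Return the sorted subset of `wanted` module names that are loaded."""
    return sorted(loaded_modules(proc_modules_text) & set(wanted))
```

Reading /proc/modules into a string and calling modules_still_loaded(text, ["vmmon", "vmnet"]) should give an empty list once the service has been stopped and the modules removed.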

Came back 2 days later.  System locked up tight.   Checked the system logs (syslog messages).  Logging just stopped late on the night of June 30th.  Nothing further logged until the reboot on July 2nd.  As I said, sometimes I get the CPU stuck message with logging, sometimes I don't.

Anything else I can provide, let me know.  I would love to get my system back to the point where it's stable.

Comment 5 Rich Peiffer 2012-07-16 20:59:46 UTC
More information (may be coincidental, may not).

I have an ATI based graphics card in this machine.  I used to go through the additional work of using the ATI Catalyst drivers (rpmfusion repositories).  Starting with Fedora 16, I stopped using them and just went with the default "mesa" support.

After the last system update, I started noticing some very strange behavior (refresh issues on the screen, etc.).  I re-installed the ATI stuff from rpmfusion about 1 week ago.  Since that point the system has not locked up at all.

Again, I'm not a kernel / driver developer, so may be a coincidence.

Comment 6 Fedora End Of Life 2013-01-16 21:05:43 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 7 Fedora End Of Life 2013-02-26 15:17:26 UTC
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.