| Summary: | Machine locks up | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Pascal Patry <iscy> | ||||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | unspecified | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 16 | CC: | gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2012-02-07 19:10:04 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
|
||||||||
Can you recreate this without the nvidia module loaded?

Short answer: yes. Long answer:

I have another machine, running on:
Linux sheol 2.6.33.6-147.fc13.x86_64 #1 SMP

Yes, I agree, it's a bit old, but it easily reaches uptimes of more than 200 days. That computer doesn't have the same hardware, but it has the same silent graphics card and uses the exact same nvidia module. I know that it taints the kernel, and that the two kernels are different versions, but it has proved itself to be quite stable.

If you really want me to disable that module and reproduce it, I can do it.

(In reply to comment #2)
> Short answer: Yes, long answer..
>
> I have another machine, running on:
> Linux sheol 2.6.33.6-147.fc13.x86_64 #1 SMP

That's irrelevant to this bug report, sorry.

> If you really want me to disable that module and reproduce it, I can do it.

Disabling the nvidia module and reproducing on the 3.1.5 kernel is really the only way to make progress here.

Sure. I also grabbed the debug package to get more information. I'll post as soon as I have reproduced it.

Created attachment 549789 [details]
/var/log/messages
As promised, this is the /var/log/messages, including the kernel stack trace for this problem, without 'nvidia' tainting the kernel. It took 11 days before locking up.
Kernel is 3.1.5-2.fc16.x86_64
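
For anyone repeating this test, one way to confirm that the running kernel is not tainted by a proprietary module is to read the taint mask from `/proc/sys/kernel/tainted`. The following is a minimal sketch, not part of the original report; the helper names are hypothetical, and it relies only on bit 0 of the mask, which the kernel sets when a proprietary module has been loaded.

```python
from pathlib import Path

# Bit 0 of the kernel taint mask is set once a proprietary module
# (such as the nvidia driver) has been loaded.
TAINT_PROPRIETARY_MODULE = 1 << 0


def has_proprietary_taint(mask: int) -> bool:
    """Return True if the taint mask has the proprietary-module bit set."""
    return bool(mask & TAINT_PROPRIETARY_MODULE)


def read_taint_mask(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint mask; the file holds a single decimal integer."""
    return int(Path(path).read_text().strip())


# Example use on a Linux machine (hypothetical helper names):
#   mask = read_taint_mask()
#   print("proprietary taint:", has_proprietary_taint(mask))
```

A mask of 0 means the kernel is untainted. The taint flags are not cleared when a module is unloaded, so the check is only meaningful after a boot in which the module was never loaded.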
We have a similar oops in _raw_spinlock from a different user in bug 771559. They hit this quite a while after they resumed from a suspend. Did you happen to also resume from a suspend/hibernate at some point during the uptime?

No, this workstation never goes to sleep or suspends. It runs 24/7 and doesn't even have a screen saver. User interaction or load is not necessary either. Most of the time it locks up while being used, but it has also happened overnight. I also got it after ~38 hours of uptime.

Still reproducible with the latest kernel package (3.1.8-2.fc16.x86_64).

Looks like someone put their finger on this issue a few days ago: https://lkml.org/lkml/2012/1/9/114

Currently on 3.2.2-1.fc16.x86_64 with an uptime of six and a half days. No issues to report yet. If the problem was really caused by the issue linked in comment #9, then 3.2.2 has the fix and I shouldn't be able to reproduce it.

Agreed. Let's close this one out for now. If you see it again on 3.2.2 or newer, please reopen. |
Created attachment 547848 [details]
/var/log/messages

Description of problem:
Machine locks up after 24 to 48 hours of uptime.

Version-Release number of selected component (if applicable):
Fedora 16 - Kernel 3.1.5-1

How reproducible:
I haven't noticed any trigger other than time.

Additional info:
The interesting part of /var/log/messages has been attached. I used to get _raw_spin_lock issues on kernel 3.1.0, and this problem started to occur after I updated.