Bug 707686 - Kernel md raid code crash on amanda backup
Summary: Kernel md raid code crash on amanda backup
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-25 17:26 UTC by Trever Adams
Modified: 2011-10-14 14:07 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-10-14 14:07:40 UTC
Type: ---


Attachments
screen shot of oops (1.68 MB, image/jpeg)
2011-05-25 17:26 UTC, Trever Adams
Additional crash that looks slightly different (1.52 MB, image/jpeg)
2011-05-29 18:09 UTC, Trever Adams
This may or may not have what was requested (122.54 KB, application/octet-stream)
2011-06-18 15:50 UTC, Trever Adams
Another backtrace from a freeze that may or may not be the same bug (7.77 KB, application/octet-stream)
2011-06-20 13:34 UTC, Trever Adams
This has things not seen in others (11.13 KB, application/octet-stream)
2011-06-20 13:55 UTC, Trever Adams
a few more backtraces (21.73 KB, application/octet-stream)
2011-06-20 14:14 UTC, Trever Adams

Description Trever Adams 2011-05-25 17:26:18 UTC
Created attachment 500885 [details]
screen shot of oops

Description of problem:
This may or may not be related to bug 704462. Whenever I run amdump with the virtual tapes on an md raid device, the system crashes. I have attached a photograph of the crash screen, since that is the only way I can capture it.

Version-Release number of selected component (if applicable):
kernel-2.6.38.6-27.fc15.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a raid device
2. disk -> raid -> lvm -> ext4
3. Set up amanda with virtual tapes on that raid device
4. amdump yourdumpname
5. Crash
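
For anyone trying to reproduce this, the storage stack in steps 1 and 2 can be sketched roughly as below. This is only a dry-run illustration and is not taken from the reporter's setup: the RAID level, the member disks (/dev/sdX, /dev/sdY), the array name /dev/md0, the volume group and logical volume names, and the size are all assumptions. The script only prints the commands; it does not run them.

```python
import shlex


def build_stack_commands(disks, md_dev="/dev/md0", level=1,
                         vg="vg_backup", lv="lv_vtapes", size="100G"):
    """Return shell commands for disk -> raid -> lvm -> ext4 (nothing is executed).

    Every default here is a hypothetical placeholder, not a value from the report.
    """
    if len(disks) < 2:
        raise ValueError("need at least two member disks")
    lv_path = f"/dev/{vg}/{lv}"
    commands = [
        # Step 1: assemble the member disks into one md array.
        ["mdadm", "--create", md_dev, f"--level={level}",
         f"--raid-devices={len(disks)}", *disks],
        # Step 2: put LVM on top of the array, then ext4 on the logical volume.
        ["pvcreate", md_dev],
        ["vgcreate", vg, md_dev],
        ["lvcreate", "-L", size, "-n", lv, vg],
        ["mkfs.ext4", lv_path],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in build_stack_commands(["/dev/sdX", "/dev/sdY"]):
        print(line)
```

The amanda virtual tape directory from step 3 would then live on the resulting ext4 filesystem. These commands destroy any data on the member disks, so they should only be run by hand, as root, on scratch devices.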

Comment 1 Trever Adams 2011-05-29 18:09:31 UTC
Created attachment 501637 [details]
Additional crash that looks slightly different

This may or may not be the same crash, but this one is much cleaner. (Other problems have since been fixed, so fewer unrelated issues should show up.)

Comment 2 Chuck Ebbert 2011-06-15 14:33:21 UTC
This looks like a spinlock deadlock.

A program called rs:main is calling sys_futex() which eventually ends up calling _raw_spin_lock(), and it looks like that is deadlocking with a worker thread trying to dispatch a disk request. We'd really need to see several screens of the earlier messages to see exactly where it's deadlocking.

Comment 3 Trever Adams 2011-06-15 15:44:49 UTC
I would gladly do this, but the machine hard locks when this lockup happens, so I cannot scroll back. If you can provide me with instructions on somehow capturing all of this, I will do my best to do it.

Comment 4 Trever Adams 2011-06-18 15:50:14 UTC
Created attachment 505396 [details]
This may or may not have what was requested

I am not sure if this is related or not. It does have much more information.

Comment 5 Trever Adams 2011-06-20 13:34:01 UTC
Created attachment 505612 [details]
Another backtrace from a freeze that may or may not be the same bug

Comment 6 Trever Adams 2011-06-20 13:42:06 UTC
I should mention that any backtraces after June 16 at 6:16 AM MDT are from kernel-2.6.38.8-32.fc15.x86_64.

Comment 7 Trever Adams 2011-06-20 13:55:38 UTC
Created attachment 505619 [details]
This has things not seen in others

Comment 8 Trever Adams 2011-06-20 14:14:01 UTC
Created attachment 505622 [details]
a few more backtraces

I do not think I will do any more. While there are some unique parts, there appears to be a core that is repeated over and over. I imagine the trouble is there.

Comment 9 Trever Adams 2011-07-05 19:46:36 UTC
I switched from the Realtek 8169 to an Intel e1000e PCIe card. I have not been able to duplicate any of these problems since, even under very heavy load. The processor is also much more idle (nearly completely used with the 8169, versus about 30-70% idle most of the time, and quite often more than 50%, with the latter card).

I do not know if the 8169 chipset is just broken or if the driver is, but the problem lies with one of the two.

Comment 10 Trever Adams 2011-10-14 14:07:40 UTC
I am closing this as NOTABUG, as it is a kernel driver or hardware problem with the Realtek 8169 and similar chipsets.

