Created attachment 415301
a patch to fix the problem
Description of problem:
The aoe driver causes a kernel panic when the backport patch reported in 593902 is applied.
Version-Release number of selected component (if applicable):
Please see the LKML post below.
This is also reproducible on a non-Xen kernel.
Steps to Reproduce:
Please see the above LKML post.
Kernel panic when rmmod is invoked.
The attached patch fixes this problem. It is logically the same as
the fix described in the above post.
The following is additional information.
Note that in the case below, I used a version of aoe with the backport patch applied.
1. crash command quick results
DATE: Sat May 15 00:40:12 2010
LOAD AVERAGE: 0.00, 0.00, 0.00
VERSION: #1 SMP Tue Mar 16 22:01:26 EDT 2010
MACHINE: x86_64 (2266 Mhz)
MEMORY: 8 GB
PANIC: "Oops: 0002  SMP " (check log for details)
PID: 9673 TASK: ffff8801ffe16040 CPU: 7 COMMAND: "rmmod"
#0 [ffff8801f53afbf0] crash_kexec at ffffffff802a62c4
#1 [ffff8801f53afcb0] __die at ffffffff80265281
#2 [ffff8801f53afcf0] do_page_fault at ffffffff8026791c
#3 [ffff8801f53afde0] error_exit at ffffffff8026082b
[exception RIP: mutex_lock+16]
RIP: ffffffff80263917 RSP: ffff8801f53afe98 RFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000610 RCX: 000000000000000f
RDX: ffffffffff578000 RSI: 0000000000000000 RDI: 0000000000000610
RBP: 0000000000000000 R8: 00007fffd018d7c0 R9: 0000000000000030
R10: ffffffff80736fe0 R11: ffffffff803add23 R12: ffff8801f78e1cf8
R13: 00007fffd018d7c0 R14: 00007fffd018d800 R15: 0000000000000880
ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b
#4 [ffff8801f53afe90] mutex_lock at ffffffff80263914
#5 [ffff8801f53afeb0] blk_cleanup_queue at ffffffff8033674d
#6 [ffff8801f53afed0] aoedev_exit at ffffffff88539f62
#7 [ffff8801f53afef0] cleanup_module at ffffffff8853a0a1
#8 [ffff8801f53aff00] sys_delete_module at ffffffff802a2103
#9 [ffff8801f53aff80] system_call at ffffffff80260106
RIP: 0000003a518d3f37 RSP: 00007fffd018d7b8 RFLAGS: 00000246
RAX: 00000000000000b0 RBX: ffffffff80260106 RCX: ffffffff80260080
RDX: 0000000000000fdf RSI: 0000000000000880 RDI: 00007fffd018d7c0
RBP: 0000000000000003 R8: 00002ab0efdf0240 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffd0190080
R13: 00007fffd0190000 R14: 00000000d018d800 R15: ffff8801f53aff64
ORIG_RAX: 00000000000000b0 CS: e033 SS: e02b
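One possible reading of the trace above, offered as a sketch and not as a confirmed diagnosis: the Oops code 0002 is the x86 page-fault error code, and the faulting call has RDI = 0x610 with RAX = 0, which is what a lock embedded at offset 0x610 of a structure would look like if the structure pointer were NULL. The small user-space C fragment below only illustrates that arithmetic. The names decode_pf_error, struct fake_queue and member_addr are hypothetical, and the assumption that 0x610 is the offset of a mutex inside the request queue is not verified against the RHEL5 sources.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 page-fault error code, low three bits. */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

struct pf_info {
    bool not_present;
    bool write;
    bool kernel_mode;
};

/* Decode the hex value printed after "Oops:" (0002 in this report). */
static struct pf_info decode_pf_error(unsigned int code)
{
    struct pf_info info = {
        .not_present = !(code & PF_PROT),
        .write       = (code & PF_WRITE) != 0,
        .kernel_mode = !(code & PF_USER),
    };
    return info;
}

/*
 * Hypothetical stand-in for a queue structure. Only the offset of the
 * lock matters here; this is NOT the real struct request_queue layout.
 */
struct fake_queue {
    char pad[0x610];
    long lock_word;
};

/* Address the embedded lock would have for a structure located at base. */
static uintptr_t member_addr(uintptr_t base)
{
    return base + offsetof(struct fake_queue, lock_word);
}
```

With these helpers, decode_pf_error(0x0002) describes a kernel-mode write to a page that is not present, and member_addr(0) gives 0x610, the value seen in RDI and RBX at mutex_lock+16. That would be consistent with blk_cleanup_queue being handed a NULL queue from aoedev_exit, but the attached patch remains the authoritative description of the fix.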
Moving to kernel. The report says it also reproduces on bare metal.
That kernel posting is from May of last year! Would you please repost your patch to lkml, CC-ing Jens Axboe <email@example.com>?
Any update? Getting this upstream will make it easier to get into RHEL.
(In reply to comment #4)
> Hi, Masanori,
> Any update? Getting this upstream will make it easier to get into RHEL.
Sorry, I've been out of the office for several months.
OK. I will repost the patch upstream, CC-ing Jens.
However, please note that even last year there were considerable differences between the aoe driver in the RHEL5 kernel and the one in the upstream kernel (2.6.34 at the time).
In addition, please note that for the RHEL5 kernel we need to take another bug into account.
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).