Bug 593905 - aoe driver causes a panic [NEEDINFO]
Summary: aoe driver causes a panic
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Moyer
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-20 02:51 UTC by Masanori ITOH
Modified: 2014-06-02 13:08 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-02 13:08:46 UTC
Target Upstream Version:
pm-rhel: needinfo? (masanori.itoh)


Attachments
a patch to fix the problem (745 bytes, patch)
2010-05-20 02:51 UTC, Masanori ITOH

Description Masanori ITOH 2010-05-20 02:51:54 UTC
Created attachment 415301
a patch to fix the problem

Description of problem:
  The aoe driver causes a kernel panic when the backport patch reported in
  bug 593902 is applied.

Version-Release number of selected component (if applicable):
  kernel-xen-2.6.18-194.el5

How reproducible:
  Please see the LKML post below.
    http://marc.info/?l=linux-kernel&m=127425948728422&w=2
  This is also reproducible on the non-Xen kernel.

Steps to Reproduce:
  Please see the above LKML post.
  
Actual results:
  Kernel panic when rmmod is invoked.

Expected results:
  No panic.

Additional info:
  The attached patch fixes this problem. It is logically the same as
  the fix described in the above post.

  The following is additional information.
  Note that in the case below, I used a version of aoe with the backport patch applied.

1. crash command quick results
crash> sys
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-194.el5xen/vmlinux
    DUMPFILE: /var/crash/2010-05-15-00:41/vmcore
        CPUS: 16
        DATE: Sat May 15 00:40:12 2010
      UPTIME: 04:32:08
LOAD AVERAGE: 0.00, 0.00, 0.00
       TASKS: 260
    NODENAME: cb-blsv351
     RELEASE: 2.6.18-194.el5xen
     VERSION: #1 SMP Tue Mar 16 22:01:26 EDT 2010
     MACHINE: x86_64  (2266 Mhz)
      MEMORY: 8 GB
       PANIC: "Oops: 0002 [1] SMP " (check log for details)

crash> bt
PID: 9673   TASK: ffff8801ffe16040  CPU: 7   COMMAND: "rmmod"
 #0 [ffff8801f53afbf0] crash_kexec at ffffffff802a62c4
 #1 [ffff8801f53afcb0] __die at ffffffff80265281
 #2 [ffff8801f53afcf0] do_page_fault at ffffffff8026791c
 #3 [ffff8801f53afde0] error_exit at ffffffff8026082b
    [exception RIP: mutex_lock+16]
    RIP: ffffffff80263917  RSP: ffff8801f53afe98  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: 0000000000000610  RCX: 000000000000000f
    RDX: ffffffffff578000  RSI: 0000000000000000  RDI: 0000000000000610
    RBP: 0000000000000000   R8: 00007fffd018d7c0   R9: 0000000000000030
    R10: ffffffff80736fe0  R11: ffffffff803add23  R12: ffff8801f78e1cf8
    R13: 00007fffd018d7c0  R14: 00007fffd018d800  R15: 0000000000000880
    ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 #4 [ffff8801f53afe90] mutex_lock at ffffffff80263914
 #5 [ffff8801f53afeb0] blk_cleanup_queue at ffffffff8033674d
 #6 [ffff8801f53afed0] aoedev_exit at ffffffff88539f62
 #7 [ffff8801f53afef0] cleanup_module at ffffffff8853a0a1
 #8 [ffff8801f53aff00] sys_delete_module at ffffffff802a2103
 #9 [ffff8801f53aff80] system_call at ffffffff80260106
    RIP: 0000003a518d3f37  RSP: 00007fffd018d7b8  RFLAGS: 00000246
    RAX: 00000000000000b0  RBX: ffffffff80260106  RCX: ffffffff80260080
    RDX: 0000000000000fdf  RSI: 0000000000000880  RDI: 00007fffd018d7c0
    RBP: 0000000000000003   R8: 00002ab0efdf0240   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007fffd0190080
    R13: 00007fffd0190000  R14: 00000000d018d800  R15: ffff8801f53aff64
    ORIG_RAX: 00000000000000b0  CS: e033  SS: e02b
crash>
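
One possible reading of the backtrace above, offered as a sketch rather than a confirmed diagnosis: the fault occurs in mutex_lock+16 with RDI (the first argument on x86_64) and RBX both equal to 0x610, and Oops error code 0002 denotes a kernel-mode write to a non-present page. That pattern is consistent with blk_cleanup_queue being handed a NULL queue pointer and then taking the address of a mutex embedded in the queue structure. The user-space C model below only illustrates that arithmetic and a NULL guard. The names fake_queue, fake_mutex, lock_address and cleanup_queue_guarded are hypothetical, the 0x610 offset is copied from the register dump and not taken from the kernel headers, and the guard is an assumption about the general shape of a fix, not the content of the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct fake_mutex {
    int count;
};

struct fake_queue {
    /* Padding chosen only so that the lock sits at offset 0x610,
     * the value seen in RDI and RBX in the register dump. */
    unsigned char other_fields[0x610];
    struct fake_mutex lock;
};

/* Address that "&q->lock" would evaluate to for a given queue base. */
static uintptr_t lock_address(uintptr_t queue_base)
{
    return queue_base + offsetof(struct fake_queue, lock);
}

/* Guarded cleanup (assumed shape of a fix): returns 1 if the queue was
 * cleaned up, 0 if it was skipped because no queue was ever set up. */
static int cleanup_queue_guarded(struct fake_queue *q)
{
    if (q == NULL)
        return 0;       /* nothing allocated, so nothing to lock */
    q->lock.count = 0;  /* stands in for lock, mark dead, unlock */
    return 1;
}

/* With a NULL base, the lock address is just the member offset. */
static void self_check(void)
{
    assert(lock_address(0) == 0x610);
}
```

In this model, lock_address(0) yields 0x610, matching the faulting pointer, and cleanup_queue_guarded(NULL) returns without touching memory. Whether the real structure layout and the attached patch match this picture would need to be checked against the RHEL5 sources.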

Comment 1 Andrew Jones 2011-03-07 17:34:26 UTC
Moving to kernel. The report says it also reproduces on bare metal.

Comment 3 Jeff Moyer 2011-08-08 17:42:17 UTC
That kernel posting is from May of last year!  Would you please repost your patch to lkml, CC-ing Jens Axboe <jaxboe@fusionio.com>?

Thanks!

Comment 4 Jeff Moyer 2011-10-24 19:03:12 UTC
Hi, Masanori,

Any update?  Getting this upstream will make it easier to get into RHEL.

Thanks!

Comment 6 Jeff Moyer 2012-08-29 20:06:11 UTC
(In reply to comment #4)
> Hi, Masanori,
> 
> Any update?  Getting this upstream will make it easier to get into RHEL.
> 
> Thanks!

ping

Comment 7 Masanori ITOH 2012-10-16 01:25:52 UTC
Hi Jeffrey,

Sorry, I've been out of office for several months.

OK. I will repost the patch upstream, CC-ing Jens.

But please note that even last year there were considerable differences between the aoe drivers of the RHEL5 kernel and the upstream kernel (2.6.34 at the time).
In addition, please note that for the RHEL5 kernel we need to take another bug into account.

  https://bugzilla.redhat.com/show_bug.cgi?id=593902

Regards,
Masanori

Comment 8 RHEL Program Management 2012-10-30 05:49:16 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 9 RHEL Program Management 2014-03-07 13:40:30 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 10 RHEL Program Management 2014-06-02 13:08:46 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

