Bug 871885

Summary: Oops 0002 [#1] SMP on mvsas
Product: [Fedora] Fedora
Reporter: Erasmo Acosta <eacosta>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 17
CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-11-26 18:39:28 UTC
Attachments:
oops-2012-10-31-05:53:38-15613-1 tar file (flags: none)

Description Erasmo Acosta 2012-10-31 16:04:19 UTC
Created attachment 636179 [details]
oops-2012-10-31-05:53:38-15613-1 tar file

Description of problem:

I have five 2 TB WD Caviar Black SATA drives in a RAID 5 array, connected to a Supermicro PCI Express x4 Low Profile SAS RAID Controller (AOC-SASLP-MV8). There is an XFS filesystem on top of the array.

The error occurred after about 14 hours of running cp -r from a SATA drive plugged into the motherboard to the XFS filesystem on the RAID. The total amount to be copied was about 2.6 TB.

After the error, the machine was partially responsive but froze during shutdown. The RAID device was not responsive at all. After I powered the machine off and back on, the RAID 5 array went into resync, which takes about 5 1/2 hours.

Version-Release number of selected component (if applicable): 3.6.3-1.fc17.x86_64

How reproducible: Not sure


Additional info:
Oct 31 05:53:37 neo kernel: [58461.488743] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
Oct 31 05:53:37 neo kernel: [58461.488788] IP: [<ffffffff8161d16f>] _raw_spin_lock_irqsave+0x1f/0x40
Oct 31 05:53:37 neo kernel: [58461.488822] PGD 0
Oct 31 05:53:37 neo kernel: [58461.488835] Oops: 0002 [#1] SMP
Oct 31 05:53:37 neo kernel: [58461.488855] Modules linked in: vmnet(O) vsock(O) vmci(O) vmmon(O) fuse lockd sunrpc bnep bluetooth rfkill xfs raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor iTCO_wdt iTCO_vendor_support xor async_tx coretemp kvm_intel kvm snd_hda_codec_realtek ppdev microcode serio_raw i2c_i801 lpc_ich mfd_core snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm uinput r8169 shpchp snd_page_alloc snd_timer snd mii soundcore mei parport_pc parport crc32c_intel ghash_clmulni_intel mvsas libsas scsi_transport_sas usb_storage i915 video i2c_algo_bit drm_kms_helper drm i2c_core
Oct 31 05:53:37 neo kernel: [58461.489183] CPU 5
Oct 31 05:53:37 neo kernel: [58461.489196] Pid: 265, comm: scsi_eh_7 Tainted: G           O 3.6.3-1.fc17.x86_64 #1 MSI MS-7750/H67A-G43 (MS-7750)
Oct 31 05:53:37 neo kernel: [58461.489240] RIP: 0010:[<ffffffff8161d16f>]  [<ffffffff8161d16f>] _raw_spin_lock_irqsave+0x1f/0x40
Oct 31 05:53:37 neo kernel: [58461.489280] RSP: 0018:ffff8804225a9c80  EFLAGS: 00010086
Oct 31 05:53:37 neo kernel: [58461.489304] RAX: 0000000000000286 RBX: 0000000000000058 RCX: 0000000000000002
Oct 31 05:53:37 neo kernel: [58461.489334] RDX: 0000000000000100 RSI: 0000000000000286 RDI: 0000000000000058
Oct 31 05:53:37 neo kernel: [58461.489365] RBP: ffff8804225a9c80 R08: 000000000000000a R09: 0000000000000446
Oct 31 05:53:37 neo kernel: [58461.489395] R10: 0000000000000000 R11: 0000000000000445 R12: 0000000000000050
Oct 31 05:53:37 neo kernel: [58461.489425] R13: ffff8804225e5248 R14: ffff880422559800 R15: ffff8804225c0000
Oct 31 05:53:37 neo kernel: [58461.489455] FS:  0000000000000000(0000) GS:ffff88043fb40000(0000) knlGS:0000000000000000
Oct 31 05:53:37 neo kernel: [58461.489490] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 31 05:53:37 neo kernel: [58461.489514] CR2: 0000000000000058 CR3: 0000000001c0b000 CR4: 00000000000407e0
Oct 31 05:53:37 neo kernel: [58461.489545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 31 05:53:37 neo kernel: [58461.489575] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 31 05:53:37 neo kernel: [58461.489616] Process scsi_eh_7 (pid: 265, threadinfo ffff8804225a8000, task ffff880424785c40)
Oct 31 05:53:37 neo kernel: [58461.489651] Stack:
Oct 31 05:53:37 neo kernel: [58461.489661]  ffff8804225a9cb0 ffffffff8108b388 0000000000013cc0 ffff88010e695e00
Oct 31 05:53:37 neo kernel: [58461.489702]  0000000000000005 ffff8804225e5248 ffff8804225a9cc0 ffffffffa00fd80d
Oct 31 05:53:37 neo kernel: [58461.489753]  ffff8804225a9d20 ffffffffa01008d0 00000000225a9d00 0000000000000018
Oct 31 05:53:37 neo kernel: [58461.489795] Call Trace:
Oct 31 05:53:37 neo kernel: [58461.489813]  [<ffffffff8108b388>] complete+0x28/0x60
Oct 31 05:53:37 neo kernel: [58461.489844]  [<ffffffffa00fd80d>] mvs_tmf_timedout+0x1d/0x20 [mvsas]
Oct 31 05:53:37 neo kernel: [58461.489879]  [<ffffffffa01008d0>] mvs_slot_complete+0x710/0x7a0 [mvsas]
Oct 31 05:53:37 neo kernel: [58461.489916]  [<ffffffffa0126245>] sas_scsi_recover_host+0x365/0xe30 [libsas]
Oct 31 05:53:37 neo kernel: [58461.489953]  [<ffffffff813e375b>] scsi_error_handler+0x11b/0x6e0
Oct 31 05:53:37 neo kernel: [58461.489985]  [<ffffffff8161b8a6>] ? __schedule+0x3c6/0x7a0
Oct 31 05:53:37 neo kernel: [58461.490014]  [<ffffffff813e3640>] ? scsi_eh_get_sense+0x1d0/0x1d0
Oct 31 05:53:37 neo kernel: [58461.490044]  [<ffffffff8107ed13>] kthread+0x93/0xa0
Oct 31 05:53:37 neo kernel: [58461.490070]  [<ffffffff81626144>] kernel_thread_helper+0x4/0x10
Oct 31 05:53:37 neo kernel: [58461.490100]  [<ffffffff8107ec80>] ? kthread_freezable_should_stop+0x70/0x70
Oct 31 05:53:37 neo kernel: [58461.490134]  [<ffffffff81626140>] ? gs_change+0x13/0x13
Oct 31 05:53:37 neo kernel: [58461.490159] Code: c8 88 cc ff 48 89 d0 5d c3 0f 1f 00 55 48 89 e5 66 66 66 66 90 9c 58 66 66 90 66 90 48 89 c6 fa 66 66 90 66 66 90 ba 00 01 00 00 <f0> 66 0f c1 17 0f b6 ce 38 d1 74 0e 0f 1f 44 00 00 f3 90 0f b6
Oct 31 05:53:37 neo kernel: [58461.490391] RIP  [<ffffffff8161d16f>] _raw_spin_lock_irqsave+0x1f/0x40
Oct 31 05:53:37 neo kernel: [58461.490436]  RSP <ffff8804225a9c80>
Oct 31 05:53:37 neo kernel: [58461.490456] CR2: 0000000000000058
Oct 31 05:53:37 neo kernel: [58461.552184] ---[ end trace 6e7b5b75fe6e0a78 ]---
Oct 31 05:53:38 neo sh[759]: abrt-dump-oops: Found oopses: 1
Oct 31 05:53:38 neo sh[759]: abrt-dump-oops: Creating problem directories
Oct 31 05:53:38 neo abrtd: Directory 'oops-2012-10-31-05:53:38-15613-1' creation detected
Oct 31 05:53:38 neo abrt-dump-oops: Reported 1 kernel oopses to Abrt
Oct 31 05:53:38 neo abrtd: Looking for kernel package
Oct 31 05:53:38 neo abrtd: Kernel package kernel-3.6.3-1.fc17.x86_64 found
Oct 31 05:53:38 neo abrtd: New problem directory /var/spool/abrt/oops-2012-10-31-05:53:38-15613-1, processing
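
Note on reading the trace above: the value on the "Oops: 0002" line is the x86 page-fault error code, printed in hex. The small Python sketch below decodes it. The function name decode_oops_code is made up for illustration and is not part of any kernel or abrt tooling. For 0002 it reports a kernel-mode write to a not-present page. With CR2 and RDI both at 0x58, that would be consistent with taking a spinlock through a NULL structure pointer, though that reading is an assumption and is not confirmed anywhere in this report.

```python
# Decode the x86 page-fault error code shown on the "Oops:" line.
# Each entry: (bit mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_text):
    """Return a list of human-readable flags for a hex error code string."""
    code = int(code_text, 16)
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        label = if_set if code & mask else if_clear
        if label is not None:
            parts.append(label)
    return parts


if __name__ == "__main__":
    # "0002" is the code from this report.
    print(decode_oops_code("0002"))
```

The same helper can be pointed at the code from any other x86 oops line, for example decode_oops_code("0000") for a kernel-mode read of a not-present page.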

Comment 1 Josh Boyer 2012-11-26 18:39:28 UTC

*** This bug has been marked as a duplicate of bug 869629 ***