Bug 1002758 - [abrt] BUG: soft lockup - CPU#0 stuck for 22s! [scsi_eh_6:143]
Summary: [abrt] BUG: soft lockup - CPU#0 stuck for 22s! [scsi_eh_6:143]
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:70af0f0685234e9e3d0bff2d8a0...
Depends On:
Blocks:
Reported: 2013-08-29 22:49 UTC by Adam DiFrischia
Modified: 2013-09-26 22:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-26 22:37:05 UTC
Type: ---


Attachments
File: dmesg (68.74 KB, text/plain)
2013-08-29 22:49 UTC, Adam DiFrischia

Description Adam DiFrischia 2013-08-29 22:49:33 UTC
Description of problem:
Running the gnome-disks benchmark on a RAID-1 set behind an Areca 1882ix RAID card. This soft lockup is not limited to this RAID type or set. The error happens occasionally, and when it does I get a notice from the RAID card that the card "has powered on", which I would imagine means it has reset itself.

Additional info:
reporter:       libreport-2.1.6
BUG: soft lockup - CPU#0 stuck for 22s! [scsi_eh_6:143]
Modules linked in: fuse ebtable_nat nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle bnep nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bluetooth rfkill ebtable_filter ebtables ip6table_filter ip6_tables e1000e iTCO_wdt mperf coretemp kvm_intel i2c_i801 ppdev iTCO_vendor_support kvm crc32_pclmul crc32c_intel ghash_clmulni_intel parport_pc parport i2c_core ptp pps_core lpc_ich mfd_core microcode video serio_raw uinput usb_storage arcmsr
CPU: 0 PID: 143 Comm: scsi_eh_6 Not tainted 3.10.9-200.fc19.x86_64 #1
Hardware name: To be filled by O.E.M. To be filled by O.E.M./P8B-C Series, BIOS 2011 02/24/2012
task: ffff8802227f07a0 ti: ffff880221c50000 task.ti: ffff880221c50000
RIP: 0010:[<ffffffffa0002432>]  [<ffffffffa0002432>] arcmsr_get_firmware_spec+0x72/0x5e0 [arcmsr]
RSP: 0018:ffff880221c51cd8  EFLAGS: 00000246
RAX: ffffc90010ec0000 RBX: ffff880221c51c90 RCX: ffffc90010ec00bc
RDX: 0000000000000000 RSI: ffffc90010ec2244 RDI: ffff880221f486b8
RBP: ffff880221c51d38 R08: ffff880221c50000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000005 R12: 0000000000000286
R13: ffffffff8106bf8f R14: ffff880221c51c58 R15: 0000000000000286
FS:  0000000000000000(0000) GS:ffff88022fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f438447b2b0 CR3: 0000000001c0c000 CR4: 00000000000407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 0000000000000000 dfe8000400000000 ffff880221f4ad00 ffff880221f4acf4
 ffffc90010ec0000 ffffc90010ec2244 ffff880221f486b8 ffff880221f48000
 000000000000000d ffff880221f486b8 ffffc90010ec00f8 ffffc90010ec0000
Call Trace:
 [<ffffffffa0002c77>] arcmsr_bus_reset+0x2d7/0x5c0 [arcmsr]
 [<ffffffff813d74c7>] ? put_device+0x17/0x20
 [<ffffffff813fcec3>] scsi_try_bus_reset+0x43/0x100
 [<ffffffff813fe8a9>] scsi_eh_ready_devs+0x4f9/0x920
 [<ffffffff813ff98f>] scsi_error_handler+0x51f/0x6f0
 [<ffffffff813ff470>] ? scsi_eh_get_sense+0x200/0x200
 [<ffffffff81080b60>] kthread+0xc0/0xd0
 [<ffffffff81080aa0>] ? insert_kthread_work+0x40/0x40
 [<ffffffff8164766c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81080aa0>] ? insert_kthread_work+0x40/0x40
Code: 4c 8d a0 3c 22 00 00 48 89 75 b0 48 8d b0 44 22 00 00 48 89 75 c8 49 89 f7 8b 50 34 83 ca 0d 89 50 34 48 8d 88 bc 00 00 00 8b 11 <85> d2 79 fa ba 01 00 00 00 48 8b 45 c0 89 90 b0 00 00 00 b2 08
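
For anyone triaging reports like this one, the useful parts of the trace are the lockup header, the RIP symbol, and the call chain. The following is a minimal sketch in Python of pulling those out of oops text. It is an illustration only, not part of abrt or libreport: the function name `parse_soft_lockup` and the regular expressions are assumptions written for the line format shown above, and the sample input is copied from this report with the call trace shortened to three frames. Frames prefixed with `?` are entries the kernel's stack walker could not confirm, so they are flagged as unreliable.

```python
import re

# Lines copied from the report above; the call trace is shortened to three frames.
OOPS = """\
BUG: soft lockup - CPU#0 stuck for 22s! [scsi_eh_6:143]
RIP: 0010:[<ffffffffa0002432>]  [<ffffffffa0002432>] arcmsr_get_firmware_spec+0x72/0x5e0 [arcmsr]
Call Trace:
 [<ffffffffa0002c77>] arcmsr_bus_reset+0x2d7/0x5c0 [arcmsr]
 [<ffffffff813d74c7>] ? put_device+0x17/0x20
 [<ffffffff813fcec3>] scsi_try_bus_reset+0x43/0x100
"""

# Hypothetical patterns, written for the 3.10-era trace format seen in this report.
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]")
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_soft_lockup(text):
    """Extract the lockup header, RIP symbol, and call-trace frames from oops text."""
    report = {"cpu": None, "seconds": None, "comm": None, "pid": None,
              "rip": None, "frames": []}
    in_trace = False
    for line in text.splitlines():
        header = LOCKUP_RE.search(line)
        if header:
            report["cpu"] = int(header.group(1))
            report["seconds"] = int(header.group(2))
            report["comm"] = header.group(3)
            report["pid"] = int(header.group(4))
            continue
        if line.startswith("RIP:"):
            frame = FRAME_RE.search(line)
            if frame:
                # (symbol, module); module is None for built-in kernel code
                report["rip"] = (frame.group(2), frame.group(3))
            continue
        if line.strip() == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame is None:
                in_trace = False  # first non-frame line ends the trace
                continue
            report["frames"].append({
                "symbol": frame.group(2),
                "module": frame.group(3),
                "reliable": frame.group(1) is None,  # "?" marks an unconfirmed frame
            })
    return report


report = parse_soft_lockup(OOPS)
print(report["rip"], [f["symbol"] for f in report["frames"] if f["reliable"]])
```

Applied to this report, the RIP symbol and the first reliable frame both belong to the arcmsr module, which points at the RAID driver's bus-reset path rather than at the generic SCSI error handler that called it.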

Comment 1 Adam DiFrischia 2013-08-29 22:49:39 UTC
Created attachment 791976
File: dmesg

Comment 2 Josh Boyer 2013-09-18 20:37:55 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.11.1-200.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 3 Adam DiFrischia 2013-09-26 21:43:41 UTC
Still getting similar CPU stuck issues. I'm using an Areca RAID card in the device:

01:00.0 RAID bus controller: Areca Technology Corp. ARC-1880 8/12 port PCIe/PCI-X to SAS/SATA II RAID Controller (rev 01)

I'm not certain under exactly what conditions, but the card is resetting itself. During I/O, the card will reboot itself, re-spin the drives and check them, run its normal POST routine, etc. Then Fedora unhangs itself and continues. It seems related to when I introduced iSCSI operations into the mix. I've used both scsi-target-utils and the kernel-based iSCSI target (targetcli), and in both cases this happens during a backup operation on a different machine that is using the iSCSI-exported volume.

A fio stress test of the device, run while it was standalone and not serving any iSCSI operations, ran flawlessly for over 3 days.

Comment 4 Adam DiFrischia 2013-09-26 22:37:05 UTC
Further investigation based on the previous comment indicates this is expected behavior for the arcmsr/arcmsr6 driver: it resets the card, which causes the hang until the card is re-registered.

