Description of problem:
In /var/log/messages, we see this message many times:

Nov 15 08:04:01 ti23 kernel: [ID kern.info] [28041.220666] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)

Eventually, the system becomes so sluggish that it is unusable.

Version-Release number of selected component (if applicable):
kernel-2.6.34.7-61.fc13.x86_64
Fusion MPT base driver 3.04.14
Fusion MPT SAS Host driver 3.04.14

How reproducible:
Do some I/O and wait for the system to hang

Steps to Reproduce:
1. Boot current Fedora 13 kernel on a system using the LSI SAS1068E controller to speak to hard disks.
2. Do some I/O.
3. After a few hours (sometimes faster), the system becomes unresponsive, although Alt-Sysrequest works for rebooting it.

Actual results:
Disk I/O bandwidth becomes incredibly slow and causes the system to stop responding.

Expected results:
A working system.

Additional info:
I have downloaded the newer 4.24.00.00 driver from LSI's web site and patched that into the kernel. I hope this may solve the problem.
http://www.lsi.com/storage_home/products_home/standard_product_ics/sas_ics/lsisas1068e/index.html
FYI, the 4.24.00.00 driver has solved the problem for me. The system is now stable. Is there a reason it's desirable to keep such an old version in the kernel source tree?
After 5.5 days, the new kernel started exhibiting similar problems, so I'm afraid the new driver did not fix this completely. I suppose it could be a hardware problem, although the system ran stably under Fedora Core 6 for years. Here are the messages we are seeing:

Nov 21 05:52:23 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 05:52:23 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 05:52:23 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 05:52:23 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 05:52:54 ti23 kernel: [ID kern.info] mptscsih: ioc0: attempting task abort! (sc=ffff88000ab48800)
Nov 21 05:52:54 ti23 kernel: [ID kern.info] sd 4:0:9:0: [sdj] CDB: Read(10): 28 00 41 2e 4b 3f 00 01 00 00
Nov 21 05:52:55 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31130000): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x0000)
Nov 21 05:52:55 ti23 kernel: [ID kern.info] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88000ab48800)
Nov 21 05:52:55 ti23 kernel: [ID kern.info] mptscsih: ioc0: attempting task abort! (sc=ffff880025460200)
Nov 21 05:52:55 ti23 kernel: [ID kern.info] sd 4:0:9:0: [sdj] CDB: Read(10): 28 00 41 2e 45 3f 00 01 00 00
Nov 21 05:52:55 ti23 kernel: [ID kern.info] mptscsih: ioc0: task abort: SUCCESS (sc=ffff880025460200)
...
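
For anyone trying to read the LogInfo words in these messages, here is a minimal Python sketch of how the 32-bit value appears to break down. The field layout (top nibble bus type, next nibble originator, then an 8-bit code and a 16-bit subcode) is an assumption that happens to match the values printed in the log; the function names are hypothetical, and the name tables contain only the strings that actually occur in this report, not the driver's full list.

```python
# Sketch only: the bit layout is assumed, and is chosen because it
# reproduces the decoded text that mptbase printed in the log above.

def decode_sas_loginfo(word):
    """Split a 32-bit LogInfo word into its assumed fields."""
    return {
        "bus_type": (word >> 28) & 0xF,    # 0x3 in every message in this report
        "originator": (word >> 24) & 0xF,  # 0x1 is printed as {PL}
        "code": (word >> 16) & 0xFF,       # 0x12 is printed as {Abort}
        "subcode": word & 0xFFFF,
    }

# Only the names visible in this bug report; anything else is shown as hex.
ORIGINATOR_NAMES = {0x1: "PL"}
PL_CODE_NAMES = {0x12: "Abort", 0x13: "IO Not Yet Executed"}

def describe_loginfo(word):
    """Render a LogInfo word in the same shape as the kernel message."""
    f = decode_sas_loginfo(word)
    originator = ORIGINATOR_NAMES.get(f["originator"], "0x%x" % f["originator"])
    code = PL_CODE_NAMES.get(f["code"], "0x%02x" % f["code"])
    return "LogInfo(0x%08x): Originator={%s}, Code={%s}, SubCode(0x%04x)" % (
        word, originator, code, f["subcode"])

if __name__ == "__main__":
    for value in (0x31123000, 0x31130000):
        print(describe_loginfo(value))
```

Under that assumed layout, the two values seen here differ only in the code byte (0x12 versus 0x13) and the subcode.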
And also the kernel is reporting a hung task in the mdadm resync as follows:

Nov 21 06:04:36 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 06:04:36 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort}, SubCode(0x3000)
Nov 21 06:04:53 ti23 kernel: [ID kern.err] INFO: task md127_resync:29843 blocked for more than 120 seconds.
Nov 21 06:04:53 ti23 kernel: [ID kern.err] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 21 06:04:53 ti23 kernel: [ID kern.info] md127_resync D 0000000000000002 0 29843 2 0x00000080
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] ffff880049505b80 0000000000000046 ffff880049505b30 ffffffffa018784f
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] ffff880049505fd8 ffff88007aa49770 00000000000153c0 ffff880049505fd8
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] 00000000000153c0 00000000000153c0 00000000000153c0 00000000000153c0
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] Call Trace:
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffffa018784f>] ? unplug_slaves+0x7f/0xb9 [raid456]
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffffa0187c3e>] get_active_stripe+0x2bc/0x63e [raid456]
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8104818d>] ? default_wake_function+0x0/0x14
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffffa018b4e6>] sync_request+0x257/0x2e3 [raid456]
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8135d5e8>] ? is_mddev_idle+0xae/0x102
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8135dd75>] md_do_sync+0x739/0xb47
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff81066153>] ? autoremove_wake_function+0x0/0x39
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8135ea80>] md_thread+0xf6/0x114
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8135e98a>] ? md_thread+0x0/0x114
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff81065cd9>] kthread+0x7f/0x87
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff81065c5a>] ? kthread+0x0/0x87
Nov 21 06:04:53 ti23 kernel: [ID kern.warn] [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
Nov 21 06:05:07 ti23 kernel: [ID kern.info] mptscsih: ioc0: attempting task abort! (sc=ffff880077583300)
Nov 21 06:05:07 ti23 kernel: [ID kern.info] sd 4:0:7:0: [sdh] CDB: Read(10): 28 00 41 2f 96 c7 00 00 78 00
Nov 21 06:05:08 ti23 kernel: [ID kern.info] mptbase: ioc0: LogInfo(0x31130000): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x0000)
Nov 21 06:05:08 ti23 kernel: [ID kern.info] mptscsih: ioc0: task abort: SUCCESS (sc=ffff880077583300)
Nov 21 06:05:08 ti23 kernel: [ID kern.info] mptscsih: ioc0: attempting task abort! (sc=ffff8800054c0200)
Nov 21 06:05:08 ti23 kernel: [ID kern.info] sd 4:0:7:0: [sdh] CDB: Read(10): 28 00 41 2f 96 af 00 00 18 00
Nov 21 06:05:08 ti23 kernel: [ID kern.info] mptscsih: ioc0: task abort: SUCCESS (sc=ffff8800054c0200)
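
The "CDB: Read(10)" lines show which disk region each aborted command was reading. As a rough aid, the sketch below unpacks those ten bytes using the standard SCSI READ(10) layout: opcode 0x28, a big-endian 4-byte logical block address in bytes 2 to 5, and a big-endian 2-byte transfer length in bytes 7 and 8. The function name is hypothetical and the snippet is illustrative, not part of any driver.

```python
def parse_read10(cdb_hex):
    """Return (lba, block_count) from a READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is accepted
    if len(cdb) != 10:
        raise ValueError("READ(10) CDB must be exactly 10 bytes")
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) opcode: 0x%02x" % cdb[0])
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks

if __name__ == "__main__":
    # The two aborted reads on sdh from the log excerpt above.
    for text in ("28 00 41 2f 96 c7 00 00 78 00",
                 "28 00 41 2f 96 af 00 00 18 00"):
        lba, blocks = parse_read10(text)
        print("LBA 0x%08x, %d blocks" % (lba, blocks))
```

Decoded this way, the two sdh commands are adjacent: the read at LBA 0x412f96af covers 0x18 blocks and ends exactly where the read at 0x412f96c7 begins. That is consistent with sequential resync I/O, though it does not by itself say whether the disk, the cabling, or the controller is at fault.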
This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.