Bug 493093
Summary: | Old mptsas driver in RHEL 5.4/ Fedora 12 /RHEL 6 Beta | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Max E <max> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 13 | CC: | admin, akrherz, bernhard.kohl, bruno, codehotter, d.bz-redhat, ijones, jimk, jwboyer, kernel-maint, pasik, pmarciniak, pvogel, shigorin |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-06-27 14:08:45 UTC | Type: | --- |
Description
Max E
2009-03-31 15:52:15 UTC
The driver looks to be basically unmaintained in the upstream kernel: they fix only severe bugs AFAICT. If you have access to LSI support you might want to ask them why they don't keep the driver up-to-date.

I have emailed LSI and got some responses back.

> We have a relationship with kernel.org and do work to resolve issues
> when they arise and are reported.
>
> Please note that driver certification is not instantaneous, and we do
> not support all versions of Linux. We have limited kernel support and
> provide direct fixes for customers who work in supported
> configurations. Please accept my apologies, but we are unable to
> provide source code for our proprietary drivers.

When pushed to provide the latest code to kernel.org, I then got the following back.

> I am unable to guarantee any timeframe for newer drivers to be integrated into
> later kernels.

Can you guys try and see whether anybody in Kernel.org actually has contact with LSI? I don't think I am going to get very far from the user perspective.

Actually I have a problem with this old mptsas driver. On some of our systems, running Fedora 8 (2.6.26.8) or Fedora 10 (2.6.27.21), the hard disk goes offline every few days or even hours. At LSI they have MPT driver packages for RHEL 4 (mptlinux-3.13.04.00-2) and RHEL 5 (mptlinux-4.00.43.00-1). In the release notes of the mptlinux-3.13.04.00-2 there is already the following defect fix, which should solve my problem:

SCGCQ00019660: When the diag reset is issued through lsitutil, the driver is not clearing the pending I/O requests and hence the requests got timed out and leads to error recovery. The error recovery actions may lead to offlining of the device. Also the message frame allocated for event notification is not released when the diag reset is issued which leads to failure of message frame allocation after N number of diag resets. Both the issues are fixed in this version of the driver.
I tried to compile both drivers on Fedora 10, but both got compiler errors. Some of the scsi structures have been changed. It would be good to get this driver to the upstream kernel.

My systems:

lspci -nn
...
05:00.0 SCSI storage controller [0100]: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS [1000:0056] (rev 02)
...

dmesg
...
mptscsih: ioc0: attempting task abort! (sc=f5310500)
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 07 65 e3 99 00 00 08 00
mptscsih: ioc0: WARNING - TM Handler for type=1: IOC Not operational (0xffffffff)!
mptscsih: ioc0: WARNING - Issuing HardReset!!
mptbase: ioc0: Initiating recovery
mptbase: ioc0: WARNING - Unexpected doorbell active!
sd 0:0:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 4, sc=f5310500, mf = f59e8500, idx=ba
sd 0:0:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 4, sc=f5310700, mf = f59e9d00, idx=ea
sd 0:0:0:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 4, sc=f5310800, mf = f59e9d80, idx=eb
mptbase: ioc0: WARNING - ResetHistory bit failed to clear!
mptbase: ioc0: ERROR - Diagnostic reset FAILED! (ffffffffh)
mptbase: ioc0: WARNING - NOT READY!
mptbase: ioc0: WARNING - Cannot recover rc = -1!
mptscsih: ioc0: WARNING - TMHandler: HardReset FAILED!!
mptscsih: ioc0: task abort: FAILED (sc=f5310500)
mptscsih: ioc0: attempting task abort! (sc=f5310700)
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 07 65 e4 39 00 00 08 00
mptscsih: ioc0: task abort: SUCCESS (sc=f5310700)
mptscsih: ioc0: attempting task abort! (sc=f5310700)
sd 0:0:0:0: [sda] CDB: Test Unit Ready: 00 00 00 00 00 00
mptscsih: ioc0: task abort: SUCCESS (sc=f5310700)
mptscsih: ioc0: attempting task abort! (sc=f5310800)
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 07 65 e4 49 00 00 08 00
mptscsih: ioc0: task abort: SUCCESS (sc=f5310800)
mptscsih: ioc0: attempting task abort! (sc=f5310800)
sd 0:0:0:0: [sda] CDB: Test Unit Ready: 00 00 00 00 00 00
mptscsih: ioc0: task abort: SUCCESS (sc=f5310800)
mptscsih: ioc0: attempting target reset! (sc=f5310500)
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 07 65 e3 99 00 00 08 00
mptscsih: ioc0: target reset: FAILED (sc=f5310500)
mptscsih: ioc0: attempting bus reset! (sc=f5310500)
sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 07 65 e3 99 00 00 08 00
mptscsih: ioc0: bus reset: FAILED (sc=f5310500)
mptscsih: ioc0: attempting host reset! (sc=f5310500)
mptbase: ioc0: Initiating recovery
mptbase: ioc0: WARNING - Unexpected doorbell active!
mptbase: ioc0: WARNING - ResetHistory bit failed to clear!
mptbase: ioc0: ERROR - Diagnostic reset FAILED! (ffffffffh)
mptbase: ioc0: WARNING - NOT READY!
mptbase: ioc0: WARNING - Cannot recover rc = -1!
mptscsih: ioc0: host reset: FAILED (sc=f5310500)
sd 0:0:0:0: Device offlined - not ready after error recovery
sd 0:0:0:0: Device offlined - not ready after error recovery
sd 0:0:0:0: Device offlined - not ready after error recovery
sd 0:0:0:0: rejecting I/O to offline device
Buffer I/O error on device dm-0, logical block 15464488
lost page write due to I/O error on dm-0
...

Created attachment 340163 [details]
Source Code to replace aging mptsas driver - if you can get it to compile!
It seems that LSI ~do~ in fact give out the source code for this module from their website, so I had a quick stab at it on Fedora 10 x86_64. You will need to get the kernel sources for your kernel version, and change the kernel BUILD directory to point to where your 'build' directory is located. This is not so much a point and shoot: you need to run the bash scripts (although I suspect these will need to be modified to get them to work). The system complained that some of the kernel configs were invalid when I ran the ./compile bash script, and my knowledge of bodging kernels rather drifted away after kernel version 2.2! Would it be possible for somebody far more clever than myself to try and get this code to work for FC10?

This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Chaps - still no movement with this - still running 3.04.07 on Fedora 11

This message is a reminder that Fedora 11 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 11. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 11 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 12 now has version 3.04.12 of the driver. Still very odd code. I'm going to keep this open to keep poking RH/Fedora people to try and merge the latest code from LSI.

Created attachment 424786 [details]
Port LSI driver to 2.6.27

Found this patch here: https://bugzilla.kernel.org/show_bug.cgi?id=12163

Created attachment 424791 [details]
Port LSI driver to 2.6.29

And I did this one myself.
Please review this patch very carefully before attempting to build with it, as I don't really know what I'm doing.

I have managed to build the LSI driver 4.00.43.00 with the latest kernel using these two patches:

yum install kernel-devel gcc
wget http://www.lsi.com/DistributionSystem/AssetDocument/support/downloads/hbas/sas/software_drivers/linux/MPTLINUX_RHEL5_SLES10_PH14-4.00.43.00-1.zip
unzip MPTLINUX_RHEL5_SLES10_PH14-4.00.43.00-1.zip
cd mptlinux_RHEL5_SLES10_rel
tar xf RHEL5_SLES10.tar.gz
cd srpms-1
rpm2cpio mptlinux-4.00.43.00-1.src.rpm | cpio -idv
tar xf mptlinux-4.00.43.00.tar.gz
patch -p1 < to2.6.27.patch
patch -p1 < to2.6.29.patch
# Edit the Makefile in place; redirecting sed's output back onto its own input file would empty it.
sed -i -e "s/\(KERNEL=\)/\1$(uname -r)/" Makefile
make
ls $(uname -r)

I hope it's useful to someone.

Thanks to Emanuel, you ~have~ to download both the patches (2.6.27 and 2.6.29) and apply them to the source... thanks Emanuel. Emanuel sent this very kind reply when I asked him about the compilation process:

You do not need to change directories. You can download to2.6.27.patch and to2.6.29.patch from the bug report. They update the driver code to be compatible with the API changes in 2.6.27 and 2.6.29 respectively. After applying those two patches, you can use the sed command in the instructions to update the makefile with the kernel version you are running. If you are compiling for a different kernel, you should install the kernel-devel rpm for that kernel and edit the makefile manually instead. You can compile the driver by running "make". When the command completes, you can find the driver modules in the directory with the same name as the kernel you are compiling for. Let me warn you to back up all data before trying this driver as it is not well tested, especially not with recent kernels. Good luck!
Emanuel

I can confirm that the patch and subsequent code compile in 2.6.32.14-127.fc12.x86_64 - Fedora 12. I'll give it a go for Fedora 13 and report back!

This is still the case in RHEL 5.5.
On a Dell 2950 with an LSI SAS1068E RAID controller with the latest firmware, I get complaints from the OMSA tools that the driver is too old. On a Dell T105 I cannot update the firmware because the driver is too old. Of course I can use the official Dell drivers for it, but these drivers should be in the kernel already.

The official LSI driver 4.22.00.00 compiles nicely on RHEL 5.5.

RHEL 6 beta 2 also still has the old version 3.04.13.

Could somebody from Kernel Maintenance give us an idea of what driver we will be expecting in RHEL 6 please? The current LSI version of the code stands at 4.31. Could somebody please push a 4.x version of the code into the kernel for us please?

Created attachment 456262 [details]
Version 4.33 of the LSI MegaRAID SAS drivers
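
Several comments in this bug report the driver version shipped with a given kernel (3.04.07, 3.04.12, 3.04.13). As a minimal sketch of how that number can be read, the helper below pulls the `version:` field out of `modinfo`-style output. The function name `module_version` is made up for illustration and does not come from this thread; the sketch assumes the output uses the `field: value` layout that `modinfo` prints.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug thread): print the value of the
# "version:" field from modinfo-style text read on stdin.
# The exact match on "version:" skips similar fields such as "srcversion:".
module_version() {
    awk '$1 == "version:" { print $2; exit }'
}

# Example use on a live system, assuming the mptsas module is installed:
#   modinfo mptsas | module_version
# modinfo can also print a single field directly:
#   modinfo -F version mptsas
```

Reading from stdin keeps the helper usable on saved `modinfo` output pasted from another machine, which is how most version reports in this bug were shared.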
Did someone try that 4.33 version of megaraid_sas with a 2.6.32 kernel? The default megaraid_sas in 2.6.32.25 seems to be broken (at least on Dell R510 + PERC H700 RAID). Boot fails because the disks are not detected/enabled and the driver prints some errors/failures after a while, so I'm wondering which driver version I should upgrade to.

I just tried that megaraid_sas v4.33 and it worked OK with the Linux 2.6.32.25 kernel! (The default megaraid_sas driver v4.01 included in 2.6.32.25 is broken: it fails to bring up the disks and the boot fails.)

Pasi,
You might want to raise that issue with the broken driver as another bug, as this bug is more concerned with the version of the drivers. If the shipped driver is broken, then RH need to have this flagged as a high priority and a show-stopper.
Thanks
Max

The problems I see are related to using SMART features. The commands or responses don't seem to be passed through properly. But I don't know if the updated driver would actually fix this issue.

I'm still seeing this problem with the H700 controller. If your root partition is an array managed by the H700 controller, the system will not boot because it fails to mount the root partition. I also had this problem with 2.6.32.19 a while ago, and I'm not sure what version it was broken at, but I do know that 2.6.27.10 works. I have tried installing the 4.33 driver version included here in the following way:

make -C /lib/modules/2.6.32.25-grsec/build M=$PWD modules
cp -dp megaraid_sas.ko /lib/modules/2.6.32.25-grsec/kernel/drivers/scsi/megaraid/megaraid_sas.ko
depmod -a 2.6.32.25-grsec
mkinitrd /boot/initrd-2.6.32.25-grsec.img 2.6.32.25-grsec

Still the same problem, can't mount the root partition. It seems like this should have been an urgent issue for a while now. The Dell H700/LSI MegaSAS 9260 is really common and lots of people are looking to update their kernels with all of the recent security problems.

v4.33 works for me with H700, with Linux 2.6.32.25.
Ian: You should set up a serial console and log the full boot messages and paste them here.

This is off a new Dell R410 with a fresh F14 load. It would be nice if Dell OMSA would not complain about this old driver.

[root@newnte ~]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.

[root@newnte ~]# omreport storage controller
Controller SAS 6/iR Integrated (Embedded)

Controllers
ID : 0
Status : Non-Critical
Name : SAS 6/iR Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 00.25.47.00.06.22.03.00
Minimum Required Firmware Version : Not Applicable
Driver Version : 3.04.15
Minimum Required Driver Version : 3.12.29.00

[root@newnte ~]# modinfo mptsas
filename: /lib/modules/2.6.35.10-74.fc14.x86_64/kernel/drivers/message/fusion/mptsas.ko
version: 3.04.15
license: GPL
description: Fusion MPT SAS Host driver
author: LSI Corporation
srcversion: F404DF8025454601A4567FB
alias: pci:v00001000d00000062sv*sd*bc*sc*i*
alias: pci:v00001000d00000058sv*sd*bc*sc*i*
alias: pci:v00001000d00000056sv*sd*bc*sc*i*
alias: pci:v00001000d00000054sv*sd*bc*sc*i*
alias: pci:v00001000d00000050sv*sd*bc*sc*i*
depends: scsi_transport_sas,mptscsih,mptbase
vermagic: 2.6.35.10-74.fc14.x86_64 SMP mod_unload
parm: mpt_pt_clear: Clear persistency table: enable=1 (default=MPTSCSIH_PT_CLEAR=0) (int)
parm: max_lun: max lun, default=16895 (int)

This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.

RHEL6 final still has this very old version of the driver (3.04.16). Dell support balks at us when we try to get a drive replaced or have any support issue with the RAID controller when we're not running a version of the driver that meets the minimum required driver version expected by OMSA (currently 3.12.29.00 for the LSI SAS1068E, or SAS 6/iR as Dell refers to it). LSI.com only has up to v4.26 and that's also only listed for RHEL5. The 4.33 driver and now the 4.38 driver are for MegaRAID SAS models and don't specifically list the SAS1068E as a supported product in the readme.
Any idea when you guys can backport 4.26 or when LSI might release an official driver for the SAS1068E (SAS 6/iR) for RHEL6?

(In reply to comment #26)
> RHEL6 final still has this very old version of the driver (3.04.16). Dell

You should probably file a bug (or find one already opened) against RHEL itself. This is a closed bug against an EOL version of Fedora, and it's quite likely that nobody in RHEL is paying any attention to it.

Sorry for the noise, but is there a RHEL6 bug regarding this issue?
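
The omreport output quoted earlier in this bug shows driver 3.04.15 against a minimum required version of 3.12.29.00. As a rough sketch of that kind of check, the snippet below compares two dotted version strings in shell. It assumes GNU `sort` with the `-V` (version sort) option is available; the function name `version_lt` is hypothetical, and how OMSA itself compares versions is not shown anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug thread): succeed (exit status 0)
# when dotted version $1 is strictly older than $2.
# Assumes GNU sort, whose -V option orders version strings numerically
# field by field.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Values taken from the omreport output quoted in this bug.
if version_lt "3.04.15" "3.12.29.00"; then
    echo "driver older than required minimum"
fi
```

The same check also holds for the 3.04.16 driver reported for RHEL6 final, since it sorts before 3.12.29.00 as well.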