Created attachment 1714626 [details]
5.8 kernel log - SAS HBA SCSI probe FAILS

1. Please describe the problem:

After upgrading from kernel 5.7.17-200 to 5.8.6-201, I lose the ability to use my LSI 9201-16e SAS HBA.

2. What is the Version-Release number of the kernel:

5.8.6-201

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes; it worked on kernels before (<) 5.8.x, including 5.7.17-200.

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:

Error from 5.8.6-201:

kernel 5.8.6-201.fc32.x86_64
# dmesg |grep mpt
[ 4.464692] mpt3sas version 34.100.00.00 loaded
[ 4.473120] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16378884 kB)
[ 4.541545] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 4.541555] mpt2sas_cm0: MSI-X vectors supported: 1
[ 4.541557] mpt2sas_cm0: 0 1
[ 4.541620] mpt2sas_cm0: High IOPs queues : disabled
[ 4.541620] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 42
[ 4.541622] mpt2sas_cm0: iomem(0x00000000fb93c000), mapped(0x00000000eee9bf99), size(16384)
[ 4.541623] mpt2sas_cm0: ioport(0x000000000000b000), size(256)
[ 4.629024] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 4.629027] mpt2sas_cm0: sending message unit reset !!
[ 4.633016] mpt2sas_cm0: message unit reset: SUCCESS
[ 4.680145] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[ 4.680171] Modules linked in: crc32c_intel mpt3sas(+) serio_raw sata_sil24 uas usb_storage sky2 raid_class scsi_transport_sas fuse
[ 4.680226] base_alloc_rdpq_dma_pool+0xe2/0x17d [mpt3sas]
[ 4.680235] mpt3sas_base_attach.cold+0x3da/0x1618 [mpt3sas]
[ 4.680242] _scsih_probe+0x68e/0x7a0 [mpt3sas]
[ 4.680277] _mpt3sas_init+0x1ac/0x1000 [mpt3sas]
[ 4.680406] mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10790/_scsih_probe()!

But, it works in 5.7.17-200:

# lspci -v -s 01:00.0
01:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
	Subsystem: Broadcom / LSI 9201-16e 6Gb/s SAS/SATA PCIe x8 External HBA
	Flags: bus master, fast devsel, latency 0, IRQ 16
	I/O ports at b000 [size=256]
	Memory at fb93c000 (64-bit, non-prefetchable) [size=16K]
	Memory at fb940000 (64-bit, non-prefetchable) [size=256K]
	Expansion ROM at fb980000 [disabled] [size=512K]
	Capabilities: [50] Power Management version 3
	Capabilities: [68] Express Endpoint, MSI 00
	Capabilities: [d0] Vital Product Data
	Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [138] Power Budgeting <?>
	Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
	Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
	Kernel driver in use: mpt3sas
	Kernel modules: mpt3sas

# uname -a
Linux melka.reple.at 5.7.17-200.fc32.x86_64 #1 SMP Fri Aug 21 15:23:46 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
# dmesg |grep mpt
[ 0.000000] Device empty
[ 0.000484] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[ 4.266848] mpt3sas version 33.100.00.00 loaded
[ 4.280066] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16379236 kB)
[ 4.361448] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 4.361459] mpt2sas_cm0: MSI-X vectors supported: 1
[ 4.361460] mpt2sas_cm0: 0 1
[ 4.361519] mpt2sas_cm0: High IOPs queues : disabled
[ 4.361519] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 42
[ 4.361522] mpt2sas_cm0: iomem(0x00000000fb93c000), mapped(0x00000000c6c05bf3), size(16384)
[ 4.361522] mpt2sas_cm0: ioport(0x000000000000b000), size(256)
[ 4.454783] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ 4.454785] mpt2sas_cm0: sending message unit reset !!
[ 4.459266] mpt2sas_cm0: message unit reset: SUCCESS
[ 4.490436] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[ 4.491076] mpt2sas_cm0: request pool(0x000000005c7c3d89) - dma(0x423800000): depth(30127), frame_size(128), pool_size(3765 kB)
[ 6.529087] mpt2sas_cm0: sense pool(0x000000008e6fb40f)- dma(0x41e800000): depth(29868),element_size(96), pool_size(2800 kB)
[ 6.529449] mpt2sas_cm0: config page(0x00000000ffaf6102) - dma(0x41ef99000): size(512)
[ 6.529450] mpt2sas_cm0: Allocated physical memory: size(14663 kB)
[ 6.529451] mpt2sas_cm0: Current Controller Queue Depth(29865),Max Controller Queue Depth(32455)
[ 6.529451] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[ 6.573662] mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
[ 6.574118] mpt2sas_cm0: LSISAS2116: FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)
[ 6.574119] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 6.576413] mpt2sas_cm0: sending port enable !!
[ 6.580425] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500062b200de0d40), phys(16)
[ 6.709012] mpt2sas_cm0: port enable: SUCCESS

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No

7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.

5.7 and 5.8 logs attached.
Created attachment 1714627 [details] 5.7 kernel log - SAS HBA SCSI probe WORKS, initializes OK
I'm having this issue as well (https://bugzilla.redhat.com/show_bug.cgi?id=1877574). I can confirm that 5.7 kernels are working.
Created attachment 1715242 [details]
5.8.9-200.fc32.x86_64 kernel log - probe FAILS

Just adding the logs for the most recent kernel, 5.8.9-200.

Sep 16 22:49:27 kernel: mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10790/_scsih_probe()!
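For anyone checking other kernels, the failure line quoted above can be searched for in a saved kernel log. This is only a minimal sketch: the function name mpt_probe_failed is made up for illustration, and dmesg.txt is assumed to be a log saved with the journalctl command from the report template.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): exits 0 if the log read
# from stdin contains the mpt3sas probe failure line quoted in this report.
# The source line number is matched loosely, since it differs between kernels.
mpt_probe_failed() {
    grep -Eq 'failure at drivers/scsi/mpt3sas/mpt3sas_scsih\.c:[0-9]+/_scsih_probe\(\)'
}

# Example use, assuming: journalctl --no-hostname -k > dmesg.txt
if [ -r dmesg.txt ]; then
    if mpt_probe_failed < dmesg.txt; then
        echo "mpt3sas probe FAILED"
    else
        echo "no mpt3sas probe failure found"
    fi
fi
```

Use ``journalctl -b -1 --no-hostname -k`` instead if the failing boot was the previous one.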
Created attachment 1715604 [details]
5.8.10-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Issue persists on kernel-5.8.10-200.fc32.x86_64; logs attached.
I have the same problem with my LSI card under 5.8. I reported the bug some weeks ago, but didn't get any feedback. It works on 5.7.17.
Created attachment 1717598 [details]
5.8.11-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Issue is still present in kernel 5.8.11-200.fc32.x86_64.
5.8.12 still does not boot. I did try 5.8.12 with the new driver (35.100.00.00) with no luck. The old 33.100.00.00 driver, however, works just fine under 5.8.x.
The workaround suggested by Broadcom for 5.8.x is to set the kernel parameter "mpt3sas.max_queue_depth=10000". It works for me.
Created attachment 1718566 [details]
5.8.12-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Kernel logs for 5.8.12-200

For my part, I do NOT, and have not used the proprietary Broadcom / Avago / LSI driver(s).

All of my use and testing has been on an untainted / stock / plain Fedora built Linux kernel.
I can confirm that changing the queue depth fixed the issue on a system running 5.8.11 WITHOUT proprietary drivers (just the builtins). Still, this workaround should not be necessary and this should be fixed.
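To get a feel for how queue depth relates to the driver's DMA allocations, the pool sizes printed in the working 5.7.17 log can be reproduced as depth times element size. This is a rough sketch using only numbers from that log; pool_kb is a made-up helper name. Whether allocation size is what actually trips the probe on 5.8 is an assumption on my part: the trace only shows the failure happening under base_alloc_rdpq_dma_pool.

```shell
#!/bin/sh
# Hypothetical helper: pool size in kB (rounded down) for a given
# queue depth ($1) and per-entry size in bytes ($2).
pool_kb() {
    echo $(( $1 * $2 / 1024 ))
}

# request pool: depth(30127), frame_size(128) -> log says pool_size(3765 kB)
pool_kb 30127 128

# sense pool: depth(29868), element_size(96) -> log says pool_size(2800 kB)
pool_kb 29868 96
```

Both results match the pool_size values in the 5.7.17 log. The depth the driver ends up using when max_queue_depth=10000 is set is not shown in this thread, so no figures are given for that case.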
(In reply to James Boyle from comment #9)
> Created attachment 1718566 [details]
> 5.8.12-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS
>
> Kernel logs for 5.8.12-200
>
> For my part, I do NOT, and have not used the proprietary Broadcom / Avago /
> LSI driver(s).
>
> All of my use and testing has been on an untainted / stock / plain Fedora
> built Linux kernel.

This workaround is for the stock Fedora kernel driver.
Just run

grubby --update-kernel=ALL --args="mpt3sas.max_queue_depth=10000"

and it will work (but I do agree, it should not be needed).
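After rebooting, it may be worth checking that the parameter actually reached the running kernel. A minimal sketch, assuming a POSIX shell; cmdline_param is a made-up helper name, not an existing tool, and it simply scans /proc/cmdline for the parameter.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a kernel command-line parameter,
# or return non-zero if it is not present.
#   $1 = parameter name
#   $2 = command-line text to search (defaults to the contents of /proc/cmdline)
cmdline_param() {
    name=$1
    line=${2-$(cat /proc/cmdline)}
    for word in $line; do
        case $word in
            "$name"=*)
                # Strip everything up to and including the first '='.
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Prints the value if the workaround is active on the current boot.
cmdline_param mpt3sas.max_queue_depth || echo "workaround not active on this boot"
```

``grubby --info=ALL`` can also be used to see which args are configured for each installed kernel before rebooting.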
Created attachment 1718692 [details]
5.8.12-200.fc32.x86_64 kernel log - with workaround

Kernel logs with the workaround in place on the stock / untainted kernel 5.8.12-200

I can confirm that the driver loads with the workaround (mpt3sas.max_queue_depth=10000) in place.

Thanks for the hint!
Created attachment 1721851 [details]
5.8.14-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Same issue recurs on 5.8.14-200. Workaround is still effective.
Issue persists on 5.9.8. Workaround still works.
Upstream bug report: https://bugzilla.kernel.org/show_bug.cgi?id=209177 "mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10791/_scsih_probe()!"
This message is a reminder that Fedora 32 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 32 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.