Bug 1878332 - mpt3sas module fails to set up LSI 9201-16e on 5.8 series kernels
Summary: mpt3sas module fails to set up LSI 9201-16e on 5.8 series kernels
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 32
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-12 02:59 UTC by James Boyle
Modified: 2021-05-25 17:28 UTC

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 17:28:34 UTC
Type: Bug
Embargoed:


Attachments
5.8 kernel log - SAS HBA SCSI probe FAILS (107.09 KB, text/plain)
2020-09-12 02:59 UTC, James Boyle
5.7 kernel log - SAS HBA SCSI probe WORKS, initializes OK (106.49 KB, text/plain)
2020-09-12 03:00 UTC, James Boyle
5.8.9-200.fc32.x86_64 kernel log - probe FAILS (105.03 KB, text/plain)
2020-09-17 15:02 UTC, James Boyle
5.8.10-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS (105.28 KB, text/plain)
2020-09-21 22:09 UTC, James Boyle
5.8.11-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS (105.17 KB, text/plain)
2020-09-29 18:06 UTC, James Boyle
5.8.12-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS (106.07 KB, text/plain)
2020-10-03 01:37 UTC, James Boyle
5.8.12-200.fc32.x86_64 kernel log - with workaround (105.49 KB, text/plain)
2020-10-03 15:11 UTC, James Boyle
5.8.14-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS (106.33 KB, text/plain)
2020-10-15 14:21 UTC, James Boyle

Description James Boyle 2020-09-12 02:59:40 UTC
Created attachment 1714626 [details]
5.8 kernel log - SAS HBA SCSI probe FAILS

1. Please describe the problem:
After upgrading from kernel 5.7.17-200 to 5.8.6-201, I lose the ability to use my LSI 9201-16e SAS HBA.

2. What is the Version-Release number of the kernel:
5.8.6-201

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Yes; it worked on kernels before (<) 5.8.x, including 5.7.17-200.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Error from 5.8.6-201:
kernel 5.8.6-201.fc32.x86_64
# dmesg |grep mpt
[    4.464692] mpt3sas version 34.100.00.00 loaded
[    4.473120] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16378884 kB)
[    4.541545] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    4.541555] mpt2sas_cm0: MSI-X vectors supported: 1
[    4.541557] mpt2sas_cm0:  0 1
[    4.541620] mpt2sas_cm0: High IOPs queues : disabled
[    4.541620] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 42
[    4.541622] mpt2sas_cm0: iomem(0x00000000fb93c000), mapped(0x00000000eee9bf99), size(16384)
[    4.541623] mpt2sas_cm0: ioport(0x000000000000b000), size(256)
[    4.629024] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    4.629027] mpt2sas_cm0: sending message unit reset !!
[    4.633016] mpt2sas_cm0: message unit reset: SUCCESS
[    4.680145] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[    4.680171] Modules linked in: crc32c_intel mpt3sas(+) serio_raw sata_sil24 uas usb_storage sky2 raid_class scsi_transport_sas fuse
[    4.680226]  base_alloc_rdpq_dma_pool+0xe2/0x17d [mpt3sas]
[    4.680235]  mpt3sas_base_attach.cold+0x3da/0x1618 [mpt3sas]
[    4.680242]  _scsih_probe+0x68e/0x7a0 [mpt3sas]
[    4.680277]  _mpt3sas_init+0x1ac/0x1000 [mpt3sas]
[    4.680406] mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10790/_scsih_probe()!

But it works in 5.7.17-200:
# lspci -v -s 01:00.0
01:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
        Subsystem: Broadcom / LSI 9201-16e 6Gb/s SAS/SATA PCIe x8 External HBA
        Flags: bus master, fast devsel, latency 0, IRQ 16
        I/O ports at b000 [size=256]
        Memory at fb93c000 (64-bit, non-prefetchable) [size=16K]
        Memory at fb940000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at fb980000 [disabled] [size=512K]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [138] Power Budgeting <?>
        Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
# uname -a
Linux melka.reple.at 5.7.17-200.fc32.x86_64 #1 SMP Fri Aug 21 15:23:46 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
# dmesg |grep mpt
[    0.000000]   Device   empty
[    0.000484] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[    4.266848] mpt3sas version 33.100.00.00 loaded
[    4.280066] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (16379236 kB)
[    4.361448] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    4.361459] mpt2sas_cm0: MSI-X vectors supported: 1
[    4.361460] mpt2sas_cm0:  0 1
[    4.361519] mpt2sas_cm0: High IOPs queues : disabled
[    4.361519] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 42
[    4.361522] mpt2sas_cm0: iomem(0x00000000fb93c000), mapped(0x00000000c6c05bf3), size(16384)
[    4.361522] mpt2sas_cm0: ioport(0x000000000000b000), size(256)
[    4.454783] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    4.454785] mpt2sas_cm0: sending message unit reset !!
[    4.459266] mpt2sas_cm0: message unit reset: SUCCESS
[    4.490436] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[    4.491076] mpt2sas_cm0: request pool(0x000000005c7c3d89) - dma(0x423800000): depth(30127), frame_size(128), pool_size(3765 kB)
[    6.529087] mpt2sas_cm0: sense pool(0x000000008e6fb40f)- dma(0x41e800000): depth(29868),element_size(96), pool_size(2800 kB)
[    6.529449] mpt2sas_cm0: config page(0x00000000ffaf6102) - dma(0x41ef99000): size(512)
[    6.529450] mpt2sas_cm0: Allocated physical memory: size(14663 kB)
[    6.529451] mpt2sas_cm0: Current Controller Queue Depth(29865),Max Controller Queue Depth(32455)
[    6.529451] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[    6.573662] mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
[    6.574118] mpt2sas_cm0: LSISAS2116: FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)
[    6.574119] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[    6.576413] mpt2sas_cm0: sending port enable !!
[    6.580425] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500062b200de0d40), phys(16)
[    6.709012] mpt2sas_cm0: port enable: SUCCESS


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

5.7 and 5.8 logs attached.

Comment 1 James Boyle 2020-09-12 03:00:57 UTC
Created attachment 1714627 [details]
5.7 kernel log - SAS HBA SCSI probe WORKS, initializes OK

Comment 2 RedTed 2020-09-12 22:50:27 UTC
I'm having this issue as well (https://bugzilla.redhat.com/show_bug.cgi?id=1877574). I can confirm that 5.7 kernels are working.

Comment 3 James Boyle 2020-09-17 15:02:56 UTC
Created attachment 1715242 [details]
5.8.9-200.fc32.x86_64 kernel log - probe FAILS

Just adding the logs for the most recent kernel, 5.8.9-200. 

Sep 16 22:49:27 kernel: mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10790/_scsih_probe()!
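
When comparing several saved boot logs like the ones attached to this bug, a small helper can classify each one. The following is a minimal sketch and not part of the original report: the function names mpt_lines and classify_mpt_log are made up for illustration, and the two marker strings are copied from the log excerpts in the description. Note that a plain "grep mpt" also matches unrelated lines containing "empty" or "attempted", so the filter below matches the driver names instead.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from any package.

# Print only the mpt2sas/mpt3sas driver lines from a saved kernel log.
mpt_lines() {
    grep -E 'mpt[23]sas' "$1"
}

# Classify a saved kernel log using the markers seen in this report:
# the _scsih_probe() failure line, or the final "port enable: SUCCESS".
classify_mpt_log() {
    if grep -q 'failure at drivers/scsi/mpt3sas/' "$1"; then
        echo FAIL
    elif grep -q 'port enable: SUCCESS' "$1"; then
        echo OK
    else
        echo UNKNOWN
    fi
}
```

For example, after saving a log with journalctl --no-hostname -k > dmesg.txt, running classify_mpt_log dmesg.txt prints FAIL when the log contains the failure line quoted above.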

Comment 4 James Boyle 2020-09-21 22:09:26 UTC
Created attachment 1715604 [details]
5.8.10-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Issue persists on kernel-5.8.10-200.fc32.x86_64; logs attached.

Comment 5 Harald Evensen 2020-09-23 16:06:38 UTC
I have the same problem with my LSI card under 5.8. I reported the bug some weeks ago but didn't get any feedback. It works on 5.7.17.

Comment 6 James Boyle 2020-09-29 18:06:43 UTC
Created attachment 1717598 [details]
5.8.11-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Issue is still present in kernel 5.8.11-200.fc32.x86_64

Comment 7 Harald Evensen 2020-09-30 16:11:59 UTC
5.8.12 still does not boot. I did try 5.8.12 with the new driver (35.100.00.00), with no luck. The old 33.100.00.00 driver, however, works just fine under 5.8.x.

Comment 8 Harald Evensen 2020-10-01 05:42:28 UTC
And the workaround suggested by Broadcom for 5.8.x is to set the kernel param: "mpt3sas.max_queue_depth=10000". It works for me.
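
For context on how queue depth relates to memory allocation: in the working 5.7.17 log from the description, each reported pool size matches depth times element size. The arithmetic below is a minimal sketch that checks this against the logged values. The function name pool_kb is made up for illustration, and it is an assumption (consistent with the log, not verified against the driver source) that sizes are reported in whole kB, rounded down. It does not claim to explain the root cause of the failure.

```shell
#!/bin/sh
# pool_kb DEPTH ELEMENT_SIZE: pool size in kB, using integer division
# (rounds down). Hypothetical helper; the name is illustrative.
pool_kb() {
    echo $(( $1 * $2 / 1024 ))
}

# Values copied from the 5.7.17 log in the description:
pool_kb 30127 128   # request pool; the log reports pool_size(3765 kB)
pool_kb 29868 96    # sense pool;   the log reports pool_size(2800 kB)
```

A lower max_queue_depth would presumably shrink these pools in proportion, which is consistent with the workaround helping, but that link is an inference from the logs rather than something confirmed in this report.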

Comment 9 James Boyle 2020-10-03 01:37:30 UTC
Created attachment 1718566 [details]
5.8.12-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Kernel logs for 5.8.12-200

For my part, I do NOT use, and have not used, the proprietary Broadcom / Avago / LSI driver(s).

All of my use and testing has been on an untainted / stock / plain Fedora built Linux kernel.

Comment 10 RedTed 2020-10-03 03:48:07 UTC
I can confirm that changing the queue depth fixed the issue on a system running 5.8.11 WITHOUT proprietary drivers (just the builtins). Still, this workaround should not be necessary and this should be fixed.

Comment 11 Harald Evensen 2020-10-03 08:06:28 UTC
(In reply to James Boyle from comment #9)
> Created attachment 1718566 [details]
> 5.8.12-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS
> 
> Kernel logs for 5.8.12-200
> 
> For my part, I do NOT, and have not used the proprietary Broadcom / Avago /
> LSI driver(s).
> 
> All of my use and testing has been on an untainted / stock / plain Fedora
> built Linux kernel.

This workaround is for the stock fedora kernel driver.

Comment 12 Harald Evensen 2020-10-03 08:07:50 UTC
Just do grubby --update-kernel=ALL --args="mpt3sas.max_queue_depth=10000" and it will work (but I do agree, it should not be needed).
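
After rebooting, it can be useful to confirm that the parameter actually reached the kernel. This is a minimal sketch: the function name has_workaround is made up for illustration, and it only inspects a command-line string such as the contents of /proc/cmdline.

```shell
#!/bin/sh
# has_workaround CMDLINE: succeed if CMDLINE sets mpt3sas.max_queue_depth
# to a value starting with a digit. Hypothetical helper; the name is
# illustrative and not part of grubby or the kernel.
has_workaround() {
    case " $1 " in
        *" mpt3sas.max_queue_depth="[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a running system (not executed here):
#   has_workaround "$(cat /proc/cmdline)" && echo "workaround active"
# To undo the change later, grubby also accepts --remove-args:
#   grubby --update-kernel=ALL --remove-args="mpt3sas.max_queue_depth=10000"
```

The check is kept as a pure string test so that it can be tried on any saved command line without touching the bootloader configuration.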

Comment 13 James Boyle 2020-10-03 15:11:48 UTC
Created attachment 1718692 [details]
5.8.12-200.fc32.x86_64 kernel log - with workaround

Kernel logs with the workaround in place on the stock / untainted kernel 5.8.12-200 

I can confirm that the driver loads with the workaround (mpt3sas.max_queue_depth=10000) in place.

Thanks for the hint!

Comment 14 James Boyle 2020-10-15 14:21:07 UTC
Created attachment 1721851 [details]
5.8.14-200.fc32.x86_64 kernel log - SAS HBA SCSI probe FAILS

Same issue recurs on 5.8.14-200.  Workaround is still effective.

Comment 15 RedTed 2020-11-16 06:18:07 UTC
Issue persists on 5.9.8. Workaround still works.

Comment 16 Akemi Yagi 2021-04-11 07:35:14 UTC
Upstream bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=209177
"mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10791/_scsih_probe()!"

Comment 17 Fedora Program Management 2021-04-29 16:56:48 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 18 Ben Cotton 2021-05-25 17:28:34 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

