Bug 1734557

Summary: cavium thunderx fails to boot due to SATA errors and Failed to set up IOMMU for device
Product: [Fedora] Fedora
Reporter: Rachel Sibley <rasibley>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 31
CC: airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jbastian, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, msalter, pbunyan, pwhalen, steved, winson.lin
Target Milestone: ---
Flags: jforbes: needinfo? (rasibley)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-25 22:29:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Rachel Sibley 2019-07-30 21:09:27 UTC
1. Please describe the problem:
Cavium ThunderX systems abort during kickstart or when booting an updated kernel. This is easily reproducible, and the serial log shows the following failures:

[  855.415163] Failed to set up IOMMU for device 0000:01:01.4; retaining platform DMA ops 
[  855.423431] thunderx_mmc: probe of 0000:01:01.4 failed with error -2 
[  855.424011] Failed to set up IOMMU for device 0000:01:09.5; retaining platform DMA ops 
[  855.429866] Failed to set up IOMMU for device 0004:01:01.4; retaining platform DMA ops 
[  855.438717] i2c-thunderx 0000:01:09.5: Probed. Set system clock to 800000000 
[  855.445882] thunderx_mmc: probe of 0004:01:01.4 failed with error -2 
[  855.452703] i2c-thunderx 0000:01:09.5: SMBUS alert not active on this bus 
[  855.460315] input: soc@0:gpio-keys as /devices/platform/soc@0/soc@0:gpio-keys/input/input0 
[  855.465954] Failed to set up IOMMU for device 0004:01:09.5; retaining platform DMA ops 
[  855.482302] i2c-thunderx: probe of 0004:01:09.5 failed with error -28 

2. What is the Version-Release number of the kernel:
Fedora 30 5.0.16-300.fc30.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Have only tried F30; will try earlier versions.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Install Fedora 30 on a Cavium Thunder-x system

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Will try this soon

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

cav-thunderx2s-cn88xx-02.khw3.lab.eng.bos.redhat.com
https://beaker.engineering.redhat.com/recipes/6876119#task92902849
https://beaker.engineering.redhat.com/recipes/6874924#task92885352
https://beaker.engineering.redhat.com/recipes/6872063#task92841542


cav-thunderx2s-cn88xx-01.khw3.lab.eng.bos.redhat.com
https://beaker.engineering.redhat.com/recipes/7179323#task97018149
https://beaker.engineering.redhat.com/recipes/7179323#task97018155
https://beaker.engineering.redhat.com/recipes/7154311#task96668449

Comment 1 Jeremy Cline 2019-07-31 13:56:04 UTC
*** Bug 1734556 has been marked as a duplicate of this bug. ***

Comment 2 Rachel Sibley 2019-07-31 15:30:59 UTC
For comparison, here are the check-install logs for both the F30 and RHEL 8.1 kernels on the same system (cav-thunderx2s-cn88xx-01.khw3.lab.eng.bos.redhat.com).
Note the flood of IOMMU errors in the F30 log. I'm trying to see if I can reproduce with Rawhide as well:

* http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2019/07/36984/3698474/7179323/97018148/444876787/resultoutputfile.log
Fedora 30 5.1.19-300.fc30.aarch64 

* http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2019/07/36693/3669389/7121757/96228420/441260666/resultoutputfile.log
RHEL 8.1 4.18.0-112.el8.aarch64

Comment 4 Jeff Bastian 2019-08-16 18:18:35 UTC
The IOMMU errors might be a red herring.  I just tested this and noticed some SATA errors which may be the more likely culprit:

[   21.244850] ata1: softreset failed (1st FIS failed)
[   21.884426] ata1: SATA link down (SStatus 0 SControl 300)
[   21.922628] ata1: link online but 1 devices misclassified, retrying
[   21.963354] ata1: reset failed (errno=-11), retrying in 10 secs
[   32.514444] ata1: SATA link down (SStatus 0 SControl 300)
[   32.556714] ata1: link online but 1 devices misclassified, retrying
[   32.601420] ata1: reset failed (errno=-11), retrying in 35 secs
[   67.034446] ata1: limiting SATA link speed to 3.0 Gbps
[   67.754444] ata1: SATA link down (SStatus 0 SControl 320)
[   67.800537] ata1: link online but 1 devices misclassified, device detection might fail


Also, Beaker enables Fedora Updates by default, so the kickstart uses the Fedora 30 GA kernel 5.0.9-301.fc30.aarch64, but it installs and does first-boot into kernel 5.2.8-200.fc30.aarch64 which is the buggy kernel.  If you use ks_meta="no_updates_repos" in your Beaker job, it will disable Fedora Updates and kernel 5.0.9 works great on first-boot.

Comment 5 Rachel Sibley 2019-08-19 18:33:04 UTC
Jeff, the private comments were removed. I wasn't aware of ks_meta="no_updates_repos", thanks! I'll give it a try and report back soon; if it works OK, I can remove the exclusion in kpet-db.

Comment 6 Justin M. Forbes 2019-08-20 17:39:00 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 30 kernel bugs.

Fedora 30 has now been rebased to 5.2.9-200.fc30.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 31, and are still experiencing this issue, please change the version to Fedora 31.

If you experience different issues, please open a new bug report for those.

Comment 7 Jeff Bastian 2019-08-23 17:30:04 UTC
I tested a few kernels on a Gigabyte R120 and it looks like the SATA problems in comment 4 started with 5.2:

 OK  5.0.9-301.fc30
 OK  5.1.18-300.fc30
FAIL 5.2.1-200.fc30
FAIL 5.3.0-0.rc5.git0.1.fc31

Updating the version to Fedora 31 given the fc31 kernel is also failing.
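
For anyone repeating this kind of test, a minimal Python sketch for reading off the regression boundary from an ordered list of results is shown here. The function name `first_failure` is hypothetical, and the results are simply the four kernels from the table above, listed oldest to newest.

```python
def first_failure(results):
    """Given (kernel, passed) pairs ordered oldest to newest, return
    (last_good, first_bad). Either element may be None."""
    last_good = None
    for kernel, passed in results:
        if not passed:
            return last_good, kernel
        last_good = kernel
    return last_good, None


# Results copied from the table in this comment.
results = [
    ("5.0.9-301.fc30", True),
    ("5.1.18-300.fc30", True),
    ("5.2.1-200.fc30", False),
    ("5.3.0-0.rc5.git0.1.fc31", False),
]

print(first_failure(results))
```

With these inputs it reports 5.1.18-300.fc30 as the last good kernel and 5.2.1-200.fc30 as the first bad one, which is the range a bisect would need to cover.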

Comment 8 PaulB 2019-08-26 17:38:54 UTC
msalter is having a peek...

RT#496176: loan: gigabyte-r120-09.khw4.lab.eng.bos.redhat.com [BZ1734557]
https://engineering.redhat.com/rt/Ticket/Display.html?id=496176

Best,
-pbunyan

Comment 9 Mark Salter 2019-08-26 18:43:51 UTC
This is fallout from two things. This commit:

commit 954a03be033c7cef80ddc232e7cbdb17df735663
Author: Douglas Anderson <dianders@chromium.org>
Date:   Fri Mar 1 11:20:17 2019 -0800

    iommu/arm-smmu: Break insecure users by disabling bypass by default
    
    If you're bisecting why your peripherals stopped working, it's
    probably this CL.  Specifically if you see this in your dmesg:
      Unexpected global fault, this could be serious
    ...then it's almost certainly this CL.

and the fact that the Gigabyte firmware uses a deprecated method of describing IOMMU relationships in the devicetree:

[   15.329160] arm-smmu: deprecated "mmu-masters" DT property in use; DMA API support unavailable
    
This can be worked around in two ways:

  1) Don't use devicetree (use acpi=force on command line)

  2) Undo the effect of the above commit (use arm-smmu.disable_bypass=n on the command line)

If you want to avoid having to do anything on the command line, the firmware DT needs to be fixed to use the newer (3 years old) bindings described in:

  linux/Documentation/devicetree/bindings/iommu/arm,smmu.txt
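
To check a saved boot log for the two symptoms mentioned in this comment, a rough Python sketch could look like the following. The function name `smmu_hints` and the hint wording are illustrative assumptions; the marker strings are copied from the messages quoted above, and a plain substring match may miss variants printed by other kernel versions.

```python
# Marker substrings taken from the messages quoted in this comment,
# mapped to a short, informal explanation of what each one suggests.
MARKERS = {
    "Unexpected global fault":
        "SMMU global fault; likely related to bypass being disabled "
        "by default (commit 954a03be033c)",
    'deprecated "mmu-masters" DT property':
        "firmware devicetree uses the deprecated mmu-masters binding",
}


def smmu_hints(dmesg_text):
    """Return a hint for every known marker found in the log text."""
    return [hint for marker, hint in MARKERS.items() if marker in dmesg_text]
```

It can be fed the output of `journalctl --no-hostname -k` saved to a file; an empty list only means that neither marker string was present, not that the SMMU configuration is fine.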

Comment 10 Jeff Bastian 2019-08-26 19:57:06 UTC
Thanks Mark!  I should have realized this was a DeviceTree vs ACPI issue.

I tested the suggested kernel command line args with kernel 5.2.1-200.fc30 on a Gigabyte R120 system with T49 firmware.  Using just acpi=force by itself fixed the IOMMU errors (comment 0) and SATA errors (comment 4); however, the system still semi-froze later in boot (it wasn't truly frozen, but the boot log messages slowed to about one per minute).

Next I tried arm-smmu.disable_bypass=n by itself, and the system booted successfully.

Finally, I tried both acpi=force and arm-smmu.disable_bypass=n together and it also booted successfully.

Thus it looks like arm-smmu.disable_bypass=n is the solution for this issue (barring a firmware fix from Gigabyte).


For the record, partial boot logs of both args together:

[root@gigabyte-r120-04 ~]# dmesg | grep -i -e Linux.version -e acpi -e arm-smmu -e ata[0-4]
[    0.000000] Linux version 5.2.1-200.fc30.aarch64 (mockbuild@buildvm-aarch64-01.arm.fedoraproject.org) (gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC)) #1 SMP Sat Jul 20 23:21:00 UTC 2019
[    0.000000] efi:  ESRT=0xffce0ff18  SMBIOS 3.0=0xfffb0000  ACPI 2.0=0xffa810000  MEMRESERVE=0xffc871e18 
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x0000000FFA810000 000024 (v02 ALASKA)
[    0.000000] ACPI: XSDT 0x0000000FFA810028 00008C (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.000000] ACPI: FACP 0x0000000FFA8100B8 000114 (v06 ALASKA A M I    01072009 AMI  00010013)
[    0.000000] ACPI: DSDT 0x0000000FFA8101D0 00220B (v02 CAVIUM THUNDERX 00000001 INTL 20130517)
[    0.000000] ACPI: SPMI 0x0000000FFA8123E0 000041 (v05 ALASKA A M I    00000000 AMI. 00000000)
[    0.000000] ACPI: FIDT 0x0000000FFA812428 00009C (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.000000] ACPI: APIC 0x0000000FFA8124C8 000F68 (v03 CAVIUM THUNDERX 00000001 INTL 20150619)
[    0.000000] ACPI: DBG2 0x0000000FFA813430 000067 (v01 CAVIUM CN88XDBG 00000000 INTL 20150619)
[    0.000000] ACPI: GTDT 0x0000000FFA813498 0000E0 (v02 CAVIUM THUNDERX 00000001 INTL 20150619)
[    0.000000] ACPI: IORT 0x0000000FFA813578 0013D4 (v01 CAVIUM THUNDERX 00000001 INTL 20150619)
[    0.000000] ACPI: MCFG 0x0000000FFA814950 00006C (v01 CAVIUM THUNDERX 00000001 INTL 20150619)
[    0.000000] ACPI: SSDT 0x0000000FFA8149C0 00089C (v02 CAVIUM NETWORK  00000001 INTL 20150619)
[    0.000000] ACPI: OEM1 0x0000000FFA815260 0001E8 (v02 CAVIUM THUNDERX 00000001 INTL 20150619)
[    0.000000] ACPI: SLIT 0x0000000FFA815448 000030 (v01 CAVIUM TEMPLATE 00000001 INTL 20150619)
[    0.000000] ACPI: SPCR 0x0000000FFA815478 000050 (v02 A M I  APTIO V  01072009 AMI. 0005000B)
[    0.000000] ACPI: BGRT 0x0000000FFA8154C8 000038 (v01 ALASKA A M I    01072009 AMI  00010013)
[    0.000000] ACPI: SPCR: console: pl011,mmio32,0x87e024000000,115200
[    0.000000] ACPI: NUMA: Failed to initialise from firmware
[    0.000000] psci: probing for conduit method from ACPI.
[    0.000000] ACPI: SRAT not present
[    0.000000] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.2.1-200.fc30.aarch64 root=/dev/mapper/fedora_gigabyte--r120--04-root ro rd.lvm.lv=fedora_gigabyte-r120-04/root rd.lvm.lv=fedora_gigabyte-r120-04/swap arm-smmu.disable_bypass=n acpi=force
[    0.000000] ACPI: SRAT not present
[    0.000218] ACPI: Core revision 20190509
[    0.023551] ACPI PPTT: No PPTT table found, CPU and cache topology may be inaccurate
[    0.073381] ACPI: bus type PCI registered
[    0.073386] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.469275] ACPI: Added _OSI(Module Device)
[    0.469280] ACPI: Added _OSI(Processor Device)
[    0.469284] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.469288] ACPI: Added _OSI(Processor Aggregator Device)
[    0.469292] ACPI: Added _OSI(Linux-Dell-Video)
[    0.469296] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.469300] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    0.472863] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.475107] ACPI: Interpreter enabled
[    0.475112] ACPI: Using GIC for interrupt routing
[    0.475142] ACPI: MCFG table detected, 4 entries
[    0.496810] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-1f])
[    0.496822] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    0.496960] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug PME AER LTR]
[    0.497074] acpi PNP0A08:00: _OSC: OS now controls [PCIeCapability]
[    0.497514] acpi PNP0A08:00: ECAM area [mem 0x848000000000-0x848001ffffff] reserved by CAVA02C:00
[    0.497850] acpi PNP0A08:00: ECAM at [mem 0x848000000000-0x848001ffffff] for [bus 00-1f]
[    1.536106] ACPI: PCI Root Bridge [PCI1] (domain 0001 [bus 00-1f])
[    1.536114] acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    1.536242] acpi PNP0A08:01: _OSC: platform does not support [PCIeHotplug PME AER LTR]
[    1.536354] acpi PNP0A08:01: _OSC: OS now controls [PCIeCapability]
[    1.536806] acpi PNP0A08:01: ECAM area [mem 0x849000000000-0x849001ffffff] reserved by CAVA02C:01
[    1.537137] acpi PNP0A08:01: ECAM at [mem 0x849000000000-0x849001ffffff] for [bus 00-1f]
[    1.537999] ACPI: PCI Root Bridge [PCI2] (domain 0002 [bus 00-1f])
[    1.538008] acpi PNP0A08:02: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    1.538134] acpi PNP0A08:02: _OSC: platform does not support [PCIeHotplug PME AER LTR]
[    1.538245] acpi PNP0A08:02: _OSC: OS now controls [PCIeCapability]
[    1.538716] acpi PNP0A08:02: ECAM area [mem 0x84a000000000-0x84a001ffffff] reserved by CAVA02C:02
[    1.539032] acpi PNP0A08:02: ECAM at [mem 0x84a000000000-0x84a001ffffff] for [bus 00-1f]
[    2.570959] ACPI: PCI Root Bridge [PCI3] (domain 0003 [bus 00-1f])
[    2.570968] acpi PNP0A08:03: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    2.571095] acpi PNP0A08:03: _OSC: platform does not support [PCIeHotplug PME AER LTR]
[    2.571206] acpi PNP0A08:03: _OSC: OS now controls [PCIeCapability]
[    2.571700] acpi PNP0A08:03: ECAM area [mem 0x84b000000000-0x84b001ffffff] reserved by CAVA02C:03
[    2.572018] acpi PNP0A08:03: ECAM at [mem 0x84b000000000-0x84b001ffffff] for [bus 00-1f]
[    2.572224] ACPI: PCI Root Bridge [PEM0] (domain 0004 [bus 1f-57])
[    2.572233] acpi PNP0A08:04: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[    2.572357] acpi PNP0A08:04: _OSC: platform does not support [PCIeHotplug PME AER LTR]
[    2.572469] acpi PNP0A08:04: _OSC: OS now controls [PCIeCapability]
[    2.572485] acpi PNP0A08:04: MCFG quirk: ECAM at [mem 0x88001f000000-0x880057ffffff] for [bus 1f-57] with thunder_pem_ecam_ops
[    2.573004] acpi PNP0A08:04: ECAM area [mem 0x88001f000000-0x880057ffffff] reserved by CAVA02B:00
[    2.582257] acpi PNP0A08:04: ECAM at [mem 0x88001f000000-0x880057ffffff] for [bus 1f-57]
[    2.583693] ACPI: PCI Interrupt Link [LN0A] (IRQs *48)
[    2.583734] ACPI: PCI Interrupt Link [LN0B] (IRQs *49)
[    2.583773] ACPI: PCI Interrupt Link [LN0C] (IRQs *50)
[    2.583811] ACPI: PCI Interrupt Link [LN0D] (IRQs *51)
[    6.248026] ACPI: bus type USB registered
[    6.391085] pnp: PnP ACPI init
[    6.402439] system 00:00: Plug and Play ACPI device, IDs CAVa02c PNP0c02 (active)
[    6.410317] system 00:01: Plug and Play ACPI device, IDs CAVa02c PNP0c02 (active)
[    6.418186] system 00:02: Plug and Play ACPI device, IDs CAVa02c PNP0c02 (active)
[    6.426048] system 00:03: Plug and Play ACPI device, IDs CAVa02c PNP0c02 (active)
[    6.441578] system 00:04: Plug and Play ACPI device, IDs CAVa02b PNP0c02 (active)
[    6.443614] pnp: PnP ACPI: found 5 devices
[    7.528263] ACPI: Power Button [PWRB]
[    7.533081] ACPI GTDT: [Firmware Bug]: failed to get the Watchdog base address.
[    7.559215] arm-smmu arm-smmu.0.auto: probing hardware configuration...
[    7.565818] arm-smmu arm-smmu.0.auto: SMMUv2 with:
[    7.570611] arm-smmu arm-smmu.0.auto: 	stage 1 translation
[    7.576083] arm-smmu arm-smmu.0.auto: 	stage 2 translation
[    7.581565] arm-smmu arm-smmu.0.auto: 	nested translation
[    7.586952] arm-smmu arm-smmu.0.auto: 	non-coherent table walk
[    7.592776] arm-smmu arm-smmu.0.auto: 	(IDR0.CTTW overridden by FW configuration)
[    7.600250] arm-smmu arm-smmu.0.auto: 	stream matching with 128 register groups
[    7.607549] arm-smmu arm-smmu.0.auto: 	128 context banks (0 stage-2 only)
[    7.614328] arm-smmu arm-smmu.0.auto: 	enabling workaround for Cavium erratum 27704
[    7.621977] arm-smmu arm-smmu.0.auto: 	Supported page sizes: 0x62215000
[    7.628582] arm-smmu arm-smmu.0.auto: 	Stage-1: 48-bit VA -> 48-bit IPA
[    7.635185] arm-smmu arm-smmu.0.auto: 	Stage-2: 48-bit IPA -> 48-bit PA
[    7.642339] arm-smmu arm-smmu.1.auto: probing hardware configuration...
[    7.648954] arm-smmu arm-smmu.1.auto: SMMUv2 with:
[    7.653733] arm-smmu arm-smmu.1.auto: 	stage 1 translation
[    7.659210] arm-smmu arm-smmu.1.auto: 	stage 2 translation
[    7.664682] arm-smmu arm-smmu.1.auto: 	nested translation
[    7.670072] arm-smmu arm-smmu.1.auto: 	non-coherent table walk
[    7.675892] arm-smmu arm-smmu.1.auto: 	(IDR0.CTTW overridden by FW configuration)
[    7.683366] arm-smmu arm-smmu.1.auto: 	stream matching with 128 register groups
[    7.690668] arm-smmu arm-smmu.1.auto: 	128 context banks (0 stage-2 only)
[    7.697443] arm-smmu arm-smmu.1.auto: 	enabling workaround for Cavium erratum 27704
[    7.705093] arm-smmu arm-smmu.1.auto: 	Supported page sizes: 0x62215000
[    7.711705] arm-smmu arm-smmu.1.auto: 	Stage-1: 48-bit VA -> 48-bit IPA
[    7.718311] arm-smmu arm-smmu.1.auto: 	Stage-2: 48-bit IPA -> 48-bit PA
[    7.725419] arm-smmu arm-smmu.2.auto: probing hardware configuration...
[    7.732034] arm-smmu arm-smmu.2.auto: SMMUv2 with:
[    7.736813] arm-smmu arm-smmu.2.auto: 	stage 1 translation
[    7.742291] arm-smmu arm-smmu.2.auto: 	stage 2 translation
[    7.747768] arm-smmu arm-smmu.2.auto: 	nested translation
[    7.753154] arm-smmu arm-smmu.2.auto: 	non-coherent table walk
[    7.758979] arm-smmu arm-smmu.2.auto: 	(IDR0.CTTW overridden by FW configuration)
[    7.766450] arm-smmu arm-smmu.2.auto: 	stream matching with 128 register groups
[    7.773752] arm-smmu arm-smmu.2.auto: 	128 context banks (0 stage-2 only)
[    7.780532] arm-smmu arm-smmu.2.auto: 	enabling workaround for Cavium erratum 27704
[    7.788180] arm-smmu arm-smmu.2.auto: 	Supported page sizes: 0x62215000
[    7.794782] arm-smmu arm-smmu.2.auto: 	Stage-1: 48-bit VA -> 48-bit IPA
[    7.801389] arm-smmu arm-smmu.2.auto: 	Stage-2: 48-bit IPA -> 48-bit PA
[    7.808490] arm-smmu arm-smmu.3.auto: probing hardware configuration...
[    7.815093] arm-smmu arm-smmu.3.auto: SMMUv2 with:
[    7.819885] arm-smmu arm-smmu.3.auto: 	stage 1 translation
[    7.825358] arm-smmu arm-smmu.3.auto: 	stage 2 translation
[    7.830835] arm-smmu arm-smmu.3.auto: 	nested translation
[    7.836222] arm-smmu arm-smmu.3.auto: 	non-coherent table walk
[    7.842047] arm-smmu arm-smmu.3.auto: 	(IDR0.CTTW overridden by FW configuration)
[    7.849522] arm-smmu arm-smmu.3.auto: 	stream matching with 128 register groups
[    7.856820] arm-smmu arm-smmu.3.auto: 	128 context banks (0 stage-2 only)
[    7.863607] arm-smmu arm-smmu.3.auto: 	enabling workaround for Cavium erratum 27704
[    7.871259] arm-smmu arm-smmu.3.auto: 	Supported page sizes: 0x62215000
[    7.877866] arm-smmu arm-smmu.3.auto: 	Stage-1: 48-bit VA -> 48-bit IPA
[    7.884467] arm-smmu arm-smmu.3.auto: 	Stage-2: 48-bit IPA -> 48-bit PA
[    7.937572] ata1: SATA max UDMA/133 abar m2097152@0x814000000000 port 0x814000000100 irq 17
[    7.985210] ata2: SATA max UDMA/133 abar m2097152@0x815000000000 port 0x815000000100 irq 18
[    8.032820] ata3: SATA max UDMA/133 abar m2097152@0x816000000000 port 0x816000000100 irq 19
[    8.080488] ata4: SATA max UDMA/133 abar m2097152@0x817000000000 port 0x817000000100 irq 20
[    8.318864] ata2: SATA link down (SStatus 0 SControl 300)
[    8.368867] ata3: SATA link down (SStatus 0 SControl 300)
[    8.418871] ata4: SATA link down (SStatus 0 SControl 300)
[    8.437609] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    8.444593] ata1.00: ATA-8: WDC WD5003ABYZ-011FA0, 01.01S03, max UDMA/133
[    8.451382] ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[    8.459104] ata1.00: configured for UDMA/133
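
To summarize the per-port link state from a log like the one above without reading it by eye, a small Python sketch follows. The function name `sata_link_states` is hypothetical, and the regex assumes the "ataN: SATA link up/down" message format seen in these logs.

```python
import re

# Matches lines such as:
#   [    8.318864] ata2: SATA link down (SStatus 0 SControl 300)
#   [    8.437609] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
LINK = re.compile(r"\b(ata\d+): SATA link (up|down)")


def sata_link_states(dmesg_text):
    """Map each ATA port to the most recent link state reported for it."""
    states = {}
    for port, state in LINK.findall(dmesg_text):
        states[port] = state  # later messages overwrite earlier ones
    return states
```

Because later messages overwrite earlier ones, a port that flaps during reset retries (as ata1 did in comment 4) is reported with its final state only.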

Comment 11 Jeff Bastian 2019-08-26 20:27:45 UTC
The latest Fedora kernel in Koji, kernel-5.3.0-0.rc5.git2.1.fc32, also works well with these command line options.

[root@gigabyte-r120-04 ~]# uname -r
5.3.0-0.rc5.git2.1.fc32.aarch64

[root@gigabyte-r120-04 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.3.0-0.rc5.git2.1.fc32.aarch64 root=/dev/mapper/fedora_gigabyte--r120--04-root ro rd.lvm.lv=fedora_gigabyte-r120-04/root rd.lvm.lv=fedora_gigabyte-r120-04/swap arm-smmu.disable_bypass=n acpi=force

Comment 12 Mark Salter 2019-08-26 20:41:03 UTC
I think the acpi boot issue is something else. I've seen it too.
The disable_bypass flag is a no-op with ACPI. If firmware provides an IORT table, the IOMMU gets set up the same way regardless of the flag.

Comment 13 winson.lin 2019-08-27 15:13:47 UTC
Hi Paul,

Regarding the Cavium ThunderX Linux IOMMU, please apply the kernel parameter below, thanks.

iommu.passthrough=1

BR, Winson

Comment 14 Mark Salter 2019-08-27 16:43:47 UTC
Ah, I forgot that bit. We have a patch in RHEL for forcing passthrough. With Fedora, you need both iommu.passthrough=1 and arm-smmu.disable_bypass=n.
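
A quick way to confirm that a booted system actually has both parameters is to inspect its kernel command line. The Python sketch below is only an illustration: `parse_cmdline` and `missing_workarounds` are hypothetical helper names, and the comparison is an exact string match, so other spellings the kernel may accept for the same setting are not recognized.

```python
# The two parameters named in this comment, with the values given there.
REQUIRED = {"iommu.passthrough": "1", "arm-smmu.disable_bypass": "n"}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_workarounds(cmdline):
    """Return the required 'name=value' settings absent from cmdline."""
    params = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in REQUIRED.items() if params.get(k) != v]
```

On a running system the input would come from `/proc/cmdline`, for example `missing_workarounds(open("/proc/cmdline").read())`. For the command line shown in comment 11 it would report `iommu.passthrough=1` as missing.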

Comment 15 Justin M. Forbes 2020-03-03 16:37:19 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 31 kernel bugs.

Fedora 31 has now been rebased to 5.5.7-200.fc31.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 32, and are still experiencing this issue, please change the version to Fedora 32.

If you experience different issues, please open a new bug report for those.

Comment 16 Justin M. Forbes 2020-03-25 22:29:57 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.