Bug 1285107 - HPE ProLiant m400 Moonshot cartridge fails to boot 4.4+ upstream kernels due to a change in EOI handling (MADT update required in firmware)
Status: ASSIGNED
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Mark Salter
QA Contact: Fedora Extras Quality Assurance
Blocks: 1045641 1365499

Reported: 2015-11-24 16:47 EST by Jeremy Linton
Modified: 2017-06-06 09:35 EDT
Doc Type: Bug Fix
Clones: 1365499 1459186
Type: Bug


Attachments
APM patch change gic interface base address (1.59 KB, patch)
2016-04-07 17:59 EDT, Tuan Phan

Description Jeremy Linton 2015-11-24 16:47:53 EST
Description of problem: The code merged in 4.3 that changes the way EOI is processed on arm64 depends on the memory-mapped GIC register addresses being correctly placed in the last 4K of the 64K window. If this isn't the case, the kernel is unable to boot.


Version-Release number of selected component (if applicable): kernels >= 4.3


How reproducible: 100% on ACPI firmware machines.


Steps to Reproduce:
1. Build an arm64 kernel with ACPI support
2. Boot with `acpi=force verbose debug` etc.
3. The console stops responding after the initial probe

Here is the output from the ACPI dump:

[02Ch 0044   1]                Subtable Type : 0B [Generic Interrupt Controller]
[02Dh 0045   1]                       Length : 4C
[02Eh 0046   2]                     Reserved : 0000
[030h 0048   4]         CPU Interface Number : 00000000
[034h 0052   4]                Processor UID : 00000000
[038h 0056   4]        Flags (decoded below) : 00000001
                           Processor Enabled : 1
          Performance Interrupt Trigger Mode : 0
          Virtual GIC Interrupt Trigger Mode : 0
[03Ch 0060   4]     Parking Protocol Version : 00000001
[040h 0064   4]        Performance Interrupt : 0000001C
[044h 0068   8]               Parked Address : 0000004000008000
[04Ch 0076   8]                 Base Address : 00000000780A0000
[054h 0084   8]     Virtual GIC Base Address : 00000000780E0000
[05Ch 0092   8]  Hypervisor GIC Base Address : 00000000780C0000
[064h 0100   4]        Virtual GIC Interrupt : 00000019
[068h 0104   8]   Redistributor Base Address : 0000000078090000
[070h 0112   8]                    ARM MPIDR : 0000000000000000
...
[28Ch 0652   1]                Subtable Type : 0C [Generic Interrupt Distributor]
[28Dh 0653   1]                       Length : 18
[28Eh 0654   2]                     Reserved : 0000
[290h 0656   4]        Local GIC Hardware ID : 00000000
[294h 0660   8]                 Base Address : 0000000078090000
[29Ch 0668   4]               Interrupt Base : 00000000
[2A0h 0672   1]                      Version : 00
[2A1h 0673   3]                     Reserved : 000000

The CPU interface Base Address 0x00000000780A0000 should have 0xF000 added to it, giving 0x00000000780AF000.

The distributor base is incorrect as well, AFAIK. It should be 000000007809F000, but this doesn't cause problems when booted in UP mode without KVM...
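
The check described above can be sketched in a few lines of Python. This is only an illustration, not a tool used in this report: `parse_fields`, `needs_fixup`, and `fixed` are made-up names, the input is three lines copied from the GICC subtable dump above, and the rule "the low 16 bits must be 0xF000" is an assumption that applies to this X-Gene layout rather than to GICs in general.

```python
import re

# Matches dump lines such as
# "[04Ch 0076   8]                 Base Address : 00000000780A0000"
FIELD_RE = re.compile(
    r"^\[[0-9A-Fa-f]+h\s+\d+\s+\d+\]\s+(.+?)\s+:\s+([0-9A-Fa-f]+)\s*$"
)


def parse_fields(dump_text):
    """Return (name, value) pairs for every plain hex field in the dump text."""
    fields = []
    for line in dump_text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields.append((match.group(1), int(match.group(2), 16)))
    return fields


def needs_fixup(addr):
    """Assumed rule for this platform: the address must sit in the last 4K
    of its 64K window, i.e. its low 16 bits must be 0xF000."""
    return addr != 0 and (addr & 0xFFFF) != 0xF000


def fixed(addr):
    """Move an address to the last 4K page of its 64K window."""
    return (addr & ~0xFFFF) | 0xF000


# GICC subtable lines copied from the dump in this description.
GICC_DUMP = """\
[04Ch 0076   8]                 Base Address : 00000000780A0000
[054h 0084   8]     Virtual GIC Base Address : 00000000780E0000
[05Ch 0092   8]  Hypervisor GIC Base Address : 00000000780C0000
"""

for name, addr in parse_fields(GICC_DUMP):
    if needs_fixup(addr):
        print(f"{name}: {addr:#x} -> {fixed(addr):#x}")
```

Only the GICC subtable lines are fed in on purpose: the distributor subtable also has a "Base Address" field, and whether that one needs changing is a separate question.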
Comment 1 Mark Salter 2015-12-15 13:21:17 EST
I don't think upstream will accept a kernel patch for this. It needs to be handled in the ACPI tables.
Comment 2 Jeremy Linton 2016-01-13 10:03:55 EST
BTW: The current ACPI/GICv3 patches include a call to the DT quirking mechanism to work around this problem. 

That doesn't mean the table doesn't need to be fixed IMHO.
Comment 3 Linda Knippers 2016-03-31 12:54:33 EDT
Joe/Thomas, what do you think about this?

We now have the 03/04 FW installed.
Comment 4 Thomas Palmer 2016-03-31 16:22:27 EDT
We put several ACPI fixes into the latest ROM to appease the ACPI tools.  Are you saying this issue is still present with the 3/14 ROM? Or are you asking whether it should be tested again?

Either way, we should grab the latest fwts output.  I used the latest version available when the ROM was built, and Marvin and I could explain away the errors that appeared as somehow irrelevant.
Comment 5 Linda Knippers 2016-03-31 16:41:01 EDT
(In reply to Thomas Palmer from comment #4)
> We put in several ACPI fixes into the latest rom to appease the ACPI tools. 
> Are you saying this issue is still present with the 3/14 ROM? Or are you
> asking whether it should be tested again?

I believe it is still present on the 03/04 ROM, which has just been
rolled out across more m400's.  That's the latest version on your webpage.
Comment 6 Thomas Palmer 2016-03-31 16:54:09 EDT
Sorry, mis-typed on "3/14".

Can you send me the error report, or explain how they are detecting the issue?
Comment 7 Thomas Palmer 2016-03-31 16:55:34 EDT
To clarify:  I see RH's dump above, but I am wondering how they came across the issue.  Is there a particular Linux kernel error message, for instance?
Comment 8 Linda Knippers 2016-03-31 17:12:21 EDT
(In reply to Thomas Palmer from comment #7)
> To clarify:  I see RH's dump above, but I am wondering how they came across
> the issue.  Is there a particular Linux kernel error message, for instance?

I think the problem was initially discovered when a particular kernel
version (kernel-4.6.0-0.rc1.git0.1.fc25.aarch64) failed to boot, and
apparently it fails in kernels since 4.3.

I don't know if there is a pre-built version of that kernel that someone 
can provide.  Jeremy?

It apparently doesn't fail with a RHELSA kernel or with a kernel
that has some patches that workaround the problem.  

Perhaps Jeremy and/or Mark can provide more specifics.
Comment 9 Joe Shifflett 2016-03-31 17:25:39 EDT
Are you referring to this from 8.14 of the GIC spec?


To enable use of 64KB pages, the GICV_* memory map must ensure that:
• The base address of the GICV_* registers is 64KB aligned.
• An alias of the GICV_* registers is provided starting at offset 0xF000 from the start of the page such that a
second copy of GICV_DIR exists at the start of the next 64KB page.



The virtual registers are the only ones I could find that reference the 4KB below a 64KB boundary.

Also, this table describes HW registers.  Are you sure they are actually in the location you're asking for?  Have we consulted APM?
Comment 10 Peter Robinson 2016-03-31 17:53:03 EDT
(In reply to Linda Knippers from comment #8)
> (In reply to Thomas Palmer from comment #7)
> > To clarify:  I see RH's dump above, but I am wondering how they came across
> > the issue.  Is there a particular Linux kernel error message, for instance?
> 
> I think the problem was initially discovered when a particular kernel
> version (kernel-4.6.0-0.rc1.git0.1.fc25.aarch64) failed to boot, and
> apparently it fails in kernels since 4.3.
> 
> I don't know if there is a pre-built version of that kernel that someone 
> can provide.  Jeremy?

It can be retrieved from the build system here:
http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=364689
Comment 11 Jeremy Linton 2016-04-01 10:55:50 EDT
I initially hit a boot failure with 4.5 when the console was being reactivated with interrupts enabled, and bisected it to Mark Z's EOI changes. A quick conversation with him led to my understanding of the problem, and a patch to add 0xF000 to the base for ACPI systems corrected it.

You can see the fixup in the DT path here:

http://lxr.free-electrons.com/source/drivers/irqchip/irq-gic.c#L1154

Let me reread the GIC spec and see if I can pin down where it says it should be that way.

(BTW: I assume that someone has seen the failure in a vanilla 4.6 as well?)
Comment 12 Jeremy Linton 2016-04-01 13:01:48 EDT
Well, my reading of the GICv2 spec doesn't lead directly to a point where it says the base must be offset by 60K. What it does say (similar to the v3 one) is that access to the register set must be the same regardless of native/VM. It also says that GICV_DIR must be in a separate 4K page; that is, the GICC_DIR register is at offset 0x1000 from the GICC base.

On X-Gene this isn't true with 64K pages: the GICC_DIR register is offset by 0x10000 from the base, and at offset 0x1000 is an alias of the first 4K. That is why the current code, which simply accesses GICC_DIR at gicc_base+0x1000, fails.

Mark Z explains this in his own way:

http://lkml.iu.edu/hypermail/linux/kernel/1601.2/01631.html
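
The address arithmetic in this comment can be worked through with a toy model. This is a sketch under stated assumptions, not a description of the hardware: `decode` and `dir_address` are hypothetical helpers, and the model assumes the second register page is aliased across its own 64K window in the same way as the first, which this comment does not state.

```python
GICC_DIR_OFFSET = 0x1000   # offset used for GICC_DIR, per the comment above
PAGE_4K = 0x1000
WINDOW_64K = 0x10000


def dir_address(gicc_base):
    """Address that would be used for GICC_DIR given a CPU interface base."""
    return gicc_base + GICC_DIR_OFFSET


def decode(phys, window_base):
    """Toy model: map a physical address to (register page, offset in page).

    Page 0 holds the main GICC registers and is aliased every 4K across the
    first 64K window. Page 1 holds GICC_DIR and starts 64K above window_base
    (assumed to be aliased across its own 64K window in the same way).
    """
    off = phys - window_base
    if not 0 <= off < 2 * WINDOW_64K:
        raise ValueError("address outside the modelled GICC region")
    return off // WINDOW_64K, off % PAGE_4K


WINDOW = 0x780A0000  # 64K-aligned base from the original MADT

# Original table: base + 0x1000 lands in an alias of page 0, not on GICC_DIR.
print(decode(dir_address(WINDOW), WINDOW))           # (0, 0)
# Base moved up by 0xF000: base + 0x1000 is the start of page 1.
print(decode(dir_address(WINDOW + 0xF000), WINDOW))  # (1, 0)
```

With the base at 0x780AF000, the first 4K still reads as page 0 through its last alias, and the next 4K is the real second page, so the usual base+0x1000 offset works.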
Comment 13 Tuan Phan 2016-04-07 17:21:39 EDT
APM fixed this issue in release 1.16.01 by changing 0x780a0000 to 0x780af000 in the ACPI table.
Comment 14 Thomas Palmer 2016-04-07 17:31:39 EDT
Tuan,

Can you send me (thomas.palmer@hpe.com) a patch of that change? I do not believe we've seen the 1.16.01 bundle yet.  I only have access to 1.15.23 on myapm.

Thomas
Comment 15 Tuan Phan 2016-04-07 17:59 EDT
Created attachment 1144903
APM patch change gic interface base address
Comment 16 Linda Knippers 2016-04-13 16:47:14 EDT
I've updated the c30 m400 with a test BIOS so Jeremy can try it.

Right now the system is running 4.5.0-0.33.el7.aarch64 but will
boot the 4.2.0-0.21.el7.aarch64 by default.
Comment 17 Linda Knippers 2016-06-23 17:46:55 EDT
Jeremy, we're wondering if you've had a chance to re-test this problem on c30, which has the newest firmware.
Comment 18 Peter Robinson 2016-06-23 18:01:19 EDT
I'll be testing this in the next few days on the Fedora system, and I'll be looking at firmware then. Details on which firmware I should be using for an upstream 4.7-rc4 + ACPI PCIe patches would be useful.
Comment 19 Linda Knippers 2016-06-23 18:45:41 EDT
(In reply to Peter Robinson from comment #18)
> I'll be testing this in the next few days on the Fedora system, I'll be
> looking at firmwares then. Details of firmwares would be useful as to which
> I should be using for an upstream 4.7rc4 + ACPI PCIe patches

The only system at Red Hat that has FW with the fix is hp-moonshot-02-c30.khw.lab.eng.bos.redhat.com.  I can put that version of FW on a different m400 for testing if needed once you're ready.
Comment 20 Jeremy Linton 2016-06-27 14:15:45 EDT
OK, I will get on it again. Is this a newer firmware than last time? I thought a (IIRC 4.6?) kernel was booting OK on it with the additional patches for PCIe/X-Gene?

Although I'm at Summit this week, so it's going to be spotty.
Comment 21 Linda Knippers 2016-06-28 10:05:20 EDT
(In reply to Jeremy Linton from comment #20)
> Ok, I will get on it again. Is this a newer firmware than last time? I
> thought I (IIRC 4.6?) kernel was booting ok on it with the additional
> patches for pcie/xgene?
> 
> Although I'm at summit this week, so its going to be spotty.

It's the same FW as last time but I don't recall seeing an update that your kernel worked.

Thanks!
Comment 22 Jeremy Linton 2016-07-06 21:00:09 EDT
I think I tested it before, but I installed a mostly mainline 4.7-rc5 plus the accepted PCI patches and 48-bit VA, and it actually boots all the way up, even without any X-Gene quirking (although the network does not work without the X-Gene PCI quirk, and there are a couple of ugly messages). So the CPUs come online, and the SATA disk works too.

Crazy!!! Mainline very nearly works out of the box on this machine now.

[000h 0000   4]                    Signature : "APIC"    [Multiple APIC Description Table (MADT)]
[004h 0004   4]                 Table Length : 000002C4
[008h 0008   1]                     Revision : 03
[009h 0009   1]                     Checksum : 0E
[00Ah 0010   6]                       Oem ID : "HPE   "
[010h 0016   8]                 Oem Table ID : "ProLiant"
[018h 0024   4]                 Oem Revision : 00000001
[01Ch 0028   4]              Asl Compiler ID : "HP  "
[020h 0032   4]        Asl Compiler Revision : 00000001

[024h 0036   4]           Local Apic Address : 00000000
[028h 0040   4]        Flags (decoded below) : 00000001
                         PC-AT Compatibility : 1

[02Ch 0044   1]                Subtable Type : 0B [Generic Interrupt Controller]
[02Dh 0045   1]                       Length : 50
[02Eh 0046   2]                     Reserved : 0000
[030h 0048   4]         CPU Interface Number : 00000000
[034h 0052   4]                Processor UID : 00000000
[038h 0056   4]        Flags (decoded below) : 00000001
                           Processor Enabled : 1
          Performance Interrupt Trigger Mode : 0
          Virtual GIC Interrupt Trigger Mode : 0
[03Ch 0060   4]     Parking Protocol Version : 00000001
[040h 0064   4]        Performance Interrupt : 0000001C
[044h 0068   8]               Parked Address : 0000004FF7F00000
[04Ch 0076   8]                 Base Address : 00000000780AF000
[054h 0084   8]     Virtual GIC Base Address : 00000000780EF000
[05Ch 0092   8]  Hypervisor GIC Base Address : 00000000780CF000
[064h 0100   4]        Virtual GIC Interrupt : 00000019
[068h 0104   8]   Redistributor Base Address : 0000000078090000
[070h 0112   8]                    ARM MPIDR : 0000000000000000
[078h 0120   1]             Efficiency Class : 00
[079h 0121   3]                     Reserved : 000000

...
[2ACh 0684   1]                Subtable Type : 0C [Generic Interrupt Distributor]
[2ADh 0685   1]                       Length : 18
[2AEh 0686   2]                     Reserved : 0000
[2B0h 0688   4]        Local GIC Hardware ID : 00000000
[2B4h 0692   8]                 Base Address : 0000000078090000
[2BCh 0700   4]               Interrupt Base : 00000000
[2C0h 0704   1]                      Version : 02
[2C1h 0705   3]                     Reserved : 000000
Comment 23 Thomas Palmer 2016-07-06 23:49:04 EDT
Could you send Linda and me an email with all the quirks and ugly messages? How does the Mustang reference board fare with the same kernel?
Comment 24 Jeremy Linton 2016-07-07 12:00:42 EDT
The ugly messages were PCIe BAR assignment related, because I was running without the X-Gene PCIe quirks, which are in a state of flux. I don't currently have a clean set of patches to fix that against a mainline kernel. That said, the changes should be fairly straightforward.

Other than that and the 48bit VA issues (unrelated to xgene/m400) everything looks good.
Comment 25 Jon Masters 2016-07-28 04:26:11 EDT
Quick note that everything in comment 24 is about kernel stuff, not platform firmware.
Comment 26 Peter Robinson 2016-07-28 04:36:27 EDT
(In reply to Jeremy Linton from comment #24)
> The ugly messages, were PCIe bar assignment related because I was running
> without the xgene pcie quirks, which are in a state of flux. I don't
> currently have a clean set of patches to fix that against a mainline kernel.
> That said the changes should be fairly straightforward. 

The Fedora 4.7.0 GA kernel has all the patches needed, including the X-Gene quirks patches. Note that mustang/moonshot need new firmware to work with it (awaiting it from the vendor) to fix an issue with upstream ACPI and GICv2.

http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=391758
Comment 27 Thomas Palmer 2016-07-29 11:53:05 EDT
mcdivitt U02 ROM 6/25 and 7/15 both have the updated PcdGicInterruptInterfaceBase value.
Comment 28 Jon Masters 2016-08-09 07:50:27 EDT
Thanks Thomas - we will test this.
Comment 29 Jeff Bastian 2017-05-25 15:26:35 EDT
I tried to verify if this bug is fixed by installing the nightly compose Fedora-26-20170512.n.0 on an HP m400 (with firmware "U02 v1.10 (08/19/2016)") and it got into an endless loop very early in boot.  This WARNING and backtrace are just printed repeatedly forever:

[    0.000000] ------------[ cut here ]------------ 
[    0.000000] WARNING: CPU: 0 PID: 0 at ./include/linux/uaccess.h:15 __probe_kernel_read+0xc8/0xd0 
[    0.000000] Modules linked in: 
[    0.000000]  
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-1.fc26.aarch64 #1 
[    0.000000] Hardware name: (null) (DT) 
[    0.000000] task: ffff000008e13880 task.stack: ffff000008e00000 
[    0.000000] PC is at __probe_kernel_read+0xc8/0xd0 
[    0.000000] LR is at __probe_kernel_read+0x5c/0xd0 
[    0.000000] pc : [<ffff0000081ffb88>] lr : [<ffff0000081ffb1c>] pstate: 800000c5 
[    0.000000] sp : ffff000008e03e60 
[    0.000000] x29: ffff000008e03e60 x28: ffff000008e0b000  
[    0.000000] x27: 0000000000000000 x26: 0000000000000080  
[    0.000000] x25: 0000000000000280 x24: 0000000000000000  
[    0.000000] x23: ffff800fbd001600 x22: 0000000000000001  
[    0.000000] x21: ffff000008e03f0f x20: ffffffffffffffff  
[    0.000000] x19: ffff000008e13880 x18: ffff800fffeef580  
[    0.000000] x17: 0000000000000000 x16: 0000000000000000  
[    0.000000] x15: 000000007c274a1c x14: 0000000000000000  
[    0.000000] x13: 0000000000000000 x12: 0000000000000000  
[    0.000000] x11: 0000000000000000 x10: 0000000000000000  
[    0.000000] x9 : 0000000000000000 x8 : ffff800fbd00b800  
[    0.000000] x7 : 0000000000000000 x6 : ffff000008e03f10  
[    0.000000] x5 : ffff000008e03f10 x4 : 0000000000000000  
[    0.000000] x3 : 0000000000000064 x2 : 0000000000000001  
[    0.000000] x1 : 00000000ffffffff x0 : 0000000000000000  
[    0.000000]  
[    0.000000] ---[ end trace bec787061c40ac80 ]--- 
[    0.000000] Call trace: 
[    0.000000] Exception stack(0xffff000008e03c90 to 0xffff000008e03dc0) 
[    0.000000] 3c80:                                   ffff000008e13880 0001000000000000 
[    0.000000] 3ca0: ffff000008e03e60 ffff0000081ffb88 0000000000000001 00000000ffffffff 
[    0.000000] 3cc0: ffff000008e03ce0 014080c00826f440 0000000000000001 00000000ffffffff 
[    0.000000] 3ce0: ffff000008e03de0 ffff00000826f7f8 00000000000000c0 ffff800fbd003c80 
[    0.000000] 3d00: 00000000014080c0 00000000ffffffff ffff00000844d3e8 ffff800fffef4b60 
[    0.000000] 3d20: ffff800fbd003c80 ffff000008f7d000 0000000000000000 00000000ffffffff 
[    0.000000] 3d40: 0000000000000001 0000000000000064 0000000000000000 ffff000008e03f10 
[    0.000000] 3d60: ffff000008e03f10 0000000000000000 ffff800fbd00b800 0000000000000000 
[    0.000000] 3d80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    0.000000] 3da0: 0000000000000000 000000007c274a1c 0000000000000000 0000000000000000 
[    0.000000] [<ffff0000081ffb88>] __probe_kernel_read+0xc8/0xd0 
[    0.000000] [<ffff00000822a7b4>] kmem_cache_create+0x9c/0x2f0 
[    0.000000] [<ffff000008cc1410>] sched_init+0x1d8/0x518 
[    0.000000] [<ffff000008cb09ec>] start_kernel+0x210/0x3b0 
[    0.000000] [<ffff000008cb01e0>] __primary_switched+0x64/0x6c
Comment 30 Thomas Palmer 2017-05-25 15:32:00 EDT
Please retest with RHEL73, we do not officially support Fedora.
Comment 31 Laura Abbott 2017-05-25 15:59:21 EDT
That backtrace looks like https://bugzilla.redhat.com/show_bug.cgi?id=1447166 and https://bugzilla.redhat.com/show_bug.cgi?id=1448958. Please test with an image that has 4.11.0-2 or newer.
Comment 32 Jeff Bastian 2017-05-25 16:23:08 EDT
(In reply to Thomas Palmer from comment #30)
> Please retest with RHEL73, we do not officially support Fedora.

Right, and RHEL73 is working fine.  This bug was opened because (I believe) the Fedora Koji builders are running on HP m400 systems.


(In reply to Laura Abbott from comment #31)
> That backtrace looks like
> https://bugzilla.redhat.com/show_bug.cgi?id=1447166
> https://bugzilla.redhat.com/show_bug.cgi?id=1448958, please test with an
> image that has 4.11.0-2 or newer

Thanks Laura!  I'll give that a try and post an update soon.
Comment 33 Jeff Bastian 2017-05-25 17:17:17 EDT
I tested with Fedora-26-20170523.n.0, and kernel-4.11.0-2.fc26.aarch64 gets past the problems seen in comment 29, but it hangs later in boot.

EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.11.0-2.fc26.aarch64 (mockbuild@buildvm-aarch64-01.arm.fedoraproject.org) (gcc version 7.1.1 20170503 (Red Hat 7.1.1-1) (GCC) ) #1 SMP Tue May 9 15:09:58 UTC 2017
...
[    0.000000] ACPI: RSDP 0x0000004FF8000000 000024 (v02 HP    )
[    0.000000] ACPI: XSDT 0x0000004FF7FF0000 000084 (v01 HP     ProLiant 00000001      01000013)
[    0.000000] ACPI: FACP 0x0000004FF7FB0000 000114 (v06 HPE    ProLiant 00000001 HP   00000001)
...
[    0.000000] Kernel command line: BOOT_IMAGE=(tftp)/F26-20170523.n.0/vmlinuz vnc repo=http://10.0.0.1/F26-20170523.n.0/ console=ttyS0,9600 earlycon=uart,mmio32,0x1c021000 verbose debug
...
[   17.494298] xgene-gpio APMC0D14:00: X-Gene GPIO driver registered.
[   17.568313] pcieport 0000:00:00.0: can't derive routing for PCI INT A
[   17.645381] pcieport 0000:00:00.0: PCI INT A: no GSI
[   17.704898] pcie_pme: probe of 0000:00:00.0:pcie001 failed with error -22
[   17.786673] GHES: Failed to enable APEI firmware first mode.
[   17.854865] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[   17.954600] dw-apb-uart APMC0D08:00: cannot get irq


This is as far as it gets into booting.
Comment 34 Mark Salter 2017-05-26 10:54:48 EDT
This is the problem Jeremy Linton found:

"4.11 integrated patches for the MBIGEN interrupt controller, as part of this work additional checks were added in various places. One check in particular verifies that the interrupt producer/consumer direction is correct. This breaks the HP M400 serial console devices which makes the machine appear to fail to boot.
    
The correct fix for this is updated ACPI tables, in the interim lets patch the kernel so testing/qa/etc can continue."

The only workaround is to patch the kernel:

diff --git a/drivers/acpi/irq.c b/drivers/acpi/irq.c
index 830299a..fec7d2b 100644
--- a/drivers/acpi/irq.c
+++ b/drivers/acpi/irq.c
@@ -200,8 +200,6 @@ static acpi_status acpi_irq_parse_one_cb(struct acpi_resource *ares,
                return AE_CTRL_TERMINATE;
        case ACPI_RESOURCE_TYPE_EXTENDED_IRQ:
                eirq = &ares->data.extended_irq;
-               if (eirq->producer_consumer == ACPI_PRODUCER)
-                       return AE_OK;
                if (ctx->index >= eirq->interrupt_count) {
                        ctx->index -= eirq->interrupt_count;
                        return AE_OK;
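
The effect of the two removed lines can be illustrated with a small Python model. This is a sketch, not kernel code: `ExtendedIrq`, `find_irq`, and the interrupt number 108 are hypothetical, only the part of the callback visible in the hunk above is modelled, and it assumes the M400 tables mark the UART's Extended Interrupt descriptor as a resource producer, which the quoted description implies but this report does not show.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-ins for the ACPI_PRODUCER / ACPI_CONSUMER values used in the diff.
PRODUCER = "producer"
CONSUMER = "consumer"


@dataclass
class ExtendedIrq:
    """Hypothetical model of one Extended Interrupt resource descriptor."""
    producer_consumer: str
    interrupts: List[int]


def find_irq(resources: List[ExtendedIrq], index: int,
             skip_producers: bool) -> Optional[int]:
    """Return the index-th interrupt across all descriptors, or None.

    skip_producers=True models the check the patch removes: descriptors
    marked as producers are ignored when looking up a device's interrupt.
    """
    for res in resources:
        if skip_producers and res.producer_consumer == PRODUCER:
            continue
        if index >= len(res.interrupts):
            index -= len(res.interrupts)
            continue
        return res.interrupts[index]
    return None


# Hypothetical UART resource list; 108 is an arbitrary interrupt number.
uart = [ExtendedIrq(PRODUCER, [108])]

print(find_irq(uart, 0, skip_producers=True))   # None
print(find_irq(uart, 0, skip_producers=False))  # 108
```

In this model, the `None` result with the check enabled corresponds to a driver failing to obtain its interrupt, as in the `cannot get irq` message in comment 33.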
Comment 35 Jeff Bastian 2017-06-06 09:35:02 EDT
I've opened bug 1459186 for the patch in comment 34.  Thanks, Mark!
