Bug 167333 - LTC18148- FC4 installer kernel needs APIC support to install on IBM x445
Summary: LTC18148- FC4 installer kernel needs APIC support to install on IBM x445
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 171661
Reported: 2005-09-01 17:22 UTC by Darrick Wong
Modified: 2015-01-04 22:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-09-24 02:00:31 UTC
Type: ---
Embargoed:


Attachments
Patch to 2.6.11-1.1369 UP kernel that keeps APICs off by default. (3.61 KB, patch)
2005-09-01 17:26 UTC, Darrick Wong
Patch to 2.6.13-1.1624 UP kernel that keeps APICs off by default. (4.02 KB, patch)
2005-10-25 01:03 UTC, Darrick Wong
Patch to 2.6.13-1.1626 UP kernel that keeps APICs off by default. (3.89 KB, patch)
2005-10-26 23:18 UTC, Darrick Wong
disable i386 uniproc APICs by default (5.40 KB, patch)
2005-11-23 02:26 UTC, Darrick Wong
disable i386 uniproc APIC by default (5.43 KB, patch)
2005-11-23 09:34 UTC, Darrick Wong
the same, but with actual x86-64 build fixes (5.46 KB, patch)
2005-11-23 21:08 UTC, Darrick Wong

Description Darrick Wong 2005-09-01 17:22:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050825 Firefox/1.0.6 (Ubuntu package 1.0.6)

Description of problem:
Hi there,

I tried to install FC4 on a two-chassis IBM x445 system, and the
installer fails when hardware detection starts and the system tries to
load modules for any PCI devices in the second chassis.  It appears that
interrupts from any device in the second chassis do not get routed to
the boot CPU (in the first chassis) unless APIC support is enabled in
the kernel.  Apparently, this same issue affects RHEL4, and a patch has
been written for it that forces APICs off by default.  With this patch,
CONFIG_X86_UP_APIC can be turned on in the UP kernel build while the
APICs stay off by default; if the user passes "lapic apic" on the kernel
command line, the APICs are enabled.  This gives us APIC support for the
hardware that needs it, and leaving the APICs off by default sidesteps
the problem of broken APICs on other UP systems.
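
The default-off policy described above can be modeled in a few lines of userspace C. This is only a sketch of the intended behavior, not code from the attached patch: the struct and function names are hypothetical, and the real kernel parses its command line differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the default-off policy; not taken from the patch. */
struct apic_policy {
    bool local_apic; /* enabled only by the "lapic" token */
    bool io_apic;    /* enabled only by the "apic" token */
};

/* Return true if `word` appears as a whole space-separated token. */
static bool has_token(const char *cmdline, const char *word)
{
    size_t len = strlen(word);
    const char *p = cmdline;

    while ((p = strstr(p, word)) != NULL) {
        bool starts = (p == cmdline) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');

        if (starts && ends)
            return true;
        p += 1; /* keep searching after a partial match such as "lapic" */
    }
    return false;
}

/* Both APICs start out off and are switched on only when asked for. */
static struct apic_policy parse_apic_policy(const char *cmdline)
{
    struct apic_policy policy;

    assert(cmdline != NULL);
    policy.local_apic = has_token(cmdline, "lapic");
    policy.io_apic = has_token(cmdline, "apic");
    return policy;
}

/* Usage: a plain command line leaves the APICs off; "lapic apic" enables them. */
static void check_apic_policy(void)
{
    struct apic_policy def = parse_apic_policy("ro root=/dev/hda1 quiet");
    struct apic_policy on = parse_apic_policy("ro root=/dev/hda1 lapic apic");

    assert(!def.local_apic && !def.io_apic);
    assert(on.local_apic && on.io_apic);
}
```

The point of the design is that an APIC-capable UP kernel behaves like an APIC-less one unless the user opts in, so machines with broken APICs see no change.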

I took the UP kernel (2.6.11-1.1369), applied that RHEL4 patch to it,
and built a custom installer image with the patched kernel.  With this
kernel, the system boots correctly and the install completes fine.
The attached patch is against 2.6.11-1.1369.  Would it be possible to
have this patch included in FC5?  I'm told that this APIC problem also
affects multi-node x440 and x460 systems.

Note that the FC4 SMP kernels aren't affected by this because APIC
support is enabled in them, and mainline isn't affected either because
APIC support can be enabled or disabled there at will.

Version-Release number of selected component (if applicable):
2.6.11-1.1369

How reproducible:
Always

Steps to Reproduce:
1. Stick FC4 CD into x445.
2. Boot installer per instructions.
3. Wait for drivers to start loading...

Actual Results:  Drivers erupt in a blizzard of complaints about device timeouts, interrupts that should have happened, etc.  But only if the devices are on the second chassis or in an RXE100 rack.

Expected Results:  Drivers find devices, register them, and the install continues.

Additional info:

Patch fixing this problem will be attached shortly.

Comment 1 Darrick Wong 2005-09-01 17:26:08 UTC
Created attachment 118357
Patch to 2.6.11-1.1369 UP kernel that keeps APICs off by default.

Apply this patch and then turn on CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC.
The APIC code should remain dormant unless 'lapic apic' is specified on the
kernel command line.

Comment 2 Dave Jones 2005-09-24 02:00:31 UTC
I've applied it to rawhide CVS.  Can you try and get this upstream please?
It'd also be good to have dmi entries force the relevant boot flags on if
necessary on affected systems.


Comment 3 Darrick Wong 2005-10-25 01:03:04 UTC
Created attachment 120333
Patch to 2.6.13-1.1624 UP kernel that keeps APICs off by default.

Respin of the patch, this time without the scary BIOS bug message if
CONFIG_X86_UP_APIC_DEFAULT_OFF=y.

Comment 4 Dave Jones 2005-10-25 02:24:16 UTC
I'll merge this patch into tomorrow's build, but I think there's also an
additional case.  Look at bug number 171661, and you'll see another case where
we're panicking in the lapic init code when we don't pass in 'apic' (it works
just fine with it).

Comment 5 Darrick Wong 2005-10-26 23:18:35 UTC
Created attachment 120440
Patch to 2.6.13-1.1626 UP kernel that keeps APICs off by default.

Here's a second respin.  Now, we always jump out of APIC_init_uniprocessor if
enable_local_apic == -2, regardless of the boot cpu feature flags.  This
_should_ keep the local APIC off _except_ when expressly asked for via 'lapic'.
Before, we were toggling the boot cpu feature flags, which wasn't reliably
keeping the lapic off.
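
A rough userspace model of that early exit is sketched below. Only the value -2 and the names APIC_init_uniprocessor and enable_local_apic come from the comment above; the helper name, the LAPIC_REQUESTED value and the return codes are assumptions made for illustration, and the real function does far more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* -2 is the value named in the comment above; LAPIC_REQUESTED is an
 * assumed placeholder for "the user passed 'lapic'". */
#define LAPIC_DEFAULT_OFF (-2)
#define LAPIC_REQUESTED   1

/*
 * Toy stand-in for the uniprocessor APIC init path.
 * Returns 0 when the local APIC would be set up, -1 when it is skipped.
 */
static int apic_init_up_sketch(int enable_local_apic, bool cpu_has_apic)
{
    /* Policy check comes first, before any CPU feature flag is consulted. */
    if (enable_local_apic == LAPIC_DEFAULT_OFF)
        return -1;

    /* Only after the policy check does the hardware capability matter. */
    if (!cpu_has_apic)
        return -1;

    return 0;
}

/* Usage: default-off wins even when the CPU advertises an APIC. */
static void check_apic_init_up(void)
{
    assert(apic_init_up_sketch(LAPIC_DEFAULT_OFF, true) == -1);
    assert(apic_init_up_sketch(LAPIC_DEFAULT_OFF, false) == -1);
    assert(apic_init_up_sketch(LAPIC_REQUESTED, true) == 0);
}
```

Checking the policy variable before the feature flags means the result no longer depends on whether the flags were toggled reliably elsewhere.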

Comment 6 Darrick Wong 2005-11-23 02:26:27 UTC
Created attachment 121381
disable i386 uniproc APICs by default

Ok, here's the latest apic-off-by-default patch, which should apply against
2.6.14-1.1707_FC5.

This patch (v6) adds two things over v4:

1. The v6 patch makes it so that the ACPI MADT is not parsed except when 'lapic
apic' are passed.  Disabling APIC_init_uniprocessor is insufficient, because
the \_PIC method in ACPI needs to be notified about which interrupt model (PIC,
APIC, etc.) we're using.  The acpi_process_madt function has a side effect of
setting "acpi_irq_model = ACPI_IRQ_MODEL_IOAPIC"; this acpi_irq_model variable
is eventually passed to \_PIC, which means that the BIOS thinks we're using
APICs when we're not.  This is probably why Mr. Tweedie's machine gets
confused.
And yes, Dave, you were correct to suggest poking through the ACPI code to make
sure that there weren't any side effects.  :)

2. It _also_ turns out that the get_smp_config function plays a role in
locating the local and IO APICs; if ACPI doesn't supply an MADT (see #1 above),
then this method will poke through the MP table as a backup and try to set
things up--precisely what we don't want.  Since we're assuming a uniprocessor,
APIC-less machine in this mode, we don't need MP configuration and can skip
that step.
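
The two points above can be pictured with a small userspace model. It is a sketch under stated assumptions, not the patch itself: the struct, the enum and boot_sketch are hypothetical names, and the real acpi_process_madt and get_smp_config do much more than flip these fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the interrupt model later reported to \_PIC. */
enum irq_model { IRQ_MODEL_PIC, IRQ_MODEL_IOAPIC };

/* Hypothetical record of what the boot path did. */
struct boot_state {
    enum irq_model acpi_irq_model;
    bool madt_parsed;      /* models the MADT parsing step */
    bool mp_table_scanned; /* models the MP table fallback */
};

static struct boot_state boot_sketch(bool apic_requested, bool bios_has_madt)
{
    struct boot_state s = { IRQ_MODEL_PIC, false, false };

    /* Default-off: skip both the MADT and the MP table entirely. */
    if (!apic_requested)
        return s;

    if (bios_has_madt) {
        s.madt_parsed = true;
        /* The side effect described in point 1. */
        s.acpi_irq_model = IRQ_MODEL_IOAPIC;
    } else {
        /* The backup path described in point 2. */
        s.mp_table_scanned = true;
    }
    return s;
}

/* Usage: without the boot options the BIOS is never told "IOAPIC". */
static void check_boot_sketch(void)
{
    struct boot_state off = boot_sketch(false, true);
    struct boot_state on = boot_sketch(true, true);

    assert(off.acpi_irq_model == IRQ_MODEL_PIC && !off.madt_parsed);
    assert(on.acpi_irq_model == IRQ_MODEL_IOAPIC && on.madt_parsed);
}
```

In this model the interrupt model handed to the BIOS always matches what the kernel actually uses, which is the inconsistency the v6 patch is meant to remove.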

I've tested this on a x226, a single-chassis x445 and a two-chassis x445
without problems, and I'm hoping that it resolves at least a few
problems.  Unfortunately, I've not been having any problems on our
hardware, so debugging is a bit ... difficult.

This patch is intended as a drop-in replacement of the one that's in the
rawhide kernel right now.

Comment 7 Darrick Wong 2005-11-23 09:34:05 UTC
Created attachment 121388
disable i386 uniproc APIC by default

Rev. 8 of the patch, wherein enable_local_apic is now behind a #ifdef
CONFIG_X86_LOCAL_APIC guard, which fixes the x86_64 SMP build.

Comment 8 Darrick Wong 2005-11-23 21:08:25 UTC
Created attachment 121415
the same, but with actual x86-64 build fixes

v9 = v8 + actually fix x86-64 build.

Comment 9 Dave Jones 2005-12-01 09:42:43 UTC
v9 now merged in current kernels available in rawhide today.


