Bug 167333

Summary: LTC18148- FC4 installer kernel needs APIC support to install on IBM x445
Product: [Fedora] Fedora Reporter: Darrick Wong <darrick>
Component: kernel Assignee: Dave Jones <davej>
Status: CLOSED RAWHIDE QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhide CC: bugproxy, lcm, pfrields, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-09-24 02:00:31 UTC Type: ---
Bug Depends On:    
Bug Blocks: 171661    
Attachments:
Description | Flags
Patch to 2.6.11-1.1369 UP kernel that keeps APICs off by default. | none
Patch to 2.6.13-1.1624 UP kernel that keeps APICs off by default. | none
Patch to 2.6.13-1.1626 UP kernel that keeps APICs off by default. | none
disable i386 uniproc APICs by default | none
disable i386 uniproc APIC by default | none
the same, but with actual x86-64 build fixes | none

Description Darrick Wong 2005-09-01 17:22:59 UTC

Description of problem:
Hi there,

I tried to install FC4 on a two-chassis IBM x445 system, and the
installer fails when hardware detection starts and the system tries to
load modules for any PCI devices in the second chassis.  It appears that
interrupts from devices in the second chassis are not routed to
the boot CPU (in the first chassis) unless APIC support is enabled in
the kernel.  Apparently the same issue affects RHEL4, and a patch has
been written for it that forces APICs off by default.  With that patch,
CONFIG_X86_UP_APIC can be turned on in the UP kernel build while the
APICs stay forced off by default; they are enabled only if the user
passes "lapic apic" on the kernel command line.  That gives us APIC
support on the hardware that needs it, and leaving the APICs off by
default sidesteps the problem of broken APICs on other UP systems.
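
The off-by-default behaviour described above can be illustrated with a small userspace sketch. This is not the attached patch: the struct, the function names, and the split of "lapic" (local APIC) versus "apic" (I/O APIC) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical state for this sketch; not the kernel's real variables. */
struct apic_policy {
    bool local_apic; /* requested with "lapic" */
    bool io_apic;    /* requested with "apic" (assumed meaning) */
};

/* Return true if `word` appears in `cmdline` as a whole space-separated token. */
static bool has_boot_option(const char *cmdline, const char *word)
{
    size_t len = strlen(word);
    const char *p = cmdline;

    while ((p = strstr(p, word)) != NULL) {
        bool starts = (p == cmdline) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');

        if (starts && ends)
            return true;
        p += 1; /* keep scanning past this partial match */
    }
    return false;
}

/* Both APICs start out off and are only turned on by an explicit option. */
static struct apic_policy parse_apic_policy(const char *cmdline)
{
    struct apic_policy policy = { false, false };

    assert(cmdline != NULL);
    policy.local_apic = has_boot_option(cmdline, "lapic");
    policy.io_apic = has_boot_option(cmdline, "apic");
    return policy;
}
```

Matching whole tokens matters here because "apic" is a substring of both "lapic" and "noapic"; a plain substring search would turn the I/O APIC on by accident.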

I took the UP kernel (2.6.11-1.1369), applied that RHEL4 patch to it,
and built a custom installer image with the patched kernel.  With this
kernel, the system boots correctly and I was able to install without
problems.  The attached patch is against 2.6.11-1.1369.  Would it be
possible to have this patch included in FC5?  I'm told that this APIC
problem also affects multi-node x440 and x460 systems.

Note that the FC4 SMP kernels aren't affected because APIC support is
enabled in them, and mainline isn't affected either because APIC support
can be enabled or disabled there at will.

Version-Release number of selected component (if applicable):
2.6.11-1.1369

How reproducible:
Always

Steps to Reproduce:
1. Stick FC4 CD into x445.
2. Boot installer per instructions.
3. Wait for drivers to start loading...

Actual Results:  Drivers erupt in a blizzard of complaints about device timeouts, interrupts that should have happened, etc.  But only if the devices are on the second chassis or in an RXE100 rack.

Expected Results:  Drivers find devices, register them, and the install continues.

Additional info:

Patch fixing this problem will be attached shortly.

Comment 1 Darrick Wong 2005-09-01 17:26:08 UTC
Created attachment 118357 [details]
Patch to 2.6.11-1.1369 UP kernel that keeps APICs off by default.

Apply this patch and then turn on CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC.
The APIC code should remain dormant unless 'lapic apic' is specified on the
kernel command line.

Comment 2 Dave Jones 2005-09-24 02:00:31 UTC
I've applied it to rawhide CVS.  Can you try and get this upstream please?
It'd also be good to have DMI entries force the relevant boot flags on, if
necessary, on affected systems.
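
A rough sketch of what such a DMI-based quirk might look like, written as plain userspace C rather than against the kernel's DMI interface. The match strings are placeholders taken from the model names in this report; the real DMI product strings for these machines are not given here, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Placeholder substrings based on the models named in this report.
 * The real DMI product strings are NOT known here. */
static const char *const apic_required_models[] = { "x440", "x445", "x460" };

/* Return true if the (hypothetical) DMI product name matches a listed model. */
static bool dmi_wants_apic(const char *product_name)
{
    size_t count = sizeof(apic_required_models) / sizeof(apic_required_models[0]);
    size_t i;

    assert(product_name != NULL);
    for (i = 0; i < count; i++) {
        if (strstr(product_name, apic_required_models[i]) != NULL)
            return true;
    }
    return false;
}

/* Either an explicit user request or a matching quirk turns the APICs on. */
static bool apic_enabled(bool user_passed_lapic_apic, const char *product_name)
{
    return user_passed_lapic_apic || dmi_wants_apic(product_name);
}
```

With a table like this, machines that are known to need APICs would get them without the user having to type "lapic apic", while every other UP system keeps the off-by-default behaviour.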


Comment 3 Darrick Wong 2005-10-25 01:03:04 UTC
Created attachment 120333 [details]
Patch to 2.6.13-1.1624 UP kernel that keeps APICs off by default.

Respin of the patch, this time without the scary BIOS bug message if
CONFIG_X86_UP_APIC_DEFAULT_OFF=y.

Comment 4 Dave Jones 2005-10-25 02:24:16 UTC
I'll merge this patch into tomorrow's build, but I think there's an additional
case.  Look at bug number 171661, and you'll see another case where we're
panicking in the lapic init code when we don't pass in 'apic' (it works just
fine with it).

Comment 5 Darrick Wong 2005-10-26 23:18:35 UTC
Created attachment 120440 [details]
Patch to 2.6.13-1.1626 UP kernel that keeps APICs off by default.

Here's a second respin.  Now, we always jump out of APIC_init_uniprocessor if
enable_local_apic == -2, regardless of the boot cpu feature flags.  This
_should_ keep the local APIC off _except_ when expressly asked for via 'lapic'.
 Before, we were toggling the boot cpu feature flags, which wasn't reliably
keeping the lapic off.
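
The early exit described in this respin can be sketched roughly as follows. Only the value -2 and the names enable_local_apic and APIC_init_uniprocessor come from the comment above; the value 1 for "requested via lapic", the helper names, and the return codes are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define APIC_DEFAULT_OFF  (-2) /* value named in the comment above */
#define APIC_USER_ENABLED 1    /* assumed value once "lapic" is seen */

static int enable_local_apic = APIC_DEFAULT_OFF;

/* Stand-in for the boot-time handler of the "lapic" option. */
static void handle_lapic_option(void)
{
    enable_local_apic = APIC_USER_ENABLED;
}

/* Returns 0 if the local APIC would be set up, -1 if setup is skipped. */
static int apic_init_uniprocessor_sketch(bool cpu_has_apic)
{
    assert(enable_local_apic == APIC_DEFAULT_OFF ||
           enable_local_apic == APIC_USER_ENABLED);

    /* Checked first, so the CPU feature flag cannot override it. */
    if (enable_local_apic == APIC_DEFAULT_OFF)
        return -1;
    if (!cpu_has_apic)
        return -1;
    return 0;
}
```

Putting the default-off test ahead of the feature-flag test is the point of the respin: the decision no longer depends on toggling the boot CPU's feature flags.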

Comment 6 Darrick Wong 2005-11-23 02:26:27 UTC
Created attachment 121381 [details]
disable i386 uniproc APICs by default

Ok, here's the latest apic-off-by-default patch, which should apply against
2.6.14-1.1707_FC5.

This patch (v6) adds two things over v4:

1. The v6 patch makes it so that the ACPI MADT is not parsed except when 'lapic
apic' are passed.  Disabling APIC_init_uniprocessor is insufficient, because
the \_PIC method in ACPI needs to be told which interrupt model (PIC, APIC,
etc.) we're using.  The acpi_process_madt function has a side effect of
setting "acpi_irq_model = ACPI_IRQ_MODEL_IOAPIC"; this acpi_irq_model variable
is eventually passed to \_PIC, which means that the BIOS thinks we're using
APICs when we're not.  This is probably why Mr. Tweedie's machine gets
confused.
And yes, Dave, you were correct to suggest poking through the ACPI code to make
sure that there weren't any side effects.  :) 

2. It _also_ turns out that the get_smp_config function plays a role in
locating the local and IO APICs; if ACPI doesn't supply an MADT (see #1 above),
then this method will poke through the MP table as a backup and try to set
things up--precisely what we don't want.  Since we're assuming a uniprocessor,
APIC-less machine in this mode, we don't need MP configuration and can skip
that step.
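
The two points above can be modelled with a small userspace sketch. The enum, struct, and function names are stand-ins for illustration and are not the kernel's definitions; the sketch only shows that skipping both discovery paths leaves the interrupt model at PIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq_model { IRQ_MODEL_PIC, IRQ_MODEL_IOAPIC };

struct boot_state {
    enum irq_model acpi_irq_model; /* what the ACPI _PIC method would be told */
    bool mp_table_scanned;         /* whether the MP table fallback ran */
};

/* Stand-in for MADT parsing, including its side effect on the IRQ model. */
static void process_madt_sketch(struct boot_state *state)
{
    assert(state != NULL);
    state->acpi_irq_model = IRQ_MODEL_IOAPIC;
}

/* Stand-in for the MP table fallback used when no MADT is available. */
static void get_smp_config_sketch(struct boot_state *state)
{
    assert(state != NULL);
    state->mp_table_scanned = true;
}

static struct boot_state discover_interrupt_setup(bool apic_requested,
                                                  bool madt_present)
{
    struct boot_state state = { IRQ_MODEL_PIC, false };

    /* Off by default: skip both the MADT and the MP table. */
    if (!apic_requested)
        return state;

    if (madt_present)
        process_madt_sketch(&state);
    else
        get_smp_config_sketch(&state);
    return state;
}
```

In this model the firmware is only ever told "IOAPIC" when the user asked for APICs, which is the mismatch the v6 patch is meant to remove.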

I've tested this on an x226, a single-chassis x445 and a two-chassis x445
without problems, and I'm hoping that it resolves at least a few
problems.  Unfortunately, I've not been having any problems on our
hardware, so debugging is a bit ... difficult.

This patch is intended as a drop-in replacement of the one that's in the
rawhide kernel right now.

Comment 7 Darrick Wong 2005-11-23 09:34:05 UTC
Created attachment 121388 [details]
disable i386 uniproc APIC by default

Rev. 8 of the patch, wherein enable_local_apic is now behind an #ifdef
CONFIG_X86_LOCAL_APIC guard, which fixes the x86_64 SMP build.
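
A minimal sketch of the kind of guard described here, assuming the variable simply does not exist when local APIC support is compiled out. The helper name and the fallback return value are hypothetical; only enable_local_apic, the value -2, and CONFIG_X86_LOCAL_APIC come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Uncomment to model a build with local APIC support compiled in. */
/* #define CONFIG_X86_LOCAL_APIC 1 */

#ifdef CONFIG_X86_LOCAL_APIC
static int enable_local_apic = -2; /* -2: forced off by default */
#endif

/* Hypothetical helper: callers never touch enable_local_apic directly,
 * so configurations without the variable still compile. */
static bool local_apic_forced_off(void)
{
#ifdef CONFIG_X86_LOCAL_APIC
    assert(enable_local_apic == -2 || enable_local_apic == 1);
    return enable_local_apic == -2;
#else
    return false; /* no local APIC code in this build, nothing to force off */
#endif
}
```

Keeping every reference to the variable inside the same guard is what avoids an undefined-symbol error in configurations that never define it.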

Comment 8 Darrick Wong 2005-11-23 21:08:25 UTC
Created attachment 121415 [details]
the same, but with actual x86-64 build fixes

v9 = v8 + actually fix x86-64 build.

Comment 9 Dave Jones 2005-12-01 09:42:43 UTC
v9 is now merged in the current kernels available in rawhide today.