Bug 674829 - apic: Physflat mode should be used when there are more than 8 CPUs on a system
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-03 13:41 UTC by Konstantin Khorenko
Modified: 2011-02-07 18:12 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-07 18:12:24 UTC
Target Upstream Version:



Description Konstantin Khorenko 2011-02-03 13:41:04 UTC
Andrey Vagin from the Parallels Linux Kernel team found an issue with incorrect APIC initialization on nodes with more than 8 CPUs when CONFIG_HOTPLUG_CPU is disabled.

Current RHEL5 kernel (2.6.18-238.1.1) contains the following code:
arch/x86_64/kernel/genapic.c
void __init clustered_apic_check(void)
{
...
#ifdef CONFIG_HOTPLUG_CPU
...
                genapic = &apic_physflat;
...
                if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
                        for (i = 0; i < NR_CPUS; i++) {
                                if (bios_cpu_apicid[i] == BAD_APICID)
                                        continue;
                                if (bios_cpu_apicid[i] > max_apic)
                                        max_apic = bios_cpu_apicid[i];
                        }

                        if (max_apic <= 8)
                                genapic = &apic_flat;
                }
#else
                genapic = &apic_flat;
#endif

This means that if the kernel is compiled without CONFIG_HOTPLUG_CPU, a node with more than 8 CPUs will try to use "apic_flat", which won't work: flat logical mode addresses CPUs through an 8-bit destination mask, so it can reach at most 8 CPUs.
This is a real issue: we experienced a 64-CPU node hanging on boot (the last line in the log was "Jan 25 14:28:35 test-server Brought up 64 CPUs").
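
To illustrate the 8-CPU limit of flat logical mode, here is a minimal sketch. It is not kernel code: flat_logical_id() and FLAT_MODE_MAX_CPUS are hypothetical names that only model the one-bit-per-CPU destination mask.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: in flat logical mode each CPU
 * owns one bit of an 8-bit logical destination mask. */
#define FLAT_MODE_MAX_CPUS 8

/* Return the one-hot logical ID for the given CPU, or 0 if the CPU
 * cannot be addressed in flat logical mode. */
static uint8_t flat_logical_id(unsigned int cpu)
{
        if (cpu >= FLAT_MODE_MAX_CPUS)
                return 0;       /* no bit left in the 8-bit mask */
        return (uint8_t)(1u << cpu);
}
```

In this model CPUs 0 to 7 each get a distinct bit, while CPU 8 and above (56 of the 64 CPUs on the node above) get no usable ID.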

The change was introduced by the following patch: linux-2.6-x86_64-unify-apic-mapping-code.patch

NOTE 1: The stock Red Hat 2.6.18-238.1.1 kernel is not affected because it is built with the CONFIG_HOTPLUG_CPU option enabled.

NOTE 2: In the current Red Hat code, apic_flat is used only on AMD nodes whose highest APIC ID is at most 8. Why must the node be AMD? Mainline code (2.6.37) does not distinguish AMD nodes from Intel ones in this case:
./arch/x86/kernel/apic/probe_64.c
/*
 * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
 */
void __init default_setup_apic_routing(void)
...
        if (apic == &apic_flat && num_possible_cpus() > 8)
                        apic = &apic_physflat;
...
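
As a standalone sketch of the vendor-independent rule in the mainline excerpt above (the enum and choose_apic_mode() are hypothetical stand-ins for the kernel's apic_flat / apic_physflat objects, not real kernel symbols):

```c
/* Hypothetical stand-ins for the kernel's apic_flat and apic_physflat. */
enum apic_mode { APIC_MODE_FLAT, APIC_MODE_PHYSFLAT };

/* Keep the probed mode unless it is flat and the system may have more
 * than 8 CPUs, in which case fall back to physical flat. The CPU
 * vendor plays no role in the decision. */
static enum apic_mode choose_apic_mode(enum apic_mode probed,
                                       unsigned int possible_cpus)
{
        if (probed == APIC_MODE_FLAT && possible_cpus > 8)
                return APIC_MODE_PHYSFLAT;
        return probed;
}
```

Under this rule a node with up to 8 possible CPUs keeps flat mode, and the 64-CPU node from the report would be switched to physical flat regardless of CONFIG_HOTPLUG_CPU.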

I wonder, do you have any issues with Intel nodes with fewer than 8 CPUs using apic_flat?

Thank you.

Comment 1 Prarit Bhargava 2011-02-04 20:51:54 UTC
> NOTE 1: The stock Red Hat 2.6.18-238.1.1 kernel is not affected because it is
> built with the CONFIG_HOTPLUG_CPU option enabled.
> 

Right -- we have not tested the !CONFIG_HOTPLUG_CPU case.

> NOTE 2: In the current Red Hat code, apic_flat is used only on AMD nodes
> whose highest APIC ID is at most 8. Why must the node be AMD? Mainline code
> (2.6.37) does not distinguish AMD nodes from Intel ones in this case:
> ./arch/x86/kernel/apic/probe_64.c
> /*
>  * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
>  */
> void __init default_setup_apic_routing(void)
> ...
>         if (apic == &apic_flat && num_possible_cpus() > 8)
>                         apic = &apic_physflat;
> ...
> 
> I wonder, do you have any issues with Intel nodes with fewer than 8 CPUs using
> apic_flat?

None that I know of.

P.

> 
> Thank you.

Comment 2 Konstantin Khorenko 2011-02-07 08:55:04 UTC
> > I wonder, do you have any issues with Intel nodes with fewer than 8 CPUs using
> > apic_flat?
> 
> None that I know of.

Thank you for handling this. Since there are no known issues with Intel nodes with fewer than 8 CPUs using apic_flat, maybe it would be better to let Intel nodes also benefit from apic_flat performance, as mainline does nowadays?
Are you going to make this change?

Comment 3 Prarit Bhargava 2011-02-07 18:12:24 UTC
(In reply to comment #2)
> > > I wonder, do you have any issues with Intel nodes with fewer than 8 CPUs using
> > > apic_flat?
> > 
> > None that I know of.
> 
> Thank you for handling this. Since there are no known issues with Intel nodes
> with fewer than 8 CPUs using apic_flat, maybe it would be better to let Intel
> nodes also benefit from apic_flat performance, as mainline does nowadays?
> Are you going to make this change?

It is unlikely that we will change to apic_flat unless we see a significant bug against physical flat. Most of our customers run with physical flat and have been doing so for some time without issue.

P.

