Bug 674829

Summary: apic: Physflat mode should be used when there are more than 8 CPUs on a system
Product: Red Hat Enterprise Linux 5 Reporter: Konstantin Khorenko <khorenko>
Component: kernel Assignee: Prarit Bhargava <prarit>
Status: CLOSED WONTFIX QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: low Docs Contact:
Priority: unspecified    
Version: 5.6 CC: jarod
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-07 18:12:24 UTC Type: ---

Description Konstantin Khorenko 2011-02-03 13:41:04 UTC
Andrey Vagin from the Parallels Linux kernel team found an issue with incorrect APIC initialization on nodes with more than 8 CPUs when CONFIG_HOTPLUG_CPU is disabled.

Current RHEL5 kernel (2.6.18-238.1.1) contains the following code:
arch/x86_64/kernel/genapic.c
void __init clustered_apic_check(void)
{
...
#ifdef CONFIG_HOTPLUG_CPU
...
                genapic = &apic_physflat;
...
                if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
                        for (i = 0; i < NR_CPUS; i++) {
                                if (bios_cpu_apicid[i] == BAD_APICID)
                                        continue;
                                if (bios_cpu_apicid[i] > max_apic)
                                        max_apic = bios_cpu_apicid[i];
                        }

                        if (max_apic <= 8)
                                genapic = &apic_flat;
                }
#else
                genapic = &apic_flat;
#endif

This means that if the kernel is compiled without CONFIG_HOTPLUG_CPU, a node with more than 8 CPUs will try to use "apic_flat", which cannot work: flat logical mode addresses CPUs through an 8-bit destination bitmask, so it can reach at most 8 of them.
This is a real issue: we saw a 64-CPU node hang on boot (the last line in the log was "Jan 25 14:28:35 test-server Brought up 64 CPUs").
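
To illustrate why flat mode breaks down past 8 CPUs: in flat logical mode each CPU is identified by a single bit of an 8-bit logical destination, so the ninth CPU onward has no bit to claim. The sketch below is only a userspace model of that limit, not kernel code; the names FLAT_MODE_MAX_CPUS, flat_logical_id and flat_unaddressable_cpus are hypothetical and do not appear in the RHEL5 source.

```c
#include <assert.h>
#include <stdint.h>

/* Flat logical mode: 8-bit destination bitmask, one bit per CPU. */
#define FLAT_MODE_MAX_CPUS 8u

/*
 * Hypothetical helper: logical ID a CPU would get in flat mode.
 * Returns 0 when no bit is left, i.e. the CPU cannot be addressed.
 */
uint8_t flat_logical_id(unsigned int cpu)
{
        if (cpu >= FLAT_MODE_MAX_CPUS)
                return 0;
        return (uint8_t)(1u << cpu);
}

/* Hypothetical helper: count CPUs in [0, ncpus) that flat mode cannot reach. */
unsigned int flat_unaddressable_cpus(unsigned int ncpus)
{
        unsigned int cpu, lost = 0;

        for (cpu = 0; cpu < ncpus; cpu++)
                if (flat_logical_id(cpu) == 0)
                        lost++;

        /* Sanity check: exactly the CPUs beyond the first 8 are lost. */
        assert(lost == (ncpus > FLAT_MODE_MAX_CPUS ?
                        ncpus - FLAT_MODE_MAX_CPUS : 0));
        return lost;
}
```

In this model, flat_unaddressable_cpus(64) reports 56 CPUs without a usable logical ID on a 64-CPU node like the one that hung.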

The change was introduced by the following patch: linux-2.6-x86_64-unify-apic-mapping-code.patch

NOTE 1: the stock Red Hat 2.6.18-238.1.1 kernel is not affected because it is built with the CONFIG_HOTPLUG_CPU option enabled.

NOTE 2: In the current Red Hat code, apic_flat is used only on AMD nodes with 8 or fewer CPUs. Why must the node be AMD? Mainline code (2.6.37) does not distinguish AMD nodes from Intel ones in this case:
./arch/x86/kernel/apic/probe_64.c
/*
 * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
 */
void __init default_setup_apic_routing(void)
...
        if (apic == &apic_flat && num_possible_cpus() > 8)
                        apic = &apic_physflat;
...
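
As a rough side-by-side of the two excerpts quoted in this report, the sketch below models each selection rule as a plain function. It is an illustration under stated assumptions, not the kernel's implementation: the names apic_mode, rhel5_choice and mainline_choice are hypothetical, the elided ("...") lines of both excerpts are ignored, and the mainline model assumes apic_flat was the starting choice before the quoted check runs.

```c
#include <assert.h>
#include <stdbool.h>

enum apic_mode { APIC_FLAT, APIC_PHYSFLAT };

/*
 * Model of the RHEL5 excerpt (hypothetical function, elided lines ignored).
 * Without CONFIG_HOTPLUG_CPU the #else branch picks flat mode unconditionally.
 */
enum apic_mode rhel5_choice(bool hotplug_cpu, bool vendor_amd,
                            unsigned int max_apic)
{
        if (!hotplug_cpu)
                return APIC_FLAT;       /* chosen regardless of CPU count */
        if (vendor_amd && max_apic <= 8)
                return APIC_FLAT;       /* small AMD systems only */
        return APIC_PHYSFLAT;
}

/*
 * Model of the mainline 2.6.37 excerpt, assuming apic_flat was selected
 * before the quoted check: flat is swapped for physflat above 8 CPUs,
 * with no vendor test.
 */
enum apic_mode mainline_choice(unsigned int num_possible_cpus)
{
        assert(num_possible_cpus > 0);
        return num_possible_cpus > 8 ? APIC_PHYSFLAT : APIC_FLAT;
}
```

Under these assumptions, rhel5_choice(false, ...) returns APIC_FLAT even for a 64-CPU node, which is the failing case described above, while mainline_choice(64) returns APIC_PHYSFLAT.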

I wonder, have you seen any issues with Intel nodes with 8 or fewer CPUs using apic_flat?

Thank you.

Comment 1 Prarit Bhargava 2011-02-04 20:51:54 UTC
> NOTE 1: the stock Red Hat 2.6.18-238.1.1 kernel is not affected because it is
> built with the CONFIG_HOTPLUG_CPU option enabled.
> 

Right -- we have not tested the !CONFIG_HOTPLUG_CPU case.

> NOTE 2: In the current Red Hat code, apic_flat is used only on AMD nodes
> with 8 or fewer CPUs. Why must the node be AMD? Mainline code (2.6.37) does
> not distinguish AMD nodes from Intel ones in this case:
> ./arch/x86/kernel/apic/probe_64.c
> /*
>  * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
>  */
> void __init default_setup_apic_routing(void)
> ...
>         if (apic == &apic_flat && num_possible_cpus() > 8)
>                         apic = &apic_physflat;
> ...
> 
> I wonder, have you seen any issues with Intel nodes with 8 or fewer CPUs
> using apic_flat?

None that I know of.

P.

> 
> Thank you.

Comment 2 Konstantin Khorenko 2011-02-07 08:55:04 UTC
> > I wonder, have you seen any issues with Intel nodes with 8 or fewer CPUs
> > using apic_flat?
> 
> None that I know of.

Thank you for handling this. As there are no issues with Intel nodes with 8 or fewer CPUs using apic_flat, maybe it would be better to let Intel nodes also benefit from apic_flat performance, as is done in mainline nowadays?
Are you going to make this change?

Comment 3 Prarit Bhargava 2011-02-07 18:12:24 UTC
(In reply to comment #2)
> > > I wonder, have you seen any issues with Intel nodes with 8 or fewer CPUs
> > > using apic_flat?
> > 
> > None that I know of.
> 
> Thank you for handling this. As there are no issues with Intel nodes with
> 8 or fewer CPUs using apic_flat, maybe it would be better to let Intel nodes
> also benefit from apic_flat performance, as is done in mainline nowadays?
> Are you going to make this change?

It is unlikely that we will change to apic_flat unless we see a significant bug against physical flat.  Most of our customers run with apic flat and have been doing so for some time without issue.

P.