Bug 52451 - structure misalignment problem in gcc 2.96-81
Status: CLOSED WONTFIX
Product: Red Hat Linux
Classification: Retired
Component: gcc
Version: 7.1
Hardware: i386 Linux
Priority: medium
Severity: high
Assigned To: Jakub Jelinek
David Lawrence
Reported: 2001-08-23 16:13 EDT by Paul Clements
Modified: 2007-04-18 12:36 EDT (History)
Last Closed: 2002-12-15 12:36:28 EST


Attachments
pre-processed source file, which demonstrates misalignment problem (613.34 KB, patch)
2001-08-23 16:37 EDT, Paul Clements
Description Paul Clements 2001-08-23 16:13:04 EDT
Description of Problem:

We have been having problems using kernel modules with kernels that were
compiled with a different version of gcc. Our kernel build machine (where
we compile our kernel modules) has gcc 2.91.66, which I believe is the
preferred kernel compiler according to Documentation/Changes. Red Hat 7.1
ships with gcc 2.96.

The problem is that Red Hat apparently also compiles its kernels (at least
the newer ones) with gcc 2.96, and gcc 2.96 appears to have a structure
misalignment problem.

One particular instance of this problem that we are running into is in
the raid1.o module in the 2.4.3 kernel. The structure alignment problem
causes our raid1 module compiled with gcc 2.91.66 to malfunction.
(raid1.o compiled from the same source with gcc 2.96 works fine.) We have
traced the problem to the following assembly code, generated by gcc 2.96
and gcc 2.91.66 respectively.

Assembly code for parameter setup and the call to __alloc_pages (within
raid1_grow_buffers):

2.96:

movl    $contig_page_data_Rsmp_cef82582+3800, %eax
call    __alloc_pages_Rsmp_decacc2f

2.91.66:

movl    $contig_page_data_Rsmp_cef82582+3884,%eax
call    __alloc_pages_Rsmp_decacc2f
 

gcc 2.91.66 pads the zone_t structure by 28 bytes. Three zone_t entries
precede the field in question, so the padding accounts for the 84-byte
difference between the two offsets above (3884 - 3800 = 84 = 3 * 28).
 
The 28 bytes of padding are there because gcc 2.91.66 aligns zone_t on a
32-byte boundary. It does so because the first member of zone_t, of type
per_cpu_t, is explicitly declared as 32-byte aligned.

So, gcc 2.91.66 is (properly) aligning the per_cpu_t structure on a 32
byte boundary as specified by the __attribute__((aligned(32))) directive
in that structure's definition:

(gdb) p &((pg_data_t *)0)->node_zones[1].cpu_pages[0]
$22 = (per_cpu_t *) 0x4e0

(gdb) p &((pg_data_t *)0)->node_zones[1].cpu_pages[1]
$23 = (per_cpu_t *) 0x500

(gdb) p 0x500 % 32
$24 = 0

(gdb) p 0x4e0 % 32
$25 = 0


gcc 2.96 is not properly aligning this structure:

(gdb) p &((pg_data_t *)0)->node_zones[1].cpu_pages[0]
$32 = (per_cpu_t *) 0x4c4

(gdb) p &((pg_data_t *)0)->node_zones[1].cpu_pages[1]
$33 = (per_cpu_t *) 0x4e4

(gdb) p 0x4c4 % 32
$34 = 4

(gdb) p 0x4e4 % 32
$35 = 4



So, in order for our raid1 modules to work properly with a kernel
compiled by gcc 2.96, we must also use (the broken) 2.96 to compile our
module.





Version-Release number of selected component (if applicable):

# rpm -q gcc
gcc-2.96-81

How Reproducible:

Compile the raid1.o kernel module with gcc 2.91.66 and attempt to run it
on the RH 2.4.3-12 kernel (compiled with gcc 2.96).

The module fails to work properly: data is not resynchronized when a
software array is created, as it is supposed to be.

(The root cause is a failure in the call to the __alloc_pages kernel
function, due to the structure misalignment.)

Comment 1 Jakub Jelinek 2001-08-23 16:20:00 EDT
Can you please attach the exact preprocessed source which shows this?
I've tried
typedef struct x { int a; int b; } __attribute__((aligned(32))) X;
typedef struct y { X x; int c; } Y;

Y y[3];
which roughly models what I can see in 2.4.7's mmzone.h, and y has the same
size and alignment with every 2.96-RH I've tried and with egcs 1.1.2.
Comment 2 Paul Clements 2001-08-23 16:37:25 EDT
Created attachment 29274
pre-processed source file, which demonstrates misalignment problem
Comment 3 Jakub Jelinek 2001-08-23 16:59:38 EDT
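Simplified testcase:
#include <stdlib.h>  /* abort, exit */

typedef struct x { int a; int b; } __attribute__((aligned(32))) X;
typedef struct y { X x[32]; int c; } Y;

Y y[3];

int main(void)
{
  /* Each Y must be padded to a multiple of 32 bytes: 3 * 1056 = 3168. */
  if (sizeof (y) != 3168)
    abort ();
  exit (0);
}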
(The array of X is what matters; changing the struct to
typedef struct y { X x; X y[31]; int c; } Y;
makes the problem go away.)
This testcase passes with gcc < 2.96-RH, and also with 2.96-RH and later
(incl. 3.0, 3.0.1, 3.1) on non-IA-32 architectures (tested alpha, IA-64, sparc).
It fails on IA-32 with 2.96-RH, 3.0, 3.0.1 and 3.1.
This is apparently some config/i386 alignment issue; I will debug it tomorrow.
Comment 4 Jakub Jelinek 2001-09-04 09:35:55 EDT
s/tomorrow/today/.
Analysis with a patch is at http://gcc.gnu.org/ml/gcc-patches/2001-09/msg00072.html
but I'm not sure I want to apply it, since that would mean binary incompatibility
between modules created with gcc < 2.96-98 and modules created with gcc >= 2.96-98,
which is worse than binary incompatibility with egcs 1.1.2.
Comment 5 Alan Cox 2002-12-15 12:36:28 EST
gcc 3.2 should have resolved all these issues
