Bug 164440 - linux-gate.so.1 has inconsistent .text padding and alignment
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
 
Reported: 2005-07-27 21:42 UTC by John Reiser
Modified: 2015-01-04 22:21 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-28 19:07:00 UTC
Type: ---
Embargoed:



Description John Reiser 2005-07-27 21:42:20 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6

Description of problem:
The padding and alignment of .text in linux-gate.so.1 are inconsistent. __kernel_vsyscall is padded with 0 bytes [which decode as "add %al,(%eax)"], __kernel_sigreturn is padded with 'nop' instructions, and __kernel_rt_sigreturn is followed almost immediately by non-instruction bytes.

For best performance, .text and non-.text should not be mixed in the same cache line, so __kernel_rt_sigreturn should be padded to 32 bytes instead of just 8. Also for best performance, any bytes that might be seen by the instruction prefetch and decode/translate hardware should reference only non-faulting addresses, so __kernel_vsyscall should not be padded with a string of 0 bytes, which decodes as "add %al,(%eax)". Potential instructions also should not trigger register relabeling, so the safest pad is 'nop'. Icache performance would tend to improve slightly if all three routines occupied a single cache line, with each routine starting on an 8-byte boundary.
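The packing proposed in this paragraph can be sketched in a few lines of Python. This is only an illustration, not kernel code: the instruction bytes are the standard i386 encodings of the instructions shown in the disassembly, the 32-byte cache line is the figure this report assumes, and the names pack, ROUTINES, ALIGN and CACHE_LINE are made up for the example.

```python
NOP = 0x90        # single-byte x86 'nop'
CACHE_LINE = 32   # line size assumed by the report
ALIGN = 8         # proposed start alignment for each routine

# Instruction bytes for the int $0x80 flavor of the three routines.
ROUTINES = [
    # int $0x80; ret
    ("__kernel_vsyscall", bytes([0xCD, 0x80, 0xC3])),
    # pop %eax; mov $0x77,%eax; int $0x80
    ("__kernel_sigreturn", bytes([0x58, 0xB8, 0x77, 0, 0, 0, 0xCD, 0x80])),
    # mov $0xad,%eax; int $0x80
    ("__kernel_rt_sigreturn", bytes([0xB8, 0xAD, 0, 0, 0, 0xCD, 0x80])),
]


def pack(routines, align=ALIGN, line=CACHE_LINE, pad=NOP):
    """Place each routine on an `align` boundary, filling every gap with `pad`,
    then pad the tail so no other data shares the last cache line."""
    image = bytearray()
    offsets = {}
    for name, code in routines:
        while len(image) % align:
            image.append(pad)
        offsets[name] = len(image)
        image += code
    while len(image) % line:
        image.append(pad)
    return offsets, bytes(image)


offsets, image = pack(ROUTINES)
for name, off in offsets.items():
    print(f"+0x{0x400 + off:03x}: {name}")
print(f"total size: {len(image)} bytes")
```

With these inputs the three routines start at +0x400, +0x408 and +0x410 and the padded image is exactly one 32-byte line. This only holds for the int $0x80 flavor of the entry point; a longer entry stub would not fit in the first 8 bytes.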

Version-Release number of selected component (if applicable):
2.6.12-1.1398_FC4

How reproducible:
Always

Steps to Reproduce:
1. Disassemble linux-gate.so.1 as mapped into a process.
2.
3.
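One way to get at the mapped bytes for step 1, sketched in Python. This is an assumption-laden illustration and not the method used for this report (the listing below looks like debugger output): it assumes a Linux /proc filesystem and that the kernel labels the mapping "[vdso]" in /proc/PID/maps, which may not be true of the kernel version named here. The helper names find_mapping and dump_mapping are hypothetical.

```python
def find_mapping(maps_text, label="[vdso]"):
    """Return (start, end) of the first mapping named `label`, or None.

    `maps_text` is the content of /proc/PID/maps; each line has the form
    'start-end perms offset dev inode [name]'.
    """
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == label:
            start, end = (int(x, 16) for x in fields[0].split("-"))
            return start, end
    return None


def dump_mapping(pid="self", label="[vdso]"):
    """Read the raw bytes of the labeled mapping from /proc/PID/mem."""
    with open(f"/proc/{pid}/maps") as f:
        span = find_mapping(f.read(), label)
    if span is None:
        return None
    start, end = span
    with open(f"/proc/{pid}/mem", "rb") as mem:
        mem.seek(start)
        return start, mem.read(end - start)
```

The bytes returned by dump_mapping() form an ELF shared object, so they can be written to a file and handed to any disassembler that reads ELF.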
  

Actual Results:
0x705400 <__kernel_vsyscall>:   int    $0x80
0x705402 <__kernel_vsyscall+2>: ret
0x705403 <__kernel_vsyscall+3>: add    %al,(%eax)   ## possible costly hardware translation
0x705405 <__kernel_vsyscall+5>: add    %al,(%eax)
0x705407 <__kernel_vsyscall+7>: add    %al,(%eax)

0x705420 <__kernel_sigreturn>:  pop    %eax
0x705421 <__kernel_sigreturn+1>:        mov    $0x77,%eax
0x705426 <__kernel_sigreturn+6>:        int    $0x80
0x705428 <__kernel_sigreturn+8>:        nop   ## good safe padding
0x705429 <__kernel_sigreturn+9>:        nop


0x705440 <__kernel_rt_sigreturn>:       mov    $0xad,%eax
0x705445 <__kernel_rt_sigreturn+5>:     int    $0x80
0x705447 <__kernel_rt_sigreturn+7>:     nop
0x705448:       push   %es   ## non-.text
0x705449:       add    %al,(%eax)
0x70544b:       add    %al,(%eax,%eax,1)
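The cache-line placement implied by the listing above can be checked with a short Python sketch. The addresses and instruction byte counts are read straight from the listing; the 32-byte line size is the figure this report assumes, and the names CACHE_LINE, ROUTINES, DATA_START and line_base are chosen only for the example.

```python
CACHE_LINE = 32  # bytes; the line size assumed by the report

# (name, start address, bytes of real instructions) taken from the listing
ROUTINES = [
    ("__kernel_vsyscall", 0x705400, 3),
    ("__kernel_sigreturn", 0x705420, 8),
    ("__kernel_rt_sigreturn", 0x705440, 7),
]
DATA_START = 0x705448  # first non-.text byte after __kernel_rt_sigreturn


def line_base(addr, line=CACHE_LINE):
    """Address of the first byte of the cache line containing `addr`."""
    return addr - addr % line


# Every cache line that holds at least one instruction byte of a routine.
code_lines = sorted(
    {line_base(a) for _, start, size in ROUTINES for a in range(start, start + size)}
)
data_shares_code_line = line_base(DATA_START) in code_lines

print("cache lines holding code:", [hex(a) for a in code_lines])
print("non-.text shares a line with code:", data_shares_code_line)
```

Under these assumptions the three routines occupy three separate cache lines, and the non-.text bytes at 0x705448 sit in the same line as __kernel_rt_sigreturn.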


Expected Results:
+0x400: <__kernel_vsyscall>: int $0x80; ret; nop; nop; nop; nop; nop

+0x408: <__kernel_sigreturn>: pop %eax; mov $0x77,%eax; int $0x80

+0x410: <__kernel_rt_sigreturn>: mov $0xad,%eax; int $0x80; nop

+0x418: nop; nop; nop; nop; nop; nop; nop; nop

+0x420:

Additional info:

Comment 1 Roland McGrath 2005-07-28 08:29:11 UTC
This is really something to address upstream, not with Fedora.
I've posted a patch to make the padding use nops.
Please look at the sysenter version of the vDSO, which uses more than 8
instruction bytes.  The space consumed there must be the same in both flavors of
vDSO, so that __kernel_sigreturn lies at the same offset.  If you would like to
suggest compressing the padding so that sigreturn and rt_sigreturn hit the same
cache line, please do so upstream.
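The constraint described in this comment can be illustrated with a small Python sketch: the offset of __kernel_sigreturn has to clear the longest __kernel_vsyscall variant, rounded up to whatever alignment is chosen. The 20-byte size used for the sysenter stub is a placeholder for illustration only (the comment says just that it is more than 8 bytes), and the names round_up, sigreturn_offset and ENTRY_SIZES are hypothetical.

```python
def round_up(n, align):
    """Smallest multiple of `align` that is >= n."""
    return -(-n // align) * align


def sigreturn_offset(entry_sizes, align):
    """Offset of __kernel_sigreturn from the start of __kernel_vsyscall.

    It must be identical in every vDSO flavor, so it is set by the
    largest entry stub, not by the flavor being built.
    """
    return round_up(max(entry_sizes.values()), align)


# Size in bytes of __kernel_vsyscall in each flavor.
ENTRY_SIZES = {
    "int80": 3,      # int $0x80; ret (from the disassembly in this report)
    "sysenter": 20,  # placeholder, NOT a measured value
}

for align in (8, 32):
    off = sigreturn_offset(ENTRY_SIZES, align)
    print(f"align {align:2d}: __kernel_sigreturn at +0x{off:02x}")
```

With the placeholder size, 32-byte alignment gives the +0x20 spacing seen in the reported disassembly, while 8-byte alignment gives a smaller offset but still not the +0x08 that the int $0x80 flavor alone would allow.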

