Bug 52131 - gcc generates bad code in e1000 module with the -Os flag
gcc generates bad code in e1000 module with the -Os flag
Status: CLOSED NEXTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: gcc (Show other bugs)
9
i386 Linux
medium Severity high
: ---
: ---
Assigned To: Jakub Jelinek
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2001-08-20 17:24 EDT by Asit Mallick
Modified: 2007-04-18 12:36 EDT (History)
0 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2004-10-05 14:46:50 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
detailed information, not messed up by text box (7.18 KB, text/plain)
2001-08-20 17:27 EDT, Asit Mallick
no flags Details

  None (edit)
Description Asit Mallick 2001-08-20 17:24:39 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)

Description of problem:
When looking into bug #49571, I discovered that gcc 2.96 was generating 
bad code in one section of the driver when built with the -Os option for 
the BOOT kernel.

- Chris Leech <christoper.leech@intel.com>

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. build the e1000 module with gcc 2.96 and the -Os flag
	

Additional info:

This code segment from e1000_fxhw.c (along with a macro in e1000_fxhw.h 
that
is needed to understand it) shows the problem the e1000 driver was having 
when
built with the -Os option for the BOOT kernel (bug 49571)

Not ignoring the return value of E1000_WRITE_REG, which is the return 
value of
writel, by asigning it to a temporary variable fixes the problem.  
Declaring
HardwareVirtualAddress to be volatile (which is a much cleaner fix than
putting unused temporary variables at every register access throughout the
entire driver) also fixes things.

Adapter->HardwareVirutalAddress is a pointer to the kernel space virtual
address of the NICs PCI memory mapped register set.

----------------------------------------

#define E1000_WRITE_REG(reg, value) ((Adapter->MacType >= MAC_LIVENGOOD) ? 
\
                                     writel(value, &((PE1000_REGISTERS)
Adapter->HardwareVirtualAddress)->reg) : \
                                     writel(value, &((POLD_REGISTERS)
Adapter->HardwareVirtualAddress)->reg))

static void RaiseClock(struct adapter *Adapter, u32 *EecdRegValue)
{
	*EecdRegValue = *EecdRegValue | E1000_EESK;

	E1000_WRITE_REG(Eecd, *EecdRegValue);

	udelay(50);
}

----------------------------------------

This is what it looks like after pre-compiling (gcc -E)

----------------------------------------

static void RaiseClock(struct adapter *Adapter, u32 *EecdRegValue)
{
	*EecdRegValue = *EecdRegValue | (0x00000001);

	((Adapter->MacType >= MAC_LIVENGOOD) ? 
	(*(volatile unsigned int *) ((void *) (&((PE1000_REGISTERS)Adapter-
>HardwareVirtualAddress)->Eecd)) = (*EecdRegValue)) :
	(*(volatile unsigned int *) ((void *) (&((POLD_REGISTERS)  Adapter-
>HardwareVirtualAddress)->Eecd)) = (*EecdRegValue)));

	(__builtin_constant_p(50) ? ((50) > 20000 ? __bad_udelay() : 
__const_udelay_Reae3dfd6((50) * 0x10c6ul)) : __udelay_R9e7d6bd0(50));
}

----------------------------------------

Here is the generated asm (gcc -S), the oops occurs when trying to 
reference
off of ecx [ movl %edx, 16(%ecx) ] which is 0.  Note that ecx never is set,
and the pointer to the adapter struct is never read off the stack.

----------------------------------------

	.align 4
.stabs "RaiseClock:f(283,19)",36,0,1050,RaiseClock
.stabs "Adapter:p(237,3)",160,0,1049,8
.stabs "EecdRegValue:p(0,21)=*(283,14)",160,0,1049,12
	.type	 RaiseClock,@function
RaiseClock:
	.stabn 68,0,1050,.LM573-RaiseClock
.LM573:
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%eax
	pushl	%eax
	movl	12(%ebp), %eax
	.stabn 68,0,1052,.LM574-RaiseClock
.LM574:
	movl	(%eax), %edx
	orl	$1, %edx
	movl	%edx, (%eax)
	.stabn 68,0,1054,.LM575-RaiseClock
.LM575:
	movl	%edx, 16(%ecx)
	.stabn 68,0,1056,.LM576-RaiseClock
.LM576:
	movl	$214700, 8(%ebp)
	leave
	jmp	__const_udelay_Reae3dfd6
.Lfe19:
	.size	 RaiseClock,.Lfe19-RaiseClock
.stabs "EecdRegValue:r(0,21)",64,0,1049,0
.Lscope18:
.stabs "",36,0,0,.Lscope18-RaiseClock

----------------------------------------

Here it is again after changing the HardwareVirtualAddress field in the
adapter struct to be volatile.  This time the pointer to the adapter 
struct is
read off the stack, the pointer to PCI memory is read from it, and the
register write is done properly.

----------------------------------------

	.align 4
.stabs "RaiseClock:f(283,19)",36,0,1050,RaiseClock
.stabs "Adapter:p(237,3)",160,0,1049,8
.stabs "EecdRegValue:p(0,21)=*(283,14)",160,0,1049,12
	.type	 RaiseClock,@function
RaiseClock:
	.stabn 68,0,1050,.LM559-RaiseClock
.LM559:
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ecx
	pushl	%ecx
	movl	12(%ebp), %eax
	.stabn 68,0,1052,.LM560-RaiseClock
.LM560:
	movl	(%eax), %ecx
	orl	$1, %ecx
	.stabn 68,0,1050,.LM561-RaiseClock
.LM561:
	movl	8(%ebp), %edx
	.stabn 68,0,1052,.LM562-RaiseClock
.LM562:
	movl	%ecx, (%eax)
	.stabn 68,0,1054,.LM563-RaiseClock
.LM563:
	cmpb	$1, 64(%edx)
	movl	8(%edx), %eax
	movl	%ecx, 16(%eax)
	.stabn 68,0,1056,.LM564-RaiseClock
.LM564:
	movl	$214700, 8(%ebp)
	leave
	jmp	__const_udelay_Reae3dfd6
.Lfe19:
	.size	 RaiseClock,.Lfe19-RaiseClock
.stabs "Adapter:r(237,3)",64,0,1049,2
.stabs "EecdRegValue:r(0,21)",64,0,1049,0
.Lscope18:
.stabs "",36,0,0,.Lscope18-RaiseClock

----------------------------------------

Just to make it easier to see, the output of objdump showing the problem.

----------------------------------------

static void RaiseClock(struct adapter *Adapter, u32 *EecdRegValue)
 {
    1068:	55                   	push   %ebp
    1069:	89 e5                	mov    %esp,%ebp
    106b:	50                   	push   %eax
    106c:	50                   	push   %eax
    106d:	8b 45 0c             	mov    0xc(%ebp),%eax

    *EecdRegValue = *EecdRegValue | E1000_EESK;
    1070:	8b 10                	mov    (%eax),%edx
    1072:	83 ca 01             	or     $0x1,%edx
    1075:	89 10                	mov    %edx,(%eax)

    E1000_WRITE_REG(Eecd, *EecdRegValue);
    1077:	89 51 10             	mov    %edx,0x10(%ecx)

    DelayInMicroseconds(50);
    107a:	c7 45 08 ac 46 03 00 	movl   $0x346ac,0x8(%ebp)
    1081:	c9                   	leave  
    1082:	e9 fc ff ff ff       	jmp    1083 <RaiseClock+0x1b>
    1087:	90                   	nop    
}

----------------------------------------

and again, after makeing HardwareVirtualAddress volatile things look better

----------------------------------------

static void RaiseClock(struct adapter *Adapter, u32 *EecdRegValue)
 {
    113c:	55        	push   %ebp
    113d:	89 e5                	mov    %esp,%ebp
    113f:	51                   	push   %ecx
    1140:	51                   	push   %ecx
    1141:	8b 45 0c             	mov    0xc(%ebp),%eax

    *EecdRegValue = *EecdRegValue | E1000_EESK;
    1144:	8b 08                	mov    (%eax),%ecx
    1146:	83 c9 01             	or     $0x1,%ecx
    1149:	8b 55 08             	mov    0x8(%ebp),%edx
    114c:	89 08                	mov    %ecx,(%eax)

    E1000_WRITE_REG(Eecd, *EecdRegValue);
    114e:	80 7a 40 01          	cmpb   $0x1,0x40(%edx)
    1152:	8b 42 08             	mov    0x8(%edx),%eax
    1155:	89 48 10             	mov    %ecx,0x10(%eax)

    DelayInMicroseconds(50);
    1158:	c7 45 08 ac 46 03 00 	movl   $0x346ac,0x8(%ebp)
    115f:	c9                   	leave  
    1160:	e9 fc ff ff ff       	jmp    1161 <RaiseClock+0x25>
    1165:	8d 76 00             	lea    0x0(%esi),%esi
}

----------------------------------------

This shows the results of using egcs-2.91.66, without changing
HardwareVirtualAddress to volatile.  Different, but still good.

----------------------------------------

static void RaiseClock(struct adapter *Adapter, u32 *EecdRegValue)
{
     e44:	55                   	push   %ebp
     e45:	89 e5                	mov    %esp,%ebp
     e47:	8b 4d 08             	mov    0x8(%ebp),%ecx
     e4a:	8b 45 0c             	mov    0xc(%ebp),%eax

    *EecdRegValue = *EecdRegValue | E1000_EESK;
     e4d:	8b 10                	mov    (%eax),%edx
     e4f:	83 ca 01             	or     $0x1,%edx
     e52:	89 10                	mov    %edx,(%eax)

    E1000_WRITE_REG(Eecd, *EecdRegValue);
     e54:	8b 41 08             	mov    0x8(%ecx),%eax
     e57:	89 50 10             	mov    %edx,0x10(%eax)

    DelayInMicroseconds(50);
     e5a:	68 ac 46 03 00       	push   $0x346ac
     e5f:	e8 fc ff ff ff       	call   e60 <RaiseClock+0x1c>
}
     e64:	c9                   	leave  
     e65:	c3                   	ret
Comment 1 Asit Mallick 2001-08-20 17:27:54 EDT
Created attachment 28535 [details]
detailed information, not messed up by text box
Comment 2 Jakub Jelinek 2001-08-20 18:45:08 EDT
Can you please provide full preprocessed source, so that I can debug it?
Thanks.
Comment 3 Asit Mallick 2001-08-21 12:53:15 EDT
The full source is included the roswell kernel package.

- Chris Leech <christopher.leech@intel.com>
Comment 4 Glen Foster 2001-08-21 16:30:17 EDT
This defect is considered SHOULD-FIX for Fairfax.
Comment 5 Jakub Jelinek 2001-08-22 10:34:16 EDT
Seems to be fixed with http://gcc.gnu.org/ml/gcc-patches/2000-11/msg00455.html
which will be included in gcc-2.96-98.
Comment 6 Vladimir Makarov 2004-10-05 14:46:50 EDT
gcc-2.96 is too old.  Its release cycle was finished long ago.  Also
Jakub wrote about fixing the bug in gcc-2.96-98.  Therefore I am
closing the case.  If  it is still important we could reopen it.

Note You need to log in before you can comment on or make changes to this bug.