Bug 149250 (IT_66328)

Summary: Different outputs with -O2 and -O1 when -fno-automatic is used
Product: Red Hat Enterprise Linux 3
Component: gcc
Version: 3.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Bastien Nocera <bnocera>
Assignee: Jakub Jelinek <jakub>
CC: rth, tao
Fixed In Version: RHBA-2005-660
Doc Type: Bug Fix
Last Closed: 2005-09-28 14:08:19 UTC
Bug Blocks: 156320
Attachments (no flags set):
  cmc004.sh
  fdkcovcal6.f
  check.f
  gcc32-rh149250.patch

Description Bastien Nocera 2005-02-21 17:33:51 UTC
Description of problem:
Programs compiled with -fno-automatic produce different output at -O1 and -O2.

Version-Release number of selected component (if applicable):
gcc-g77-3.2.3-49

How reproducible:
Every time

Steps to Reproduce:
1. Get all 3 attached files in this bug
2. Run "./cmc004.sh"
3. Check diff output between -O1 and -O2 settings
  
Actual results:
2,6c2,6
<   0.399976244  1.10568603 -1.5855838 -1.29522322  0.850277376 -1.5855838
<  -1.29522322  1.67283722  0.587542755 -1.33928542 -1.29522322 -0.536572933
<   0.587542755 -0.287903309 -0.750987232  0.850277376  0.990607381 -1.33928542
<  -0.750987232  0.912945271  3.02031678  1.71127798 -2.15531896 -0.49984765
<   1.87155229
---
>   0.399976244  1.10568603 -1.5855838 -1.29522322  0.850277376  1.10568603
>   0.989358246 -1.29522322 -0.536572933  0.990607381 -1.5855838 -1.29522322
>   1.67283722  0.587542755 -1.33928542 -1.29522322 -0.536572933  0.587542755
>  -0.287903309 -0.750987232  0.850277376  0.990607381 -1.33928542 -0.750987232
>   0.912945271


Expected results:
On RHEL4 and RHL 7.1, the -O1 and -O2 outputs are identical.

Additional info:
Removing "-fno-automatic" from the CFLAGS set in FO_1 in the "cmc004.sh"
file makes the programs produce the same output under -O1 and -O2.

Comment 1 Bastien Nocera 2005-02-21 17:34:26 UTC
Created attachment 111265
cmc004.sh

Compilation script

Comment 2 Bastien Nocera 2005-02-21 17:34:45 UTC
Created attachment 111266
fdkcovcal6.f

Fortran source, subroutine

Comment 3 Bastien Nocera 2005-02-21 17:35:37 UTC
Created attachment 111267
check.f

Main routine

Comment 6 Jakub Jelinek 2005-02-25 18:09:16 UTC
-O2 -fno-gcse cures this.  Will debug.

Comment 7 Jakub Jelinek 2005-02-28 17:48:30 UTC
The bug is in strength reduction, which causes the loop computing:
      DO I=1,5
      DO J=1,5
      DS=0.D0
      DO K=1,5
      DS=DS+DCOVMAT(I,K)*DFT(K,J)
      ENDDO
      DF(I,J)=DS
      ENDDO
      ENDDO
to read DFT array elements as if J, in the second and subsequent iterations,
were one larger than it should be.  So the first iteration reads DFT(K,1), but
the second reads DFT(K,3) and the third DFT(K,4).

As a workaround, add -fno-strength-reduce when compiling with -fno-automatic.


Comment 16 Jakub Jelinek 2005-07-18 14:41:21 UTC
Simplified testcase:

C Works with:
C -O1 -m32 -mcpu=i486 -fno-automatic
C -O2 -m32 -mcpu=i486
C -O2 -m32 -mcpu=i486 -fno-automatic -fno-strength-reduce
C Fails with:
C -O2 -m32 -mcpu=i486 -fno-automatic
        SUBROUTINE FOO(D)
        REAL*8 A,B,C,D,E
        DIMENSION A(5,5),B(5,5),C(5,5),D(5,5)
        DO I=1,5
          DO J=1,5
            A(I,J)=J
            B(I,J)=1
          ENDDO
        ENDDO
        DO I=1,5
          DO J=1,5
            E=0.D0
            DO K=1,5
              E=E+B(I,K)*A(K,J)
            ENDDO
            C(I,J)=E
          ENDDO
        ENDDO
        DO I=1,5
          DO J=1,5
            D(I,J)=C(I,J)
          ENDDO
        ENDDO
        END

        REAL*8 D
        DIMENSION D(5,5)
        CALL FOO(D)
        DO I=1,5
          DO J=1,5
            IF (D(I,J).NE.5*J) CALL ABORT
          ENDDO
        ENDDO
        END
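
For reference, the value the main program checks for follows from the
initialisation: A(I,J)=J and B(I,J)=1, so each C(I,J) is the sum over K of
1*J, that is 5*J. A minimal Python sketch of the same arithmetic follows; the
function name expected_d is hypothetical, and indices are 0-based here rather
than 1-based as in the Fortran.

```python
def expected_d(n=5):
    """Compute D = B x A for the testcase's inputs, using 0-based indices."""
    a = [[float(j + 1) for j in range(n)] for _ in range(n)]  # A(I,J) = J
    b = [[1.0] * n for _ in range(n)]                         # B(I,J) = 1
    return [[sum(b[i][k] * a[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


d = expected_d()
# Same check as the Fortran main program: D(I,J) must equal 5*J.
assert all(d[i][j] == 5 * (j + 1) for i in range(5) for j in range(5))
```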


Comment 17 Jakub Jelinek 2005-07-19 10:32:31 UTC
Looking at the loop_givs_reduce changes in GCC 3.3, I found
http://gcc.gnu.org/ml/gcc-patches/2002-09/msg00045.html
which indeed fixes this testcase.
Shown as a pseudo-patch, the change that loop_givs_reduce makes in this case
when reducing GIV 136 (creating new pseudo 185) is:
 (note 134 484 298 NOTE_INSN_LOOP_BEG)
 (code_label 298 134 436 26 "" "" [1 uses])
 ...
 (insn 587 150 152 (set (reg:SI 136) (const_int 1 [0x1])) 45 {*movsi_1} (nil))
 ...
+(insn 876 853 881 (set (reg:SI 185) (const_int 2 [0x2])) -1 (nil) (nil))
 ...
 (note 153 492 279 NOTE_INSN_LOOP_BEG)
 (code_label 279 153 437 25 "" "" [1 uses])
 ...
 (insn 730 729 731 (parallel[(set (reg:SI 92) (ashift:SI (reg:SI 136) (const_int 2 [0x2]))) (clobber (reg:CC 17 flags))]))
 (insn 731 730 734 (parallel[(set (reg:SI 93) (plus:SI (reg:SI 92) (reg:SI 136))) (clobber (reg:CC 17 flags))]) -1 (nil)
     (expr_list:REG_EQUAL (mult:SI (reg:SI 136) (const_int 5 [0x5])) (nil)))
 ...
 (note 177 744 240 NOTE_INSN_LOOP_BEG)
 (code_label 240 177 438 24 "" "" [1 uses])
 ...
 (note 230 228 237 NOTE_INSN_LOOP_CONT)
 ...
 (note 506 239 180 NOTE_INSN_LOOP_VTOP)
 ...
 (note 245 182 738 NOTE_INSN_LOOP_END)
 ...
 (insn 256 443 258 (parallel[(set (reg:SI 106) (ashift:SI (reg:SI 136) (const_int 2 [0x2]))) (clobber (reg:CC 17 flags))]))
 (insn 258 256 262 (parallel[(set (reg:SI 107) (plus:SI (reg:SI 106) (reg:SI 136))) (clobber (reg:CC 17 flags))])
     207 {*addsi_1} (nil) (expr_list:REG_EQUAL (mult:SI (reg:SI 136) (const_int 5 [0x5])) (nil)))
 ...
 (note 269 267 276 NOTE_INSN_LOOP_CONT)
 (insn 276 269 584 (parallel[(set (reg:SI 113) (plus:SI (reg:SI 136) (const_int 1 [0x1]))) (clobber (reg:CC 17 flags))]))
+(insn 873 276 879 (parallel[(set (reg:SI 185) (plus:SI (reg:SI 185) (const_int 1 [0x1]))) (clobber (reg:CC 17 flags))]))
-(insn 584 276 278 (set (reg:SI 136) (reg:SI 113)) 45 {*movsi_1} (nil) (nil))
+(insn 584 914 278 (set (reg:SI 136) (reg:SI 185)) -1 (nil) (nil))
 ...
 (note 498 278 156 NOTE_INSN_LOOP_VTOP)
 ...
 (note 284 158 286 NOTE_INSN_LOOP_END)
 ...
 (note 288 286 449 NOTE_INSN_LOOP_CONT)
 ...
 (note 490 297 137 NOTE_INSN_LOOP_VTOP)
 ...
 (note 303 139 305 NOTE_INSN_LOOP_END)

Here, tv->insn is insn 584.  Without Jan's patch, the new increment insn (873)
is inserted before that instruction.  Because the initial value of pseudo 185
is 2, not 1 (pseudo 136's initial value (1) * mult_val (1) + add_val (1)),
pseudo 136 has value 1 in the first iteration, but 3 in the second, 4 in the
third, and so on.
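
The ordering problem can be traced with a small model. The Python sketch below
is a simplification, not the RTL itself: the function name giv_values and its
flag are hypothetical, and it assumes that the corrected ordering amounts to
performing the increment of pseudo 185 after the copy in insn 584. The comment
above states only that, without the patch, the increment lands before it.

```python
def giv_values(iterations, increment_before_copy):
    """Trace the value pseudo 136 holds at the top of each loop iteration.

    Simplified model of the reduced loop: r185 starts at 2, r136 at 1, and the
    loop tail both copies r185 into r136 (insn 584) and increments r185 (insn 873).
    """
    r136, r185 = 1, 2
    seen = []
    for _ in range(iterations):
        seen.append(r136)
        if increment_before_copy:
            r185 += 1      # insn 873 placed before insn 584 (the faulty order)
            r136 = r185    # insn 584
        else:
            r136 = r185    # insn 584
            r185 += 1      # increment after the copy (assumed fixed order)
    return seen


print(giv_values(4, increment_before_copy=True))   # faulty order
print(giv_values(4, increment_before_copy=False))  # assumed fixed order
```

With the increment first, the model skips the value 2, matching the 1, 3, 4
sequence described above. With the copy first, it counts 1, 2, 3, 4.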

Comment 18 Jakub Jelinek 2005-07-19 10:34:08 UTC
Created attachment 116917
gcc32-rh149250.patch

Comment 24 Red Hat Bugzilla 2005-09-28 14:08:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-660.html