pdftexdir/wprob.test and pdftexdir/pdfimage.test fail when texlive is compiled with gcc-4.9.0-12.fc21.s390x with -O2, while -O1 and -O0 work correctly. I have tracked the problem down to function macrocall() from pdftex0.c. Prefixing the function with __attribute__((optimize (0))) fixes the problem. After calling macrocall() the program continues running but one of the subsequent calls to getnext() causes a segfault. Attaching the preprocessed source and backtrace. Let me know if I can provide any more information.
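For readers unfamiliar with the workaround mentioned above, here is a minimal sketch of how a single function can be excluded from optimization in GCC. The names NO_OPT and count_args are hypothetical stand-ins and do not appear in pdftex0.c; only the attribute itself comes from the report.

```c
#include <assert.h>

/* The optimize attribute is GCC-specific, so other compilers get an
   empty macro. NO_OPT is a hypothetical name used only in this sketch. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPT __attribute__((optimize(0)))
#else
#define NO_OPT
#endif

/* Hypothetical stand-in for macrocall(): counts non-zero entries in a
   small unsigned counter, loosely like the n counter in the real code. */
NO_OPT int count_args(const int *args, int len)
{
    unsigned char n = 0;
    for (int i = 0; i < len; i++) {
        if (args[i] != 0)
            n++;    /* analogous to incr ( n ) in the generated C */
    }
    return n;
}
```

Only the annotated function is compiled without optimization; the rest of the translation unit keeps the command-line level, which is what makes this useful for narrowing a miscompile down to one function.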
Created attachment 916753: pdftex0.i
Created attachment 916754: backtrace
Just a note: pdftex0.c is generated from a Pascal source file that is in turn generated from the native *.web TeX source file, so please be benevolent.
So, like with the previous bug report, we need a self-contained short testcase first. Does the problem still exist if you replace "optimize (0)" with "noinline, noclone" (i.e. if macrocall is optimized normally, but just doesn't get inlined and is not cloned)? What about if you compile with -O2 -fno-inline? In other words, is the problem really in macrocall itself, or perhaps in some code inlined into it? How many calls to macrocall are there before it crashes?
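As an illustration of the experiment suggested above, the following sketch marks a function so that GCC still optimizes it but is asked not to inline or clone it. KEEP_SEPARATE, suspect and caller are hypothetical names chosen for this example.

```c
#include <assert.h>

/* noinline and noclone are GCC attributes; the guard keeps the sketch
   portable. KEEP_SEPARATE is a hypothetical macro name. */
#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_SEPARATE __attribute__((noinline, noclone))
#else
#define KEEP_SEPARATE
#endif

/* Hypothetical callee standing in for macrocall(). It is optimized at
   whatever level the file is built with, but stays a separate function. */
KEEP_SEPARATE int suspect(int x)
{
    return x * 2 + 1;
}

/* Hypothetical caller. With the attribute in place, the body of
   suspect() is not merged into this function. */
int caller(int x)
{
    return suspect(x) + suspect(x + 1);
}
```

If the failure persists with this annotation, the miscompiled code is likely in the function body itself rather than in an inlined or cloned copy.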
-O2 -fno-inline still reproduces. Compiling the whole source file with -O0 and just the macrocall function with __attribute__((noinline, noclone, optimize (2))) reproduces it too. With -mno-lra it works.

Command line options (with __attribute__((noinline, noclone, optimize (2))) added to macrocall):
-fexceptions -fstack-protector-strong -m64 -march=z9-109 -mtune=z10 -fno-strict-aliasing -fno-inline -O0

There are some -Wmaybe-uninitialized warnings, but initializing the variables doesn't seem to fix the problem.

From debugging, it seems the problem in the macrocall function is that the n variable is zero when it should be 1 near the end of the function. In the assembly, it seems n has been hoisted to %r15+196 (32-bit word) or, because of endianness, just the low 8 bits of that at %r15+199:
    mvi 199(%r15),0
corresponds to n = 0; then for:
    else pstack [n ]= mem [memtop - 3 ].hh .v.RH ;
    incr ( n ) ;
I'm seeing:
.L2853:
    .loc 1 10331 0
    lgfr %r1,%r1
    llgc %r2,199(%r15)
    larl %r8,pstack
    sllg %r1,%r1,3
    sllg %r2,%r2,2
    ly %r1,-24(%r1,%r12)
    st %r1,0(%r2,%r8)
.L2854:
    .loc 1 10332 0
    l %r1,196(%r15)
    ahi %r1,1
    st %r1,180(%r15)
    stc %r1,184(%r15)
    .loc 1 10333 0
where the first basic block looks correct: it reads the byte at 199+%r15. The second, which is supposed to do ++n;, looks wrong: while it reads the right value, it stores it elsewhere.
    printint ( n ) ;
a few lines below again assumes %r15+199 (i.e. given the increment, the old n value rather than the new n value):
    .loc 1 10337 0
    llgc %r2,199(%r15)
    brasl %r14,zprintint
then:
    showtokenlist ( pstack [n - 1 ], -268435455L , 1000 ) ;
looks like:
    .loc 1 10339 0
    llc %r1,183(%r15)
    lgfi %r3,-268435455
    lghi %r4,1000
    ahi %r1,-1
    lgfr %r1,%r1
    sllg %r1,%r1,2
    lgf %r2,0(%r1,%r8)
    brasl %r14,zshowtokenlist
and there it uses the low byte of the %r15+180 value (i.e. the new incremented n).
    pstack [n ]= mem [memtop - 3 ].hh .v.RH ;
became:
    .loc 1 10165 0
    ic %r3,199(%r15)
    larl %r1,memtop
    larl %r8,pstack
.LBB23:
    lghi %r9,0
.LBE23:
    lgf %r2,0(%r1)
    llgcr %r1,%r3
.LBB24:
    lgr %r10,%r1
    aghi %r10,1
.LBE24:
    sllg %r2,%r2,3
    sllg %r1,%r1,2
    ly %r2,-24(%r2,%r12)
    st %r2,0(%r1,%r8)

    pstack [n ]= mem [memtop - 3 ].hh .v.RH ;
    .loc 1 10226 0
    ic %r3,199(%r15)
    larl %r1,memtop
    larl %r8,pstack
    ...
and finally:
    if ( n > 0 )
    .loc 1 10350 0
    llc %r0,199(%r15)
    .loc 1 10348 0
    l %r1,0(%r1)
    st %r1,16(%r12)
    .loc 1 10350 0
    ltr %r0,%r0
    .loc 1 10349 0
    l %r1,0(%r6)
    st %r1,8(%r12)
    .loc 1 10350 0
    je .L2822
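To make the inconsistency in the quoted assembly easier to follow, here is a small C model of the stack slots involved. It is only a sketch: the frame is modelled as a byte array that is assumed to start zeroed, the offsets 196/199 and 180/183/184 are taken from the assembly above, and all function and constant names are hypothetical. Big-endian stores are written out by hand so the model behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets relative to %r15, taken from the quoted assembly. */
enum {
    FRAME_SIZE = 256,
    N_WORD   = 196,  /* 32-bit slot holding n */
    N_BYTE   = 199,  /* low byte of that slot on big-endian s390x */
    NEW_WORD = 180,  /* where the incremented word was stored instead */
    NEW_LOW  = 183,  /* low byte of the word at 180 */
    NEW_BYTE = 184   /* extra byte store (stc) */
};

/* Store a 32-bit value in big-endian byte order. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a 32-bit value stored in big-endian byte order. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* What ++n should do: update the byte that later reads use. */
void incr_n_expected(uint8_t *frame)
{
    frame[N_BYTE] = (uint8_t)(frame[N_BYTE] + 1);
}

/* What the quoted code does: l 196; ahi 1; st 180; stc 184. */
void incr_n_as_compiled(uint8_t *frame)
{
    uint32_t r1 = load_be32(frame + N_WORD) + 1;
    store_be32(frame + NEW_WORD, r1);
    frame[NEW_BYTE] = (uint8_t)r1;
}
```

In this model, after incr_n_as_compiled the byte at offset 199 still holds the old value, so reads such as the one for printint ( n ) and the final n > 0 test see 0, while the read at offset 183 for showtokenlist sees the incremented value.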
In the *.ira dump we have:
(insn 673 675 674 85 (parallel [
            (set (reg:SI 583)
                (plus:SI (subreg:SI (reg/v:QI 281 [ n ]) 0)
                    (const_int 1 [0x1])))
            (clobber (reg:CC 33 %cc))
        ]) pdftex0.c:10332 327 {*addsi3}
     (expr_list:REG_DEAD (reg/v:QI 281 [ n ])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))
(insn 674 673 676 85 (set (reg/v:QI 281 [ n ])
        (subreg:QI (reg:SI 583) 3)) pdftex0.c:10332 74 {*movqi}
     (nil))
and *.reload turns this into:
      Choosing alt 2 in insn 673: (0) d (1) 0 (2) K {*addsi3}
      Creating newreg=851 from oldreg=583, assigning class GENERAL_REGS to r851
  673: {r851:SI=r851:SI+0x1;clobber %cc:CC;}
      REG_DEAD r281:QI
      REG_UNUSED %cc:CC
    Inserting insn reload before:
 1268: r851:SI=r281:QI#0
    Inserting insn reload after:
 1269: r583:SI=r851:SI

(insn 1268 675 673 85 (set (reg:SI 1 %r1 [583])
        (mem/c:SI (plus:DI (reg/f:DI 15 %r15)
                (const_int 196 [0xc4])) [0 %sfp+-36 S4 A8])) pdftex0.c:10332 67 {*movsi_zarch}
     (nil))
(insn 673 1268 1301 85 (parallel [
            (set (reg:SI 1 %r1 [583])
                (plus:SI (reg:SI 1 %r1 [583])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 33 %cc))
        ]) pdftex0.c:10332 327 {*addsi3}
     (nil))
(note 1301 673 1300 85 NOTE_INSN_DELETED)
(insn 1300 1301 674 85 (set (mem/c:SI (plus:DI (reg/f:DI 15 %r15)
                (const_int 180 [0xb4])) [0 %sfp+-52 S4 A32])
        (reg:SI 1 %r1 [583])) pdftex0.c:10332 67 {*movsi_zarch}
     (nil))
(insn 674 1300 1299 85 (set (mem/c:QI (plus:DI (reg/f:DI 15 %r15)
                (const_int 184 [0xb8])) [0 %sfp+-48 S1 A64])
        (reg:QI 1 %r1 [orig:583+3 ] [583])) pdftex0.c:10332 74 {*movqi}
     (nil))

Vlad, can you please have a look? Thanks.
The problem is in LRA inheritance code. That is pretty complicated even for me. So, Jakub, I don't think you can fix this easily. I am going to work on this but it might take a few days.
Ok, thanks. I'm going to release 4.9.1-rc1 (and thus 4.9.1 too) without it then.
I've committed a patch fixing the bug into the trunk. Jakub, when should I do the same for gcc-4.9-branch?
I think it will not hurt to have it for a few days on the trunk, so I wouldn't rush it into 4.9.1 after rc1 went out. So, if all goes well, can you commit it after 4.9.1 release (tentatively Thursday), e.g. during Cauldron?
Ok, I can do it during Cauldron.
Jakub, gcc-4.9.1-2.fc21 is still without the fix for this issue, correct?
Yes, forgot about it, could have included it as a patch. Vlad will hopefully check it in soon.
BTW, 4.9.1-2.fc21 failed to build on s390* due to some texinfo issue. Is that a chicken-and-egg problem (fixing texinfo requires a fixed texlive)?
There are still a couple of Perl modules that haven't been rebuilt yet, meaning they are affected by the glibc ABI change (s390 only, pre 2.19). We haven't got through the F-21 mass rebuild yet, which will fix it. There is a special build target (f21-glibc) that has those broken packages replaced. I'll take care of the 4.9.1-2.fc21 build.
Tested with http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=1442611 (4.9.1-2.fc21 + the fix for this bug) and the problem is fixed. Is there any chance to include the fix in 4.9?
Created attachment 920715: proposed change for the gcc package
I could commit and build an updated gcc if I get the green light.
Marek, can you pull the change referenced in c#17 into f21 & rawhide to unblock the s390 guys? Thanks, Jeff
I was about to, but I got delayed setting up the FAS stuff needed to actually clone the repo, etc. Meanwhile, Dan offered to do the commit, so Dan, please go ahead.
This should be fixed in f21 by now, right?
(In reply to Jakub Jelinek from comment #20)
> This should be fixed in f21 by now, right?

Yes, it is.