Created attachment 534063 [details]
Build Log from Mock

Description of problem:
tex segfaults when building dvipdfm on a 64-bit build with:

+ tex dvipdfm
This is TeX, Version 3.141592 (Web2C 7.5.6)
/var/tmp/rpm-tmp.aDHHn2: line 59: 16512 Segmentation fault tex dvipdfm

Version-Release number of selected component (if applicable):
2007-65.fc16

How reproducible:
Try to build dvipdfm in a 64-bit chroot. The 32-bit build works fine, but the 64-bit tex segfaults.

Steps to Reproduce:
1. Use Mock to try to build a 64-bit dvipdfm package

Actual results:
+ tex dvipdfm
This is TeX, Version 3.141592 (Web2C 7.5.6)
/var/tmp/rpm-tmp.aDHHn2: line 59: 16512 Segmentation fault tex dvipdfm

Expected results:
The package builds successfully and tex doesn't segfault.

Additional info:
Created attachment 534064 [details] Root Log from Mock
I confirm the existence of this issue. On a fresh, clean installation of Fedora 16 x86-64, I tried:
+ tex <file.tex>
and
+ the system check in Kile
Both fail with a segmentation fault. I am available to send any further information needed to solve this.
The issue is still present even after updating from texlive-2007-38 to texlive-2007-40 (ran 'yum update' today).
/usr/bin/tex crashes on all input files I have tried (texlive-2007-65.fc16.x86_64). Backtrace is always the same:

#0  __memcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:168
#1  0x00007ffff7604b8d in _IO_file_xsgetn (fp=0x9a8040, data=<optimized out>, n=176) at fileops.c:1427
#2  0x00007ffff75f8fe3 in _IO_fread (buf=<optimized out>, size=8, count=22, fp=0x9a8040) at iofread.c:44
#3  0x00000000004385cd in fread (__stream=<optimized out>, __n=22, __size=8, __ptr=0x7ffef6874018) at /usr/include/bits/stdio2.h:287
#4  do_undump (p=0x7ffef6874018 <Address 0x7ffef6874018 out of bounds>, item_size=8, nitems=22, in_file=<optimized out>) at texextra.c:1831
#5  0x00000000004086b9 in loadfmtfile () at texini.c:3073
#6  0x000000000040c79d in mainbody () at texini.c:4317
#7  0x0000000000401c4e in main (ac=<optimized out>, av=<optimized out>) at texextra.c:349

This is blocking builds of packages that use TeX, such as emacs-auctex.
Valgrind output:

==13555== Memcheck, a memory error detector
==13555== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==13555== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==13555== Command: tex emacs-reference-booklet-3.tex
==13555==
This is TeX, Version 3.141592 (Web2C 7.5.6)
==13555== Warning: set address range perms: large range [0x3941a040, 0xb9f8bb50) (undefined)
==13555== Invalid write of size 1
==13555==    at 0x5359E13: __GI_memcpy (memcpy.S:168)
==13555==    by 0x5343B8C: _IO_file_xsgetn (fileops.c:1427)
==13555==    by 0x5337FE2: fread (iofread.c:44)
==13555==    by 0x4385CC: do_undump (stdio2.h:287)
==13555==    by 0x4086B8: loadfmtfile (texini.c:3073)
==13555==    by 0x40C79C: mainbody (texini.c:4317)
==13555==    by 0x401C4D: main (texextra.c:349)
==13555==  Address 0xffffffffb941a048 is not stack'd, malloc'd or (recently) free'd
==13555==
==13555==
==13555== Process terminating with default action of signal 11 (SIGSEGV)
==13555==  Access not within mapped region at address 0xFFFFFFFFB941A048
==13555==    at 0x5359E13: __GI_memcpy (memcpy.S:168)
==13555==    by 0x5343B8C: _IO_file_xsgetn (fileops.c:1427)
==13555==    by 0x5337FE2: fread (iofread.c:44)
==13555==    by 0x4385CC: do_undump (stdio2.h:287)
==13555==    by 0x4086B8: loadfmtfile (texini.c:3073)
==13555==    by 0x40C79C: mainbody (texini.c:4317)
==13555==    by 0x401C4D: main (texextra.c:349)
==13555==  If you believe this happened as a result of a stack
==13555==  overflow in your program's main thread (unlikely but
==13555==  possible), you can try to increase the size of the
==13555==  main thread stack using the --main-stacksize= flag.
==13555==  The main thread stack size used in this run was 8388608.
==13555==
==13555== HEAP SUMMARY:
==13555==     in use at exit: 2,168,109,218 bytes in 76,499 blocks
==13555==   total heap usage: 121,288 allocs, 44,789 frees, 2,171,444,993 bytes allocated
==13555==
==13555== LEAK SUMMARY:
==13555==    definitely lost: 2,097 bytes in 120 blocks
==13555==    indirectly lost: 768 bytes in 58 blocks
==13555==      possibly lost: 0 bytes in 0 blocks
==13555==    still reachable: 2,168,106,353 bytes in 76,321 blocks
==13555==         suppressed: 0 bytes in 0 blocks
==13555== Rerun with --leak-check=full to see details of leaked memory
==13555==
==13555== For counts of detected and suppressed errors, rerun with: -v
==13555== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
strace:

access("/var/lib/texmf/web2c/tex/tex.fmt", R_OK) = 0
stat("/var/lib/texmf/web2c/tex/tex.fmt", {st_mode=S_IFREG|0644, st_size=247113, ...}) = 0
open("/var/lib/texmf/web2c/tex/tex.fmt", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=247113, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d34ec7000
read(3, "W2TX\0\0\0\4tex\0\7\251^\327\0\1\2\3\4\5\6\7\10\t\n\v\f\r\16\17"..., 4096) = 4096
mmap(NULL, 1724416, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d34cd5000
mmap(NULL, 1728512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4d342b4000
mmap(NULL, 2159484928, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4cb3742000
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4cb3641000
read(3, "\0\0)\343\0\0)\353\0\0*\17\0\0*9\0\0*k\0\0*\223\0\0*\303\0\0*\361"..., 4096) = 4096
read(3, "\0\0k\307\0\0k\316\0\0k\327\0\0k\342\0\0k\351\0\0k\362\0\0k\375\0\0l\6"..., 4096) = 4096
mmap(NULL, 2002944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4cb3458000
read(3, "naltyrelpenaltypredisplaypenalty"..., 24576) = 24576
read(3, "shcong@vereqnotinc@ncelrightleft"..., 4096) = 4096
--- {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7f4c33742018} (Segmentation fault) ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
Debugging:

[karel@redhat 09]$ gdb -args tex emacs-reference-booklet-3.tex
GNU gdb (GDB) Fedora (7.3.50.20110722-10.fc16)
(gdb) break texini.c:3073
Breakpoint 1 at 0x40869d: file texini.c, line 3073.
(gdb) run
Starting program: /usr/bin/tex emacs-reference-booklet-3.tex
This is TeX, Version 3.141592 (Web2C 7.5.6)

Breakpoint 1, loadfmtfile () at texini.c:3073
3073      undumpthings ( mem [p ], q + 2 - p ) ;
(gdb) p mem
$1 = (memoryword *) 0x7ffef6874018
(gdb) p q
$2 = 20
(gdb) p q + 2 - p
$3 = 22
(gdb) p mem[p]
Cannot access memory at address 0x7ffef6874018
(gdb) p mem
$4 = (memoryword *) 0x7ffef6874018
(gdb) p *mem
Cannot access memory at address 0x7ffef6874018
(gdb) p memmax
$5 = 1499999
(gdb) p memmin
$6 = 268435455
(gdb) p zmem
$7 = (memoryword *) 0x7ffef6874018
(gdb) p extramembot
$8 = -268435455
(gdb) p extramemtop
$9 = 0
(gdb) p yzmem
$10 = (memoryword *) 0x7fff76874010
(gdb) p mem
$11 = (memoryword *) 0x7ffef6874018
(gdb) p membot
$12 = 0
(gdb) p memtop
$13 = 1499999

extramembot obviously should not be negative.

Second run:

[karel@redhat 09]$ gdb -args tex emacs-reference-booklet-3.tex
GNU gdb (GDB) Fedora (7.3.50.20110722-10.fc16)
(gdb) run
Starting program: /usr/bin/tex emacs-reference-booklet-3.tex

Breakpoint 1, main (ac=2, av=0x7fffffffe328) at texextra.c:340
340     {
(gdb) p extramembot
$1 = 0
(gdb) watch extramembot
Hardware watchpoint 2: extramembot
(gdb) cont
Continuing.
Hardware watchpoint 2: extramembot

Old value = 0
New value = -268435455
0x0000000000402066 in initialize () at texini.c:123
123         mubytecswrite [i ]= -268435455L ;

texini.c:
122       {register integer for_end; i = 0 ;for_end = 128 ; if ( i <= for_end) do
123         mubytecswrite [i ]= -268435455L ;
124       while ( i++ < for_end ) ;}
125       mubytekeep = 0 ;

It writes 129 integers to mubytecswrite. However, mubytecswrite is only 128 integers long!
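The iteration count of the loop on texini.c:122-124 can be checked in isolation. What follows is only a minimal sketch, not TeX source: count_writes is a hypothetical helper that reproduces the shape of the Web2C-generated loop and counts how often its body runs, without touching any array.

```c
/* Hypothetical helper, not part of TeX: counts how many times the body of
   the Web2C-style loop from texini.c:122-124 runs for a given for_end. */
int count_writes(int for_end)
{
    int writes = 0;
    int i = 0;

    if (i <= for_end)
        do
            writes++;               /* stands in for mubytecswrite[i] = ... */
        while (i++ < for_end);

    return writes;
}
```

With for_end = 128, as in the generated code, count_writes returns 129, one more than the 128 elements declared for mubytecswrite. A bound of 127 would give exactly 128 writes.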
texd.h (halfword is a 32-bit int):

EXTERN halfword mubytecswrite[128] ;
EXTERN integer mubyteskip ;

The program would work if the memory layout were left unchanged by the compiler, i.e. if mubyteskip were located right after mubytecswrite, because it is zeroed on texini.c:125.
So the mubytecswrite initialization overwrites the value 0 in extramembot by storing -268435455 there. The variables are indeed neighbouring in memory:

[karel@redhat ~]$ ls -l /usr/lib/debug/.build-id/e8/bcf8e080af0d967912f1dc219d2aee0e39e171
lrwxrwxrwx 1 root root 19 Nov 29 19:21 /usr/lib/debug/.build-id/e8/bcf8e080af0d967912f1dc219d2aee0e39e171 -> ../../../../bin/tex
[karel@redhat ~]$ eu-readelf --symbols /usr/lib/debug/.build-id/e8/bcf8e080af0d967912f1dc219d2aee0e39e171.debug
  237: 000000000064d518      4 OBJECT  GLOBAL DEFAULT       25 pagetail
  238: 000000000064d520    512 OBJECT  GLOBAL DEFAULT       25 mubytecswrite
  239: 000000000064d720      4 OBJECT  GLOBAL DEFAULT       25 extramembot
  240: 000000000064d724      4 OBJECT  GLOBAL DEFAULT       25 inputptr
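The addresses from the symbol table and the gdb session can be cross-checked with plain arithmetic. This is a minimal sketch under two assumptions taken from the reports above: a 4-byte halfword (per texd.h) and an 8-byte memoryword (item_size=8 in the backtrace). element_address and shifted_base are hypothetical helper names, not TeX functions.

```c
#include <stdint.h>

/* Hypothetical helper: address of element `index` in an array that starts
   at `base` and holds elements of `elem_size` bytes. */
uint64_t element_address(uint64_t base, uint64_t index, uint64_t elem_size)
{
    return base + index * elem_size;
}

/* Hypothetical helper: a base pointer moved down by `words` elements of
   `word_size` bytes each. */
uint64_t shifted_base(uint64_t base, uint64_t words, uint64_t word_size)
{
    return base - words * word_size;
}
```

element_address(0x64d520, 128, 4) gives 0x64d720, which is the address of extramembot, so the 129th write lands exactly on that variable. shifted_base(0x7fff76874010, 268435455, 8) gives 0x7ffef6874018, the unmapped mem pointer that gdb printed. The values are therefore consistent with mem being yzmem moved down by memmin words; that relation is inferred from the numbers here and has not been checked against the source.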
I think it might be caused by the newer builds being compiled with LDFLAGS='-Wl,-z,relro '. The relro linker option reorders the sections in the RW segment, which breaks TeX's memory layout assumptions. Checking...
Well, it is actually caused by an off-by-one error in the .ch files. It is now fixed in rawhide. F16 will follow soon.
Ok, thanks.
Just ran into this today. Karel, thanks for all the work you did debugging. Jindrich, looking forward to those packages :-)
texlive-2007-66.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/texlive-2007-66.fc16
Package texlive-2007-66.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing texlive-2007-66.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-16995/texlive-2007-66.fc16
then log in and leave karma (feedback).
texlive-2007-66.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.