Since it was rebuilt with GCC 8 - we think - pdftex is segfaulting on i686. This is breaking at least two package builds, OpenColorIO:

https://koji.fedoraproject.org/koji/taskinfo?taskID=25176488

which is ultimately in the dependency path of a key component of KDE, Calligra, and R-htmltools:

https://koji.fedoraproject.org/koji/taskinfo?taskID=25173683

It's very likely breaking / going to break other package builds too. Both segfault like this during doc generation:

mktexfmt [INFO]: --- remaking pdflatex with pdftex
mktexfmt: running `pdftex -ini -jobname=pdflatex -progname=pdflatex -translate-file=cp227.tcx *pdflatex.ini' ...
sh: line 1: 29445 Segmentation fault (core dumped) pdftex -ini -jobname=pdflatex -progname=pdflatex -translate-file=cp227.tcx *pdflatex.ini 1>&2 < /dev/null
mktexfmt [ERROR]: running `pdftex -ini -jobname=pdflatex -progname=pdflatex -translate-file=cp227.tcx *pdflatex.ini >&2 </dev/null' return status 139
mktexfmt [ERROR]: `pdftex -ini -jobname=pdflatex -progname=pdflatex -translate-file=cp227.tcx *pdflatex.ini >&2 </dev/null' failed (no pdflatex.fmt)

I have shelled into the OpenColorIO build process in a mock and got a backtrace out of gdb. However, that's hardly the end of the story, because this is goddamn texlive so of course it isn't:

1834 pdftex-pool.c: No such file or directory.
(gdb) thread apply all bt full

Thread 1 (Thread 0xf731d740 (LWP 110)):
#0 0x565cb88d in loadpoolstrings (spare_size=6160000) at pdftex-pool.c:1834
        l = <optimized out>
        s = 0x56824001 <error: Cannot access memory at address 0x56824001>
        g = 0
        i = 1727
        j = <optimized out>
#1 0x565651e0 in getstringsstarted () at pdftexini.c:649
        Result = <optimized out>
        k = <optimized out>
        l = <optimized out>
        g = <optimized out>
#2 0x56572d97 in mainbody () at pdftexini.c:5301
        eqtb = 0xf4255010
#3 0x5655d9f3 in main (ac=6, av=0xffffd674) at ../../../texk/web2c/lib/texmfmp.c:1013
No locals.

Yup: it's crashing in a file that, so far as Fedora debuginfo is concerned, doesn't exist.
This is because pdftex-pool.c is *generated on the fly during the build process*, because there is not enough whisky in the goddamn world. This is where loadpoolstrings *ultimately* actually gets defined, as best as I can tell: https://tug.org/svn/pdftex/branches/stable/source/src/texk/web2c/web2c/makecpool.c?view=markup around line 81. Naturally it uses variables with single-character names. See above note in re whisky. This is about as far as I've got with this mess so far. Updates as and when the whisky resupply truck arrives.
Kevin, Rex: this is why I still can't rebuild OpenColorIO, hence why we can't rebuild Calligra, hence why Calligra has dependency problems and isn't in the lives ATM.
It broke gdal as well (https://koji.fedoraproject.org/koji/buildinfo?buildID=1044737) because the pdftex segv caused the noarch packages to not match between architectures.
It looks like armv7hl is broken as well so it's likely something specific to 32 bit platforms.
Yeah, I think basically this hoopy makecpool nuttiness is overflowing something. It's stuffing the contents of some 'pool' files (which I think are *also* dynamically generated?) into this 'poolfilearr' array, then iterating over it. 's' in loadpoolstrings is meant to be the next string read out of poolfilearr, but at some point it ends up pointing at memory that can't be read (that's the 'Cannot access memory' error gdb prints for s), and everything blows up. That's as far as my limited C skills take me, especially at this time of night. It makes fuzzy sense to me that this would happen on 32-bit but not 64-bit (poolfilearr is probably bigger on 64-bit, or something) but not *specific* sense yet.
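For anyone who wants the pattern in isolation, here is a minimal sketch of a NULL-terminated string table being copied into a byte pool, which is roughly the shape of what loadpoolstrings does with poolfilearr. This is an illustration only, not the code makecpool generates: the names pool_table and load_pool are made up, and the table contents are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the generated table: an array of string
   literals terminated by a NULL pointer. Contents are placeholders. */
static const char *const pool_table[] = { "mu", "", "fil", NULL };

/* Copy every string from table into pool, back to back, without the
   terminating NUL bytes. Returns the number of strings copied, or -1
   if the pool is too small to hold them all. */
int load_pool(const char *const *table, unsigned char *pool, size_t pool_size)
{
    size_t used = 0;
    int count = 0;
    const char *s;

    /* The loop relies entirely on the NULL sentinel to stop. */
    while ((s = table[count]) != NULL) {
        size_t len = strlen(s);

        assert(used <= pool_size);
        if (len > pool_size - used)
            return -1;          /* would overflow the pool */
        memcpy(pool + used, s, len);
        used += len;
        count++;
    }
    return count;
}
```

Calling load_pool(pool_table, buf, sizeof buf) with an 8-byte buf copies the three strings and returns 3. The only thing that ends the walk is the NULL sentinel, so if the pointer being followed goes wrong for any reason, the loop reads memory it was never meant to touch, which is consistent with s being unreadable in the backtrace.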
If it helps anyone fiddling with this, here's how you can get to interact with it:

1. Grab https://kojipkgs.fedoraproject.org//work/tasks/6436/25176436/OpenColorIO-1.1.0-3.fc28.src.rpm
2. Try to build it in a fedora-rawhide-i386 mock: 'mock -r fedora-rawhide-i386 --rebuild OpenColorIO-1.1.0-3.fc28.src.rpm', it should fail
3. Shell into the mock: 'mock -r fedora-rawhide-i386 --shell' (or you may want to use 'mock -r fedora-rawhide-i386 --enable-network --shell' so you can use the network in the mock shell, e.g. to install debuginfo packages)
4. cd /builddir/build/BUILD/OpenColorIO-1.1.0/build/
5. pdftex -ini -jobname=pdflatex -progname=pdflatex -translate-file=cp227.tcx *pdflatex.ini

That last command should trigger the crash each time you run it. You can use 'mock -r fedora-rawhide-i386 --install (package)' from outside the mock to install packages; if you do 'mock -r fedora-rawhide-i386 --install dnf' then shell into the mock with --enable-network you can use dnf from within the mock to install debuginfo packages and stuff.
So it looks from the stack trace like s is bogus, but that is just tracking the generated strings in poolfilearr which is all just static data generated when the C file is created and I can't see any obvious bug. The valgrind trace is much the same as the crash:

==19== Invalid read of size 1
==19==    at 0x17E88D: loadpoolstrings (pdftex-pool.c:1834)
==19==    by 0x1181DF: getstringsstarted (pdftexini.c:649)
==19==    by 0x125D96: mainbody (pdftexini.c:5301)
==19==    by 0x1109F2: main (texmfmp.c:1013)
==19==  Address 0x271000 is not stack'd, malloc'd or (recently) free'd

So far I'm leaning towards a compiler bug...
Reduced testcase for -m32 -O2:

long poolptr;
unsigned char *strpool;
static const char *poolfilearr[] = {
  "mu",
#define A "",
#define B A A A A A A A A A A
#define C B B B B B B B B B B
#define D C C C C C C C C C C
  D C C C C C C C B B B A
  ((void *)0)
};

__attribute__((noipa)) long
makestring (void)
{
  return 0;
}

__attribute__((noipa)) int
loadpoolstrings (long spare_size)
{
  const char *s;
  long g = 0;
  int i = 0, j = 0;
  while ((s = poolfilearr[j++]))
    {
      int l = __builtin_strlen (s);
      i += l;
      if (i >= spare_size) return 0;
      while (l-- > 0) strpool[poolptr++] = *s++;
      g = makestring ();
    }
  return g;
}

int
main ()
{
  poolptr = 0;
  strpool = __builtin_malloc (1000);
  asm volatile ("" : : : "memory");
  volatile int r = loadpoolstrings (1000);
  __builtin_free (strpool);
  return 0;
}

Looking into this.
Should be possible to do a bootstrapped OpenColorIO build (without docs or whatever it uses pdftex for). I'll look into that.
bootstrapped OpenColorIO is underway (looks promising, several archs completed already), opened bug #1547112 to follow this one so that docs can be re-enabled someday.
*** Bug 1546913 has been marked as a duplicate of this bug. ***
It looks like gcc-8.0.1-0.15 contains the fix, so once it's done building, pdftex can be fixed by rebuilding. Big thanks to Jakub, Adam and everyone else involved!
The fun part is that we ship pdftex out of texlive, so we'll get to rebuild the whole of texlive...which is always fun! Especially with a new GCC. It looks like dtardon did manage to build it on 2018-02-15, though, so maybe it'll be OK.
gcc-8.0.1-0.15.fc28 has finished building. Who is going to rebuild pdftex?
I've got it.

Here's a fun note: I think texlive's own test suite actually caught this bug, in this build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=24899896

At least, two pdftex tests failed on i686 in that build:

FAIL: pdftexdir/wprob.test
FAIL: pdftexdir/pdfimage.test

along with several others...then dtardon just turned off the failing tests and built it again :/ I understand why, though - he wanted to get it rebuilt for a poppler soname bump. Just very unfortunate that the poppler soname bump and the GCC 8 update coincided like that :(

I am going to rebuild it with the failing tests still turned off first, then fire a build with the tests turned back *on* and see how many still fail; perhaps we have other issues with GCC to fix here.
This should be fixed in gcc-8.0.1-0.15.fc{28,29}, texlive needs to be rebuilt.
It already is rebuilt. Well, for fc29. The fc28 build ran but failed due to a koji issue; puiterwijk has fixed that and I've re-fired it.
BTW, to sum up my thoughts: IMHO, if we are going to disable tests, we should use the same mechanism that gpgme uses [1] (%bcond_without check) and disable the whole check phase. GCC 8 has 'make check || :', and 90% of the build time is spent in make test, so by disabling make test entirely we could save a lot of hours.

[1] https://src.fedoraproject.org/rpms/gpgme/blob/master/f/gpgme.spec
While GCC uses make check || :, I have scripts that record the test results from the build.log files and compare them regularly, and know what FAILs are blockers and what aren't that important. Especially for the compiler it is a very bad idea to skip the tests.
F28 and F29 builds are done now.
(In reply to Jakub Jelinek from comment #18)
> While GCC uses make check || :, I have scripts that record the test results
> from the build.log files and compare them regularly, and know what FAILs are
> blockers and what aren't that important. Especially for the compiler it is
> a very bad idea to skip the tests.

I'm just asking you to be reasonable: if we are going to ignore the tests, we may as well disable them entirely and save about 20 hours of build time (or at least a huge amount of time). I'm not asking to disable tests forever, just when we need to save time. Note that gpgme has its tests enabled, but we can disable them (when we have a soname bump emergency).

Best regards,