Description of problem:
New crashes in pdflatex while building documentation for certain packages

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. fedpkg co python-geomdl
2. cd python-geomdl
3. fedpkg mockbuild --enablerepo=local

Actual results:
> […]
> /usr/include/c++/12/bits/stl_vector.h:1141: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](size_type) const [with _Tp = std::pair<std::__cxx11::basic_string<char>, Object>; _Alloc = std::allocator<std::pair<std::__cxx11::basic_string<char>, Object> >; const_reference = const std::pair<std::__cxx11::basic_string<char>, Object>&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
> Latexmk: Index file 'NURBS-Python.idx' was written
> Collected error summary (may duplicate other messages):
>   pdflatex: Command for 'pdflatex' gave return code 0.5234375
>   Refer to 'NURBS-Python.log' for details
> […]

Expected results:
pdflatex does not crash!

Additional info:
Koschei for python-geomdl: https://koschei.fedoraproject.org/package/python-geomdl?
Build log for python-geomdl: https://kojipkgs.fedoraproject.org/work/tasks/6843/81426843/build.log

This also affects some, but not all, other packages that use LaTeX to build PDF documentation: at least cairomm/cairomm1.16, but not e.g. python-ncclient or libglademm24.
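A note on the odd-looking "return code 0.5234375" in the log above. A minimal sketch, assuming latexmk printed the raw wait status divided by 256 (that is my reading of the number, not something the log states). The function names are made up for illustration, and the bit layout is the usual Linux one:

```cpp
// Assumption: the reported value is (raw wait status) / 256, so multiply back.
constexpr int raw_wait_status(double reported) {
    return static_cast<int>(reported * 256.0);
}

// Usual Linux wait-status layout for a signalled process:
// low 7 bits hold the terminating signal, bit 7 is the "core dumped" flag.
constexpr int terminating_signal(int status) { return status & 0x7f; }
constexpr bool core_dumped(int status) { return (status & 0x80) != 0; }
```

Under that assumption, raw_wait_status(0.5234375) is 134 and terminating_signal(134) is 6, which is SIGABRT on Linux. That would be consistent with the failed libstdc++ assertion ending in abort().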
*** Bug 2042542 has been marked as a duplicate of this bug. ***
Well, dang:

$ gdb /usr/bin/pdflatex
[snip]
../../gdb/objfiles.h:510: internal-error: sect_index_data not initialized
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
I reported the gdb problem as bug 2042664.
lldb comes to the rescue! It throws lots of warnings about various bits of debuginfo that it can't understand, but it does give us a backtrace:

(lldb) bt
* thread #1, name = 'pdflatex', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff76f564c libc.so.6`__pthread_kill_implementation(threadid=<unavailable>, signo=6, no_tid=<unavailable>) at pthread_kill.c:44:76
    frame #1: 0x00007ffff76a8656 libc.so.6`__GI_raise(sig=6) at raise.c:26:13
    frame #2: 0x00007ffff7692833 libc.so.6`abort at abort.c:79:7
    frame #3: 0x00007ffff7a3cde5 libstdc++.so.6`std::__glibcxx_assert_fail(char const*, int, char const*, char const*) + 53
    frame #4: 0x00005555556018f6 pdflatex`write_epdf + 3318
    frame #5: 0x00005555555b83ef pdflatex`zpdfwriteimage + 1247
    frame #6: 0x00005555555c372d pdflatex`zpdfshipout + 7005
    frame #7: 0x00005555555de24b pdflatex`maincontrol + 987
    frame #8: 0x0000555555569d92 pdflatex`main + 4978
    frame #9: 0x00007ffff76935d0 libc.so.6`.annobin_libc_start.c at libc_start_call_main.h:58:16
    frame #10: 0x00007ffff7693680 libc.so.6`__libc_start_main@@GLIBC_2.34 at libc-start.c:392:3
    frame #11: 0x000055555556b905 pdflatex`_start + 37

$ objdump -C -l --disassemble=write_epdf /usr/bin/pdflatex

00000000000acc00 <write_epdf>:

[So we want 0xacc00 + 0xcf6 (3318) == 0xad8f6, which jibes with the value of rip shown in lldb (0x5555556018f6). Here is the relevant portion of the disassembly.]

write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971
   ad85b:  85 c9                   test   %ecx,%ecx
   ad85d:  0f 8e 1b 09 00 00       jle    ae17e <write_epdf+0x157e>
   ad863:  8d 71 ff                lea    -0x1(%rcx),%esi
   ad866:  45 31 e4                xor    %r12d,%r12d
   ad869:  48 89 b5 f0 f3 ff ff    mov    %rsi,-0xc10(%rbp)
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::size() const:
/usr/include/c++/12/bits/stl_vector.h:987
   ad870:  48 bf ab aa aa aa aa    movabs $0xaaaaaaaaaaaaaaab,%rdi
   ad877:  aa aa aa
   ad87a:  48 29 d0                sub    %rdx,%rax
   ad87d:  48 c1 f8 04             sar    $0x4,%rax
   ad881:  48 0f af c7             imul   %rdi,%rax
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::operator[](unsigned long) const:
/usr/include/c++/12/bits/stl_vector.h:1141
   ad885:  49 39 c4                cmp    %rax,%r12
   ad888:  73 4d                   jae    ad8d7 <write_epdf+0xcd7>
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_data() const:
/usr/include/c++/12/bits/basic_string.h:235
   ad88a:  4b 8d 04 64             lea    (%r12,%r12,2),%rax
Object::dictRemove(char const*):
/usr/include/poppler/Object.h:596
   ad88e:  8b 8d 10 f4 ff ff       mov    -0xbf0(%rbp),%ecx
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_data() const:
/usr/include/c++/12/bits/basic_string.h:235
   ad894:  48 c1 e0 04             shl    $0x4,%rax
   ad898:  48 8b 34 02             mov    (%rdx,%rax,1),%rsi
Object::dictRemove(char const*):
/usr/include/poppler/Object.h:596
   ad89c:  83 f9 07                cmp    $0x7,%ecx
   ad89f:  0f 85 30 08 00 00       jne    ae0d5 <write_epdf+0x14d5>
/usr/include/poppler/Object.h:597
   ad8a5:  48 8b bd 18 f4 ff ff    mov    -0xbe8(%rbp),%rdi
   ad8ac:  48 8d 9d 50 f4 ff ff    lea    -0xbb0(%rbp),%rbx
   ad8b3:  e8 98 65 f6 ff          call   13e50 <Dict::remove(char const*)@plt>
write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971 (discriminator 2)
   ad8b8:  49 8d 4c 24 01          lea    0x1(%r12),%rcx
   ad8bd:  4c 39 a5 f0 f3 ff ff    cmp    %r12,-0xc10(%rbp)
   ad8c4:  0f 84 b4 08 00 00       je     ae17e <write_epdf+0x157e>
   ad8ca:  49 8b 55 08             mov    0x8(%r13),%rdx
   ad8ce:  49 8b 45 10             mov    0x10(%r13),%rax
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971
   ad8d2:  49 89 cc                mov    %rcx,%r12
   ad8d5:  eb 99                   jmp    ad870 <write_epdf+0xc70>
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::operator[](unsigned long) const:
/usr/include/c++/12/bits/stl_vector.h:1141
   ad8d7:  48 8d 0d 25 7e 01 00    lea    0x17e25(%rip),%rcx        # c5703 <__PRETTY_FUNCTION__.15+0xfe3>
   ad8de:  48 8d 15 3b 77 01 00    lea    0x1773b(%rip),%rdx        # c5020 <__PRETTY_FUNCTION__.15+0x900>
   ad8e5:  be 75 04 00 00          mov    $0x475,%esi
   ad8ea:  48 8d 3d 8f 78 01 00    lea    0x1788f(%rip),%rdi        # c5180 <__PRETTY_FUNCTION__.15+0xa60>
   ad8f1:  e8 3a 65 f6 ff          call   13e30 <std::__glibcxx_assert_fail(char const*, int, char const*, char const*)@plt>
write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:990
   ad8f6:  48 8d 3d 33 88 01 00    lea    0x18833(%rip),%rdi        # c6130 <gpower+0x4d0>
   ad8fd:  e8 0e 31 fe ff          call   90a10 <pdftex_warn>
Page::getContents():
/usr/include/poppler/Page.h:195
   ad902:  48 8b bd e8 f3 ff ff    mov    -0xc18(%rbp),%rdi
   ad909:  48 8d 85 10 f5 ff ff    lea    -0xaf0(%rbp),%rax
   ad910:  31 c9                   xor    %ecx,%ecx
   ad912:  48 8d 9d 50 f4 ff ff    lea    -0xbb0(%rbp),%rbx
   ad919:  49 89 c6                mov    %rax,%r14
   ad91c:  48 8b 57 08             mov    0x8(%rdi),%rdx
   ad920:  48 8d 77 50             lea    0x50(%rdi),%rsi
   ad924:  48 89 c7                mov    %rax,%rdi
   ad927:  e8 44 64 f6 ff          call   13d70 <Object::fetch(XRef*, int) const@plt>

0xad8f6 is actually the address immediately after the instruction where the assertion fails (0xad8f1). Now we need to translate this back into a source line. The last one mentioned is line 971 of pdftoepdf.cc, but we may want the near vicinity rather than that exact line.

Sadly, lldb's problems reading debuginfo mean that it knows nothing whatsoever about local variables in this function, so I can't ask it to show me the contents of dic1 or dic2, for example. But it seems clear that some operation on a Dict is involved, and Dict.h from poppler shows that there is indeed a vector (named "entries") inside a Dict.

Reading optimized assembly isn't easy, but it appears to me that execution has gone past the for loop at lines 971-973 of pdftoepdf.cc. You can see the call to dictRemove at 0xad8b3, and what looks like the jump back to the top of the loop at 0xad8d5. That means that we should be on line 974, where the variable l gets dic1->getLength(). But I see no call to std::vector::size() like we see at 0xad869 before the first loop. We go immediately into a vector element fetch, which would be the dic1->getKey(i) call on line 976. But that means we are using the size of dic2, not dic1, to control this loop. If dic2 happens to be smaller than dic1, that's going to produce exactly the result we see.

Or wait, there are calls to std::vector::size() at both 0xad834 and 0xad869. Did the second assignment to l get hoisted before the for loop?

Argh, I need gdb to work! Any insights you smart people have to offer are much appreciated.
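For anyone who wants to double-check the arithmetic in the comment above, here is a small sketch that redoes it. All names are made up for illustration. The 48-byte element stride is inferred from the sar/imul and lea/shl pairs in the listing, and the load base is simply rip minus the file offset; neither number is stated anywhere in the thread.

```cpp
#include <cstdint>

// Values copied from the backtrace and objdump output.
constexpr std::uint64_t kWriteEpdfStart = 0xacc00;        // <write_epdf>
constexpr std::uint64_t kFrameOffset    = 3318;           // "write_epdf + 3318"
constexpr std::uint64_t kRuntimeRip     = 0x5555556018f6; // frame #4 in lldb

// File offset of the return address inside pdflatex.
constexpr std::uint64_t return_address() {
    return kWriteEpdfStart + kFrameOffset;
}

// Inferred load base of the executable: runtime address minus file offset.
constexpr std::uint64_t load_base() {
    return kRuntimeRip - return_address();
}

// size() as compiled: (end - begin) >> 4, then multiply by 0xaaaaaaaaaaaaaaab.
// That constant is the inverse of 3 modulo 2^64, so for byte distances that
// are exact multiples of 48 the result equals (end - begin) / 48.
constexpr std::uint64_t kInverseOf3 = 0xaaaaaaaaaaaaaaabULL;

constexpr std::uint64_t size_as_compiled(std::uint64_t begin, std::uint64_t end) {
    return ((end - begin) >> 4) * kInverseOf3;
}

// Element address computation: lea (%r12,%r12,2) gives 3*i, shl $4 gives 48*i.
constexpr std::uint64_t element_offset(std::uint64_t i) {
    return (i + i * 2) << 4;
}

// cmp %rax,%r12 ; jae: the assertion path is taken when index >= size.
constexpr bool trips_assertion(std::uint64_t index, std::uint64_t size) {
    return index >= size;
}

// mov $0x475,%esi is the line number handed to __glibcxx_assert_fail.
constexpr int kAssertLine = 0x475;
```

With these, return_address() is 0xad8f6, load_base() comes out as 0x555555554000, and kAssertLine is 1141, which matches the stl_vector.h:1141 in the assertion message. The bounds check at 0xad885 compares the loop index in %r12 against the size computed from the begin/end pointers held in %rdx and %rax at that moment.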
I think I'd try building a non-optimized pdflatex, or at least for that one file. If the same std::vector assertion happens it suggests a real bug in pdflatex, not a gcc optimization bug. I'm trying to reproduce it in mock now, then I'll try building a custom pdflatex.
Thanks to everyone for the debugging help. I went back and looked at this code in pdftoepdf.cc, and there was a particularly prescient comment:

/*
This part is only a single line
    groupDict = Object(page->getGroup());
in the original patch. In this case, however, pdftex crashes at
"delete pdf_doc->doc" in "delete_document()" for inclusion of some
kind of pdf images, for example, figure_missing.pdf in gnuplot.
A change
    groupDict = Object(page->getGroup()).copy();
does not improve the situation.
The changes below seem to work fine.
*/

... the changes below obviously no longer work fine, so I figured I would start by changing the workaround code back to what should obviously work, along with a sanity check:

+    if (page->getGroup() != NULL) {
+        groupDict = Object(page->getGroup());
+    } else {
+        pdftex_fail("PDF inclusion: getGroup failed");
+    }

In local testing, this resolves the issue... well, except for python-geomdl, which seems to have other problems unrelated to this pdflatex crash. I tested why3, cairomm1.16, and gnuplot, all of which work correctly with my change.

I am pushing this change into rawhide now (-47), and hopefully this resolves the issue. If not, I'm sure I will hear about it. :D
Good find! Thanks for figuring it out.
Closing as fixed in rawhide. If not, please reopen.
Thanks! This is working for me.