Bug 2042421 - pdflatex crashes — GCC 12 regression?
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: texlive
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Tom "spot" Callaway
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 2042542
Depends On:
Blocks: 2042425 2042426 2042427
Reported: 2022-01-19 13:52 UTC by Ben Beasley
Modified: 2022-02-01 21:28 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-01-21 20:28:29 UTC
Type: Bug
Embargoed:


Description Ben Beasley 2022-01-19 13:52:43 UTC
Description of problem:

New crashes in pdflatex while building documentation for certain packages

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. fedpkg co python-geomdl
2. cd python-geomdl
3. fedpkg mockbuild --enablerepo=local

Actual results:

> […]
> /usr/include/c++/12/bits/stl_vector.h:1141: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](size_type) const [with _Tp = std::pair<std::__cxx11::basic_string<char>, Object>; _Alloc = std::allocator<std::pair<std::__cxx11::basic_string<char>, Object> >; const_reference = const std::pair<std::__cxx11::basic_string<char>, Object>&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
> Latexmk: Index file 'NURBS-Python.idx' was written
> Collected error summary (may duplicate other messages):
>   pdflatex: Command for 'pdflatex' gave return code 0.5234375
>       Refer to 'NURBS-Python.log' for details
> […]
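
The fractional return code in the log is consistent with an abort. A minimal sketch, assuming that the reported value is the raw wait status divided by 256 and that the status uses the conventional Linux layout (low 7 bits for the terminating signal, bit 0x80 for the core-dump flag); the helper names are hypothetical and not part of latexmk or pdflatex:

```cpp
// Assumption: the reported "return code" is the raw wait status divided by 256.
constexpr int raw_status(double reported) {
    return static_cast<int>(reported * 256.0);
}

// Assumption: conventional Linux wait-status layout.
constexpr int terminating_signal(int status) { return status & 0x7f; }
constexpr bool core_dumped(int status) { return (status & 0x80) != 0; }

// 0.5234375 is exactly 134/256, so the conversion loses nothing.
static_assert(raw_status(0.5234375) == 134, "0.5234375 * 256 == 134");
static_assert(terminating_signal(134) == 6, "signal number 6");
static_assert(core_dumped(134), "core-dump bit is set");
```

Under those assumptions, 0.5234375 decodes to signal 6, which matches the SIGABRT (signo=6) shown in the lldb backtrace in comment 4.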


Expected results:

pdflatex does not crash!

Additional info:

Koschei for python-geomdl: https://koschei.fedoraproject.org/package/python-geomdl?
Build log for python-geomdl: https://kojipkgs.fedoraproject.org/work/tasks/6843/81426843/build.log

This is affecting some but not all other packages that use LaTeX to build PDF documentation, too: at least cairomm/cairomm1.16, but not e.g. python-ncclient or libglademm24.

Comment 1 Tom "spot" Callaway 2022-01-19 20:13:28 UTC
*** Bug 2042542 has been marked as a duplicate of this bug. ***

Comment 2 Jerry James 2022-01-19 21:18:53 UTC
Well, dang:

$ gdb /usr/bin/pdflatex
[snip]
../../gdb/objfiles.h:510: internal-error: sect_index_data not initialized
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

Comment 3 Jerry James 2022-01-19 21:26:04 UTC
I reported the gdb problem as bug 2042664.

Comment 4 Jerry James 2022-01-19 22:45:35 UTC
lldb comes to the rescue!  It throws lots of warnings about various bits of debuginfo that it can't understand, but it does give us a backtrace:

(lldb) bt
* thread #1, name = 'pdflatex', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff76f564c libc.so.6`__pthread_kill_implementation(threadid=<unavailable>, signo=6, no_tid=<unavailable>) at pthread_kill.c:44:76
    frame #1: 0x00007ffff76a8656 libc.so.6`__GI_raise(sig=6) at raise.c:26:13
    frame #2: 0x00007ffff7692833 libc.so.6`abort at abort.c:79:7
    frame #3: 0x00007ffff7a3cde5 libstdc++.so.6`std::__glibcxx_assert_fail(char const*, int, char const*, char const*) + 53
    frame #4: 0x00005555556018f6 pdflatex`write_epdf + 3318
    frame #5: 0x00005555555b83ef pdflatex`zpdfwriteimage + 1247
    frame #6: 0x00005555555c372d pdflatex`zpdfshipout + 7005
    frame #7: 0x00005555555de24b pdflatex`maincontrol + 987
    frame #8: 0x0000555555569d92 pdflatex`main + 4978
    frame #9: 0x00007ffff76935d0 libc.so.6`.annobin_libc_start.c at libc_start_call_main.h:58:16
    frame #10: 0x00007ffff7693680 libc.so.6`__libc_start_main@@GLIBC_2.34 at libc-start.c:392:3
    frame #11: 0x000055555556b905 pdflatex`_start + 37

$ objdump -C -l --disassemble=write_epdf /usr/bin/pdflatex
00000000000acc00 <write_epdf>:

[So we want 0xacc00 + 0xcf6 (3318) == 0xad8f6, which jibes with the value of rip shown in lldb (0x5555556018f6).  Here is the relevant portion of the disassembly.]
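
The address arithmetic can be checked mechanically. A small sketch using only the numbers quoted in this comment; the function names are hypothetical, and reading the difference as the load address of a position-independent executable is an assumption:

```cpp
#include <cstdint>

constexpr std::uint64_t kSymbolStart = 0xacc00;         // write_epdf, per objdump
constexpr std::uint64_t kFrameOffset = 3318;            // "write_epdf + 3318" in frame #4
constexpr std::uint64_t kRuntimePc   = 0x5555556018f6;  // frame #4 address in lldb

// Address inside the on-disk binary: symbol start plus offset into the symbol.
constexpr std::uint64_t file_address(std::uint64_t symbol_start, std::uint64_t offset) {
    return symbol_start + offset;
}

// Difference between where the code runs and where it sits in the file.
constexpr std::uint64_t load_bias(std::uint64_t runtime_pc, std::uint64_t file_addr) {
    return runtime_pc - file_addr;
}

static_assert(kFrameOffset == 0xcf6, "3318 decimal is 0xcf6");
static_assert(file_address(kSymbolStart, kFrameOffset) == 0xad8f6, "matches the disassembly");
static_assert(load_bias(kRuntimePc, 0xad8f6) == 0x555555554000, "page-aligned bias");
```

The bias comes out page-aligned, which is what one would expect if the runtime address and the file address refer to the same instruction.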

write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971
   ad85b:	85 c9                	test   %ecx,%ecx
   ad85d:	0f 8e 1b 09 00 00    	jle    ae17e <write_epdf+0x157e>
   ad863:	8d 71 ff             	lea    -0x1(%rcx),%esi
   ad866:	45 31 e4             	xor    %r12d,%r12d
   ad869:	48 89 b5 f0 f3 ff ff 	mov    %rsi,-0xc10(%rbp)
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::size() const:
/usr/include/c++/12/bits/stl_vector.h:987
   ad870:	48 bf ab aa aa aa aa 	movabs $0xaaaaaaaaaaaaaaab,%rdi
   ad877:	aa aa aa 
   ad87a:	48 29 d0             	sub    %rdx,%rax
   ad87d:	48 c1 f8 04          	sar    $0x4,%rax
   ad881:	48 0f af c7          	imul   %rdi,%rax
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::operator[](unsigned long) const:
/usr/include/c++/12/bits/stl_vector.h:1141
   ad885:	49 39 c4             	cmp    %rax,%r12
   ad888:	73 4d                	jae    ad8d7 <write_epdf+0xcd7>
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_data() const:
/usr/include/c++/12/bits/basic_string.h:235
   ad88a:	4b 8d 04 64          	lea    (%r12,%r12,2),%rax
Object::dictRemove(char const*):
/usr/include/poppler/Object.h:596
   ad88e:	8b 8d 10 f4 ff ff    	mov    -0xbf0(%rbp),%ecx
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_data() const:
/usr/include/c++/12/bits/basic_string.h:235
   ad894:	48 c1 e0 04          	shl    $0x4,%rax
   ad898:	48 8b 34 02          	mov    (%rdx,%rax,1),%rsi
Object::dictRemove(char const*):
/usr/include/poppler/Object.h:596
   ad89c:	83 f9 07             	cmp    $0x7,%ecx
   ad89f:	0f 85 30 08 00 00    	jne    ae0d5 <write_epdf+0x14d5>
/usr/include/poppler/Object.h:597
   ad8a5:	48 8b bd 18 f4 ff ff 	mov    -0xbe8(%rbp),%rdi
   ad8ac:	48 8d 9d 50 f4 ff ff 	lea    -0xbb0(%rbp),%rbx
   ad8b3:	e8 98 65 f6 ff       	call   13e50 <Dict::remove(char const*)@plt>
write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971 (discriminator 2)
   ad8b8:	49 8d 4c 24 01       	lea    0x1(%r12),%rcx
   ad8bd:	4c 39 a5 f0 f3 ff ff 	cmp    %r12,-0xc10(%rbp)
   ad8c4:	0f 84 b4 08 00 00    	je     ae17e <write_epdf+0x157e>
   ad8ca:	49 8b 55 08          	mov    0x8(%r13),%rdx
   ad8ce:	49 8b 45 10          	mov    0x10(%r13),%rax
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:971
   ad8d2:	49 89 cc             	mov    %rcx,%r12
   ad8d5:	eb 99                	jmp    ad870 <write_epdf+0xc70>
std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Object> > >::operator[](unsigned long) const:
/usr/include/c++/12/bits/stl_vector.h:1141
   ad8d7:	48 8d 0d 25 7e 01 00 	lea    0x17e25(%rip),%rcx        # c5703 <__PRETTY_FUNCTION__.15+0xfe3>
   ad8de:	48 8d 15 3b 77 01 00 	lea    0x1773b(%rip),%rdx        # c5020 <__PRETTY_FUNCTION__.15+0x900>
   ad8e5:	be 75 04 00 00       	mov    $0x475,%esi
   ad8ea:	48 8d 3d 8f 78 01 00 	lea    0x1788f(%rip),%rdi        # c5180 <__PRETTY_FUNCTION__.15+0xa60>
   ad8f1:	e8 3a 65 f6 ff       	call   13e30 <std::__glibcxx_assert_fail(char const*, int, char const*, char const*)@plt>
write_epdf():
/usr/src/debug/texlive-base-20210325-46.fc36.x86_64/source/work/texk/web2c/../../../texk/web2c/pdftexdir/pdftoepdf.cc:990
   ad8f6:	48 8d 3d 33 88 01 00 	lea    0x18833(%rip),%rdi        # c6130 <gpower+0x4d0>
   ad8fd:	e8 0e 31 fe ff       	call   90a10 <pdftex_warn>
Page::getContents():
/usr/include/poppler/Page.h:195
   ad902:	48 8b bd e8 f3 ff ff 	mov    -0xc18(%rbp),%rdi
   ad909:	48 8d 85 10 f5 ff ff 	lea    -0xaf0(%rbp),%rax
   ad910:	31 c9                	xor    %ecx,%ecx
   ad912:	48 8d 9d 50 f4 ff ff 	lea    -0xbb0(%rbp),%rbx
   ad919:	49 89 c6             	mov    %rax,%r14
   ad91c:	48 8b 57 08          	mov    0x8(%rdi),%rdx
   ad920:	48 8d 77 50          	lea    0x50(%rdi),%rsi
   ad924:	48 89 c7             	mov    %rax,%rdi
   ad927:	e8 44 64 f6 ff       	call   13d70 <Object::fetch(XRef*, int) const@plt>

0xad8f6 is actually the address immediately after the instruction where the assertion fails (0xad8f1).  Now we need to translate this back into a source line.  The last one mentioned is line 971 of pdftoepdf.cc, but we may want the near vicinity rather than that exact line.  Sadly, lldb's problems reading debuginfo mean that it knows nothing whatsoever about local variables in this function, so I can't ask it to show me the contents of dic1 or dic2, for example.  But it seems clear that some operation on a Dict is involved, and Dict.h from poppler shows that there is indeed a vector (named "entries") inside a Dict.

Reading optimized assembly isn't easy, but it appears to me that execution has gone past the for loop at lines 971-973 of pdftoepdf.cc.  You can see the call to dictRemove at 0xad8b3, and what looks like the jump back to the top of the loop at 0xad8d5.  That means that we should be on line 974, where the variable l gets dic1->getLength().  But I see no call to std::vector::size() like we see at 0xad869 before the first loop.  We go immediately into a vector element fetch, which would be the dic1->getKey(i) call on line 976.  But that means we are using the size of dic2, not dic1, to control this loop.  If dic2 happens to be larger than dic1, that's going to produce exactly the result we see.
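
To make the suspected failure mode concrete: a loop whose bound comes from somewhere other than the vector being indexed goes out of range as soon as the bound exceeds that vector's size. The sketch below is not the pdftoepdf.cc code; walk_with_bound and self_check are hypothetical names, and at() is used so that the check happens regardless of build flags (libstdc++ checks operator[] only when _GLIBCXX_ASSERTIONS is defined, and a failed check there aborts, as in the backtrace above).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for "fetch key i for every i below some bound".
// Returns false if any index falls outside the vector.
inline bool walk_with_bound(const std::vector<std::string>& keys, std::size_t bound) {
    try {
        for (std::size_t i = 0; i < bound; ++i) {
            (void)keys.at(i);  // at() always range-checks and throws on failure
        }
    } catch (const std::out_of_range&) {
        return false;
    }
    return true;
}

inline void self_check() {
    const std::vector<std::string> keys = {"Type", "S"};  // two entries
    assert(walk_with_bound(keys, keys.size()));           // bound from the same vector
    assert(!walk_with_bound(keys, keys.size() + 1));      // stale or foreign bound
}
```

Taking the bound from the vector that is actually indexed, and re-reading it if the vector can change inside the loop, avoids this class of error.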

Or wait, there are calls to std::vector::size() at both 0xad834 and 0xad869.  Did the second assignment to l get hoisted before the for loop?  Argh, I need gdb to work!
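
One way to spot an inlined size() in the listing is the sub / sar / imul sequence starting at 0xad870. A sketch of that arithmetic, assuming a 48-byte element (which is what the index scaling at 0xad88a and 0xad894 suggests); the constant is taken from the disassembly, the function names are hypothetical:

```cpp
#include <cstdint>

// Constant loaded by the movabs at 0xad870.
constexpr std::uint64_t kMultiplier = 0xaaaaaaaaaaaaaaabULL;

// Multiplying by 3 wraps around to 1 modulo 2^64, so kMultiplier behaves
// like "1/3" for values that are exact multiples of 3.
static_assert(kMultiplier * 3 == 1, "modular inverse of 3");

// sub / sar $4 / imul: (end - begin) / 16, then "/ 3", i.e. / 48 overall.
// Unsigned shift is used here; the listing uses an arithmetic shift, which
// gives the same result for non-negative spans.
constexpr std::uint64_t element_count(std::uint64_t byte_span) {
    return (byte_span >> 4) * kMultiplier;
}

// lea (%r12,%r12,2) / shl $4: index * 3 * 16 == index * 48.
constexpr std::uint64_t element_offset(std::uint64_t index) {
    return (index + index * 2) << 4;
}

static_assert(element_count(48 * 5) == 5, "five 48-byte elements");
static_assert(element_offset(2) == 96, "third element starts at byte 96");
```

Any spot in the disassembly that repeats this shift-and-multiply pattern on a begin/end pointer pair is a candidate for another inlined size() call.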

Any insights you smart people have to offer are much appreciated.

Comment 5 Jonathan Wakely 2022-01-20 10:22:22 UTC
I think I'd try building a non-optimized pdflatex, or at least for that one file. If the same std::vector assertion happens it suggests a real bug in pdflatex, not a gcc optimization bug.

I'm trying to reproduce it in mock now, then I'll try building a custom pdflatex.

Comment 6 Tom "spot" Callaway 2022-01-20 20:15:13 UTC
Thanks to everyone for the debugging help. I went back and looked at this code in pdftoepdf.cc, and there was a particularly prescient comment:

/*
This part is only a single line
            groupDict = Object(page->getGroup());
in the original patch. In this case, however, pdftex crashes at
"delete pdf_doc->doc" in "delete_document()" for inclusion of some
kind of pdf images, for example, figure_missing.pdf in gnuplot.
A change
            groupDict = Object(page->getGroup()).copy();
does not improve the situation.
The changes below seem to work fine. 
*/

... the changes below obviously no longer work fine, so I figured I would start by changing the workaround code back to what should obviously work, along with a sanity check:

+            if (page->getGroup() != NULL) {
+                groupDict = Object(page->getGroup());
+            } else {
+                pdftex_fail("PDF inclusion: getGroup failed");
+            }

In local testing, this resolves the issue... well, except for python-geomdl, which seems to have other problems unrelated to this pdflatex crash. I tested why3, cairomm1.16, and gnuplot, all of which work correctly with my change. I am pushing this change into rawhide now (-47), and hopefully this resolves the issue. If not, I'm sure I will hear about it. :D

Comment 7 Jerry James 2022-01-20 20:26:36 UTC
Good find!  Thanks for figuring it out.

Comment 8 Tom "spot" Callaway 2022-01-21 20:28:29 UTC
Closing as fixed in rawhide. If not, please reopen.

Comment 9 Ben Beasley 2022-02-01 21:28:36 UTC
Thanks! This is working for me.
