Hide Forgot
Created attachment 1513647 [details] the xml file and backtrace Description of problem: As subject Version-Release number of selected component (if applicable): perl-XML-XPath-1.42-3.el8 expat-2.2.5-3.el8+7.x86_64 perl-interpreter-5.26.3-416.el8.x86_64 How reproducible: 100% Steps to Reproduce: 1. Run following cmd: # cat /tmp/a.xml | xpath -q -e '//*[text()="Regression"]/../../td/a/span/text()' free(): invalid next size (normal) [1] 20689 broken pipe cat /tmp/a.xml | 20690 abort (core dumped) xpath -q -e '//*[text()="Regression"]/../../td/a/span/text()' Actual results: As above Expected results: No core dump Additional info: the xml file and backtrace are in the attachment.
This is a known bug triggered by fetching the data from a standard input. See <https://bugzilla.redhat.com/show_bug.cgi?id=1473368#c15>. There is no known fix. A workaround is to pass the XML data from a file by a positional argument: $ xpath -q -e '//*[text()="Regression"]/../../td/a/span/text()' /tmp/a.xml
It seems there is a bug in parse_stream() in Expat.xs. The parse_stream() function allocates BUFSIZE-byte long output buffer. Then it reads a string using PerlIO's read() with a maximal string length tsiz=BUFSIZE characters into a temporary buffer. And then it retrieves length of the string in the temporary buffer in bytes and copies the strings from the temporary location to the output buffer: else { tbuff = newSV(0); tsiz = newSViv(BUFSIZE); buffsize = BUFSIZE; } while (! done) { char *buffer = XML_GetBuffer(parser, buffsize); [...] else { int cnt; SV * rdres; char * tb; PUSHMARK(SP); EXTEND(SP, 3); PUSHs(ioref); PUSHs(tbuff); PUSHs(tsiz); PUTBACK ; cnt = perl_call_method("read", G_SCALAR); SPAGAIN ; if (cnt != 1) croak("read method call failed"); rdres = POPs; if (! SvOK(rdres)) croak("read error"); tb = SvPV(tbuff, br); if (br > 0) → Copy(tb, buffer, br, char); else While it works for byte-stream file handles, when using UTF-8 handles, length in bytes can be greater than length in characters, thus the temporary buffer can contain more bytes than the size of the output buffer and we have a buffer overwrite. This corrupts memory, especially metadata for glibc memory management and subsequent free() abort with "free(): invalid next size (normal)". The input file has a non-ASCII character at 0x101C offset: 00001010 61 73 73 3d 22 6c 6f 67 69 6e 22 3e c2 a0 3c 61 |ass="login">..<a|
Created attachment 1514049 [details] Proposed fix I believe this patch fixes it.
How to test: (1) Run the command from the description. Preferably under valgrind. Before: Valgrind catches invalid write in parse_stream function in Expat.xs source file: $ valgrind -- perl /usr/bin/xpath -q -e '//*[text()="Regression"]/../../td/a/span/text()' <../a.xml ==3998== Memcheck, a memory error detector ==3998== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==3998== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info ==3998== Command: perl /usr/bin/xpath -q -e //*[text()="Regression"]/../../td/a/span/text() ==3998== ==3998== Invalid write of size 8 ==3998== at 0x4C38048: memmove (vg_replace_strmem.c:1271) ==3998== by 0xD5425D5: memcpy (string_fortified.h:34) ==3998== by 0xD5425D5: parse_stream (Expat.xs:390) ==3998== by 0xD542B9A: XS_XML__Parser__Expat_ParseStream (Expat.xs:1475) ==3998== by 0x4F26148: Perl_pp_entersub (in /usr/lib64/libperl.so.5.26.3) ==3998== by 0x4F1DF94: Perl_runops_standard (in /usr/lib64/libperl.so.5.26.3) ==3998== by 0x4E9DFAE: perl_run (in /usr/lib64/libperl.so.5.26.3) ==3998== by 0x108EA9: ??? (in /usr/bin/perl) ==3998== by 0x6059812: (below main) (in /usr/lib64/libc-2.28.so) [...] The glibc SIGABORT happens quite often but not always. After: $ valgrind -- perl -Iblib/{lib,arch} /usr/bin/xpath -q -e '//*[text()="Regression"]/../../td/a/span/text()' <../a.xml ==4000== Memcheck, a memory error detector ==4000== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==4000== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info ==4000== Command: perl -Iblib/lib -Iblib/arch /usr/bin/xpath -q -e //*[text()="Regression"]/../../td/a/span/text() ==4000== at_dt_iscsi_disk.positive_test.cold_plug.volume_disk.direct_source_mode at_dt_iscsi_disk.positive_test.cold_plug.volume_disk.ipv6_host_source_mode at_dt_iscsi_disk.positive_test.cold_plug.volume_disk.save_domain at_dt_iscsi_disk.positive_test.hot_plug.volume_disk.ipv4_host_source_mode at_dt_iscsi_disk.positive_test.hot_plug.volume_disk.snapshot_domain iscsi.iscsi_test.normal_test.device_disk.auth_uuid.auth_place_in_source iscsi.iscsi_test.normal_test.device_disk.iothread multidisks.coldplug.multi_disks_test.disks_boot_nfs_pool multipath.rawio_no.sgio_filtered.coldplug multipath.rawio_no.sgio_filtered.hotplug multipath.rawio_yes.sgio_filtered.coldplug multipath.rawio_yes.sgio_filtered.hotplug ==4000== ==4000== HEAP SUMMARY: ==4000== in use at exit: 14,088,289 bytes in 30,479 blocks ==4000== total heap usage: 233,042 allocs, 202,563 frees, 27,453,399 bytes allocated ==4000== ==4000== LEAK SUMMARY: ==4000== definitely lost: 60,103 bytes in 30 blocks ==4000== indirectly lost: 14,013,620 bytes in 30,409 blocks ==4000== possibly lost: 0 bytes in 0 blocks ==4000== still reachable: 14,566 bytes in 40 blocks ==4000== suppressed: 0 bytes in 0 blocks ==4000== Rerun with --leak-check=full to see details of leaked memory ==4000== ==4000== For counts of detected and suppressed errors, rerun with: -v ==4000== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I'm sorry, it's a wrong component. The bug and the fix belong to perl-XML-Parser (perl-XML-Parser-2.44-10.el8 is affected).