|Summary:||[abrt] httrack: abortf_(): httrack killed by SIGABRT|
|Product:||[Fedora] Fedora||Reporter:||A.J. Bonnema <gbonnema>|
|Component:||httrack||Assignee:||Christopher Meng <i>|
|Status:||CLOSED ERRATA||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Fixed In Version:||httrack-3.48.19-2.fc20||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2014-08-16 00:35:48 UTC||Type:||---|
Description A.J. Bonnema 2014-07-27 10:58:12 UTC
Description of problem:
I was downloading the site https://www.randstad.nl/vacatures/1562519/medewerker-data-services, which is a job opening. The command was "httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services", without options.

Version-Release number of selected component:
httrack-3.48.17-1.fc20

Additional info:
reporter: libreport-2.2.3
backtrace_rating: 4
cmdline: httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
crash_function: abortf_
executable: /usr/bin/httrack
kernel: 3.15.5-200.fc20.x86_64
runlevel: N 5
type: CCpp
uid: 1000

Truncated backtrace:
Thread no. 1 (10 frames)
 #2 abortf_ at htssafe.h:100
 #3 fil_normalized at htslib.c:3458
 #4 key_adrfil_hashes_generic at htshash.c:121
 #5 key_adrfil_hashes at htshash.c:203
 #6 coucal_calc_hashes at coucal.c:482
 #7 coucal_fetch_value at coucal.c:1212
 #8 coucal_read_value at coucal.c:1218
 #9 coucal_read at coucal.c:1171
 #10 hash_read at htshash.c:304
 #11 hts_acceptlink_ at htswizard.c:153
Comment 3 A.J. Bonnema 2014-07-27 10:58:16 UTC
Created attachment 921439 [details] File: core_backtrace
Comment 9 A.J. Bonnema 2014-07-27 10:58:23 UTC
Created attachment 921445 [details] File: proc_pid_status
Comment 10 A.J. Bonnema 2014-07-27 10:58:24 UTC
Created attachment 921446 [details] File: var_log_messages
Comment 11 Xavier Roche 2014-07-28 17:55:51 UTC
Can't reproduce the issue with the given URL - do you have the crash each time you attempt to crawl this URL ?
Comment 12 A.J. Bonnema 2014-07-28 19:13:19 UTC
I just reproduced it. I copy the output here:

[gbonnema@mahatma 2-site-httrack]$ httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
Mirror launched on Mon, 28 Jul 2014 21:08:17 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK
Aborted (core dumped)
[gbonnema@mahatma 2-site-httrack]$

I include a copy of the hts-log.txt:

HTTrack3.48-17 launched on Mon, 28 Jul 2014 21:08:17 at https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services )

Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
such as username/password authentication for websites mirrored in this project
do not share these files/folders if you want these information to remain private

21:08:18 Warning: Note: due to https://www.randstad.nl remote robots.txt rules, links beginning with these path will be forbidden: /vacatures?*, /werknemers/intern/, /klm/, /mwp2/faces/confidential/inschrijven.jspx, /ldo/, *.pdf, *.doc, *.docx, *.xls, *.xlsx, *.ppt, *.pptx, /content-snippets/, /system/, /admin/, /roxen-files/, /ifar/, /werknemers/duurzame-inzetbaarheid-secure.html, /unilever (see in the options to disable this)

Let me know if this is enough info. I am perfectly willing to do other tests.

Kind regards, Guus.
Comment 13 A.J. Bonnema 2014-07-28 19:14:37 UTC
Could it be caused by the "forbidden: /vacatures?*, ....." ? part on the last line?
Comment 14 A.J. Bonnema 2014-07-28 19:43:23 UTC
I am running fedora 20 (updated). The output from gdb as requested:

(gdb) set args https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(gdb) run
Starting program: /usr/bin/httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Mirror launched on Mon, 28 Jul 2014 21:38:56 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK

Program received signal SIGABRT, Aborted.
0x00007ffff6b2dc39 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.9-1.fc20.x86_64 krb5-libs-1.11.5-5.fc20.x86_64 libcom_err-1.42.8-3.fc20.x86_64 nss-mdns-0.10-13.fc20.x86_64 openssl-libs-1.0.1e-38.fc20.x86_64 zlib-1.2.8-3.fc20.x86_64
(gdb) up 3
#3 0x00007ffff7b83c18 in fil_normalized (source=source@entry=0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl", dest=0x7ffffffe729f "/intent/tweet") at htslib.c:3458
3458 assertf(strlen(copyBuff) == qLen);
(gdb) set print elements 8000
(gdb) p source
$1 = 0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl"
(gdb) p copyBuff
$2 = 0x67d510 "?text=Medewerker+Data+Services&url=https%3A%2F%2Fwww.randstad.nl%2Fvac&via=randstadnl"
(gdb) p amps
$3 = 0x7ffffffe72ff ""
(gdb) p amps
$4 = 0x7ffffffe72ac ""
(gdb) p amps
$5 = 0x7ffffffe731d ""
(gdb)

Hope this helps, kind regards, Guus.
Comment 15 A.J. Bonnema 2014-07-28 19:49:59 UTC
I still had the gdb session open, so np:

(gdb) p amps+1
$6 = 0x7ffffffe7300 "text=Medewerker+Data+Services"
(gdb) p amps+1
$7 = 0x7ffffffe72ad "url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services"
(gdb) p amps+1
$8 = 0x7ffffffe731e "via=randstadnl"
(gdb)

Regards, Guus.
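The gdb output suggests that fil_normalized rebuilds the query string with its arguments in sorted order (text, url, via) and then asserts that the rebuilt string is exactly as long as the original one. In the failing run the "url=" argument was cut short in copyBuff, so the lengths no longer matched. The following is only a minimal sketch of that idea, not HTTrack's code: the names sort_query_args, cmp_str and MAX_ARGS are hypothetical, and the real function handles more than this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 32 /* arbitrary limit chosen for this sketch */

/* qsort comparator for an array of char* (byte-wise string order). */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical sketch: sort the '&'-separated arguments of `query`
 * (the text after '?') and return them rejoined in a malloc'ed string.
 * Returns NULL on allocation failure or if there are too many arguments. */
static char *sort_query_args(const char *query)
{
    size_t q_len = strlen(query);
    char *work = malloc(q_len + 1);
    char *out = malloc(q_len + 1);
    char *args[MAX_ARGS];
    size_t n = 0, pos = 0, i;
    char *p;

    if (work == NULL || out == NULL) {
        free(work);
        free(out);
        return NULL;
    }
    memcpy(work, query, q_len + 1);

    /* Split in place: each '&' becomes a terminator. */
    args[n++] = work;
    for (p = work; *p != '\0'; p++) {
        if (*p == '&') {
            if (n == MAX_ARGS) {
                free(work);
                free(out);
                return NULL;
            }
            *p = '\0';
            args[n++] = p + 1;
        }
    }

    qsort(args, n, sizeof args[0], cmp_str);

    /* Rejoin with explicit lengths; no unbounded strncat involved. */
    for (i = 0; i < n; i++) {
        size_t len = strlen(args[i]);
        if (i > 0)
            out[pos++] = '&';
        memcpy(out + pos, args[i], len);
        pos += len;
    }
    out[pos] = '\0';

    /* Same kind of invariant as the failing assertf: reordering the
     * arguments must not change the total length. */
    assert(pos == q_len);

    free(work);
    return out;
}
```

Called on the query part of the URL from the gdb session, this sketch would return the three arguments in the order text, url, via, with the "url=" value intact; the caller is responsible for freeing the result.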
Comment 16 Xavier Roche 2014-07-28 19:54:27 UTC
Can you tell me what gcc version is in use ? It seems that the second string hasn't been copied correctly (!) and I'm still scratching my head to find out why :)
Comment 17 A.J. Bonnema 2014-07-28 19:57:03 UTC
[gbonnema@mahatma ~]$ gcc --version
gcc (GCC) 4.8.3 20140624 (Red Hat 4.8.3-1)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[gbonnema@mahatma ~]$
Comment 18 A.J. Bonnema 2014-07-28 20:00:23 UTC
TBH I do not know what gcc version was used to generate the httrack of fedora 20, because I probably just got the binary version of httrack, as fedora packages it.
Comment 19 Xavier Roche 2014-07-28 20:02:37 UTC
Ahah! I can reproduce the issue with this GCC release!!!
Comment 20 A.J. Bonnema 2014-07-28 20:05:08 UTC
Good show! I will log out in a minute, unless you have more tests to do?
Comment 21 Xavier Roche 2014-07-28 20:07:31 UTC
No, thank you - I should be able to understand a bit more what's going on from now. Thanks again for your precious help!
Comment 22 Xavier Roche 2014-07-28 20:40:34 UTC
Issue spotted and fixed inside src/htssafe.h. strncat(A, B, (size_t) -1) is NOT safe at all, and does not appear to behave like strcat(A, B), apparently because of the optimized strncat implementation in use.
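One way to avoid relying on a "no limit" count such as (size_t) -1 is to compute the remaining space explicitly and copy with a known length. The sketch below illustrates that approach only; it is not the patch attached to this bug, and the helper name append_bounded and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the actual htssafe.h fix: append `src` to the
 * NUL-terminated string in `dest`, whose total capacity is `dest_size`
 * bytes. Returns 0 on success, or -1 (leaving dest unchanged) if dest is
 * not terminated within dest_size or the result would not fit. */
static int append_bounded(char *dest, size_t dest_size, const char *src)
{
    const char *end;
    size_t dest_len, src_len;

    assert(dest != NULL && src != NULL);

    /* Locate the current terminator without reading past the buffer. */
    end = memchr(dest, '\0', dest_size);
    if (end == NULL)
        return -1;
    dest_len = (size_t)(end - dest);
    src_len = strlen(src);

    /* Need room for src_len characters plus the terminating NUL. */
    if (src_len >= dest_size - dest_len)
        return -1;

    /* Copy with an exact, known length (terminator included). */
    memcpy(dest + dest_len, src, src_len + 1);
    return 0;
}
```

With an 8-byte buffer holding "ab", appending "cde" succeeds, while a further append of "fgh" is rejected because the result plus terminator would need 9 bytes. The caller decides whether a rejected append should abort, as HTTrack's assertion macros do, or be handled some other way.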
Comment 23 Xavier Roche 2014-07-28 20:42:03 UTC
Created attachment 921928 [details] Patch to fix the issue
Comment 24 Xavier Roche 2014-07-28 21:12:27 UTC
Created attachment 921942 [details] Final patch for htssafe.h
Comment 25 Xavier Roche 2014-07-28 21:27:22 UTC
Fixed in 3.48.19 (http://mirror.httrack.com/historical/httrack-3.48.19.tar.gz)
Comment 26 Fedora Update System 2014-08-06 08:09:50 UTC
httrack-3.48.19-1.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/httrack-3.48.19-1.fc20
Comment 27 Fedora Update System 2014-08-07 15:29:44 UTC
Package httrack-3.48.19-1.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-1.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-9199/httrack-3.48.19-1.fc20
then log in and leave karma (feedback).
Comment 28 Fedora Update System 2014-08-09 07:32:59 UTC
Package httrack-3.48.19-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-2.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-9199/httrack-3.48.19-2.fc20
then log in and leave karma (feedback).
Comment 29 Fedora Update System 2014-08-16 00:35:48 UTC
httrack-3.48.19-2.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
Comment 30 Xavier Roche 2014-08-16 10:00:31 UTC
Probable upstream bug in GLIBC, reported as Bug 17279 (https://sourceware.org/bugzilla/show_bug.cgi?id=17279)