Bug 1123625

Summary: [abrt] httrack: abortf_(): httrack killed by SIGABRT
Product: [Fedora] Fedora
Reporter: A.J. Bonnema <gbonnema>
Component: httrack
Assignee: Christopher Meng <i>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 20
CC: i, roche
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Unspecified
URL: https://retrace.fedoraproject.org/faf/reports/bthash/0a8d4a1ae162aea41a9b52fde878d5eabbe89ee5
Whiteboard: abrt_hash:996aa7c24c6f79ae883ab12f2df7603f7e46f59b
Fixed In Version: httrack-3.48.19-2.fc20
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-08-16 00:35:48 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
File: backtrace
File: cgroup
File: core_backtrace
File: dso_list
File: environ
File: limits
File: maps
File: open_fds
File: proc_pid_status
File: var_log_messages
Patch to fix the issue
Final patch for htssafe.h

Description A.J. Bonnema 2014-07-27 10:58:12 UTC
Description of problem:
I was downloading the site https://www.randstad.nl/vacatures/1562519/medewerker-data-services
which is a job opening.

The command was "httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services", with no options.

Version-Release number of selected component:

Additional info:
reporter:       libreport-2.2.3
backtrace_rating: 4
cmdline:        httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
crash_function: abortf_
executable:     /usr/bin/httrack
kernel:         3.15.5-200.fc20.x86_64
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (10 frames)
 #2 abortf_ at htssafe.h:100
 #3 fil_normalized at htslib.c:3458
 #4 key_adrfil_hashes_generic at htshash.c:121
 #5 key_adrfil_hashes at htshash.c:203
 #6 coucal_calc_hashes at coucal.c:482
 #7 coucal_fetch_value at coucal.c:1212
 #8 coucal_read_value at coucal.c:1218
 #9 coucal_read at coucal.c:1171
 #10 hash_read at htshash.c:304
 #11 hts_acceptlink_ at htswizard.c:153

Comment 1 A.J. Bonnema 2014-07-27 10:58:14 UTC
Created attachment 921437 [details]
File: backtrace

Comment 2 A.J. Bonnema 2014-07-27 10:58:15 UTC
Created attachment 921438 [details]
File: cgroup

Comment 3 A.J. Bonnema 2014-07-27 10:58:16 UTC
Created attachment 921439 [details]
File: core_backtrace

Comment 4 A.J. Bonnema 2014-07-27 10:58:17 UTC
Created attachment 921440 [details]
File: dso_list

Comment 5 A.J. Bonnema 2014-07-27 10:58:19 UTC
Created attachment 921441 [details]
File: environ

Comment 6 A.J. Bonnema 2014-07-27 10:58:20 UTC
Created attachment 921442 [details]
File: limits

Comment 7 A.J. Bonnema 2014-07-27 10:58:21 UTC
Created attachment 921443 [details]
File: maps

Comment 8 A.J. Bonnema 2014-07-27 10:58:22 UTC
Created attachment 921444 [details]
File: open_fds

Comment 9 A.J. Bonnema 2014-07-27 10:58:23 UTC
Created attachment 921445 [details]
File: proc_pid_status

Comment 10 A.J. Bonnema 2014-07-27 10:58:24 UTC
Created attachment 921446 [details]
File: var_log_messages

Comment 11 Xavier Roche 2014-07-28 17:55:51 UTC
Can't reproduce the issue with the given URL - do you get the crash each time you attempt to crawl this URL?

Comment 12 A.J. Bonnema 2014-07-28 19:13:19 UTC
I just reproduced it. Here is a copy of the output:

[gbonnema@mahatma 2-site-httrack]$ httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
Mirror launched on Mon, 28 Jul 2014 21:08:17 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK
Aborted (core dumped)
[gbonnema@mahatma 2-site-httrack]$ 

I include a copy of the hts-log.txt:

HTTrack3.48-17 launched on Mon, 28 Jul 2014 21:08:17 at https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services )

Information, Warnings and Errors reported for this mirror:
note:   the hts-log.txt file, and hts-cache folder, may contain sensitive information,
    such as username/password authentication for websites mirrored in this project
    do not share these files/folders if you want these information to remain private

21:08:18    Warning:    Note: due to https://www.randstad.nl remote robots.txt rules, links beginning with these path will be forbidden: /vacatures?*, /werknemers/intern/, /klm/, /mwp2/faces/confidential/inschrijven.jspx, /ldo/, *.pdf, *.doc, *.docx, *.xls, *.xlsx, *.ppt, *.pptx, /content-snippets/, /system/, /admin/, /roxen-files/, /ifar/, /werknemers/duurzame-inzetbaarheid-secure.html, /unilever (see in the options to disable this)

Let me know if this is enough info. I am perfectly willing to do other tests.

Kind regards, Guus.

Comment 13 A.J. Bonnema 2014-07-28 19:14:37 UTC
Could it be caused by the "forbidden: /vacatures?*, ..." part on the last line?

Comment 14 A.J. Bonnema 2014-07-28 19:43:23 UTC
I am running Fedora 20 (fully updated).

The output from gdb as requested:

(gdb) set args https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(gdb) run
Starting program: /usr/bin/httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Mirror launched on Mon, 28 Jul 2014 21:38:56 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK

Program received signal SIGABRT, Aborted.
0x00007ffff6b2dc39 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.9-1.fc20.x86_64 krb5-libs-1.11.5-5.fc20.x86_64 libcom_err-1.42.8-3.fc20.x86_64 nss-mdns-0.10-13.fc20.x86_64 openssl-libs-1.0.1e-38.fc20.x86_64 zlib-1.2.8-3.fc20.x86_64
(gdb) up 3
#3  0x00007ffff7b83c18 in fil_normalized (source=source@entry=0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl", 
    dest=0x7ffffffe729f "/intent/tweet") at htslib.c:3458
3458	    assertf(strlen(copyBuff) == qLen);
(gdb) set print elements 8000
(gdb) p source
$1 = 0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl"
(gdb) p copyBuff
$2 = 0x67d510 "?text=Medewerker+Data+Services&url=https%3A%2F%2Fwww.randstad.nl%2Fvac&via=randstadnl"
(gdb) p amps[0]
$3 = 0x7ffffffe72ff ""
(gdb) p amps[1]
$4 = 0x7ffffffe72ac ""
(gdb) p amps[2]
$5 = 0x7ffffffe731d ""

Hope this helps, kind regards, Guus.

Comment 15 A.J. Bonnema 2014-07-28 19:49:59 UTC
I still had the gdb session open, so np:

(gdb) p amps[0]+1
$6 = 0x7ffffffe7300 "text=Medewerker+Data+Services"
(gdb) p amps[1]+1
$7 = 0x7ffffffe72ad "url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services"
(gdb) p amps[2]+1
$8 = 0x7ffffffe731e "via=randstadnl"

Regards, Guus.
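
For readers following the gdb output above: copyBuff appears to hold the same query arguments as source, sorted by name, and the failing assertion checks that the rebuilt string is exactly as long as the original query. The following is only a minimal sketch of that idea, not the actual fil_normalized code from htslib.c; the names sort_query_args and cmp_str are hypothetical. It splits a query string (without the leading '?') on '&', sorts the pieces, rejoins them, and asserts that the length is preserved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b) {
  return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical sketch (not HTTrack's code): write the '&'-separated
   arguments of query into out, sorted. out_size is the capacity of out.
   Returns 1 on success, 0 if out is too small or allocation fails. */
static int sort_query_args(const char *query, char *out, size_t out_size) {
  size_t q_len = strlen(query);
  size_t n = 1, i, pos = 0;
  char *copy;
  char **args;

  if (q_len + 1 > out_size)
    return 0;
  copy = malloc(q_len + 1);
  if (copy == NULL)
    return 0;
  memcpy(copy, query, q_len + 1);

  /* Count the arguments, then split the copy in place. */
  for (i = 0; i < q_len; i++)
    if (copy[i] == '&')
      n++;
  args = malloc(n * sizeof *args);
  if (args == NULL) {
    free(copy);
    return 0;
  }
  args[0] = copy;
  n = 1;
  for (i = 0; i < q_len; i++) {
    if (copy[i] == '&') {
      copy[i] = '\0';
      args[n++] = copy + i + 1;
    }
  }

  qsort(args, n, sizeof *args, cmp_str);

  /* Rejoin with explicit lengths, so no str*cat call is involved. */
  for (i = 0; i < n; i++) {
    size_t len = strlen(args[i]);
    if (i > 0)
      out[pos++] = '&';
    memcpy(out + pos, args[i], len);
    pos += len;
  }
  out[pos] = '\0';

  /* Sorting must not change the total length of the query. */
  assert(pos == q_len);

  free(args);
  free(copy);
  return 1;
}
```

In the session above, the second argument (url=...) was cut short in copyBuff, so a length check of this kind would fail in the same way the real assertion did.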

Comment 16 Xavier Roche 2014-07-28 19:54:27 UTC
Can you tell me which gcc version is in use? It seems that the second string hasn't been copied correctly (!) and I'm still scratching my head to find out why :)

Comment 17 A.J. Bonnema 2014-07-28 19:57:03 UTC
[gbonnema@mahatma ~]$ gcc --version
gcc (GCC) 4.8.3 20140624 (Red Hat 4.8.3-1)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO

[gbonnema@mahatma ~]$

Comment 18 A.J. Bonnema 2014-07-28 20:00:23 UTC
TBH I do not know which gcc version was used to build the httrack package for Fedora 20, because I probably just got the binary version of httrack, as Fedora packages it.

Comment 19 Xavier Roche 2014-07-28 20:02:37 UTC
Ahah! I can reproduce the issue with this GCC release!!!

Comment 20 A.J. Bonnema 2014-07-28 20:05:08 UTC
Good show! I will log out in a minute, unless you have more tests to do?

Comment 21 Xavier Roche 2014-07-28 20:07:31 UTC
No, thank you - I should be able to understand a bit more of what's going on from now on. Thanks again for your precious help!

Comment 22 Xavier Roche 2014-07-28 20:40:34 UTC
Issue spotted and fixed inside src/htssafe.h. strncat(A, B, (size_t) -1) is NOT safe at all, and does not appear to behave like strcat(A, B), because of the optimized implementation.
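
As an illustration of the kind of change involved, here is a minimal sketch of an append that never passes (size_t) -1 as a length. It is an assumption-laden example, not the patch attached to this bug: the helper name bounded_append is hypothetical, and it assumes the caller knows the real capacity of the destination buffer. The lengths are computed explicitly and the copy is done with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the htssafe.h code): append src to dest.
   dest_size is the full capacity of dest in bytes, including the
   terminating NUL. Returns 1 on success; returns 0 and leaves dest
   unchanged if dest is not terminated or the result would not fit. */
static int bounded_append(char *dest, size_t dest_size, const char *src) {
  const char *end = memchr(dest, '\0', dest_size);
  size_t dest_len, src_len;

  if (end == NULL)
    return 0;                       /* dest is not NUL-terminated */
  dest_len = (size_t) (end - dest);
  src_len = strlen(src);

  /* Room is needed for src_len bytes plus the terminating NUL. */
  if (src_len >= dest_size - dest_len)
    return 0;

  memcpy(dest + dest_len, src, src_len + 1);  /* copies the NUL too */
  return 1;
}
```

When the capacity really is unknown, calling plain strcat(A, B) expresses the intent directly, whereas a (size_t) -1 limit relies on strncat handling an effectively unbounded count correctly.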

Comment 23 Xavier Roche 2014-07-28 20:42:03 UTC
Created attachment 921928 [details]
Patch to fix the issue

Comment 24 Xavier Roche 2014-07-28 21:12:27 UTC
Created attachment 921942 [details]
Final patch for htssafe.h

Comment 25 Xavier Roche 2014-07-28 21:27:22 UTC
Fixed in 3.48.19 (http://mirror.httrack.com/historical/httrack-3.48.19.tar.gz)

Comment 26 Fedora Update System 2014-08-06 08:09:50 UTC
httrack-3.48.19-1.fc20 has been submitted as an update for Fedora 20.

Comment 27 Fedora Update System 2014-08-07 15:29:44 UTC
Package httrack-3.48.19-1.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-1.fc20'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).

Comment 28 Fedora Update System 2014-08-09 07:32:59 UTC
Package httrack-3.48.19-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-2.fc20'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).

Comment 29 Fedora Update System 2014-08-16 00:35:48 UTC
httrack-3.48.19-2.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 30 Xavier Roche 2014-08-16 10:00:31 UTC
Probable upstream bug in GLIBC, reported as Bug 17279 (https://sourceware.org/bugzilla/show_bug.cgi?id=17279)