Bug 1123625

Summary: [abrt] httrack: abortf_(): httrack killed by SIGABRT
Product: [Fedora] Fedora
Reporter: A.J. Bonnema <gbonnema>
Component: httrack
Assignee: Christopher Meng <i>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 20
CC: i, roche
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/0a8d4a1ae162aea41a9b52fde878d5eabbe89ee5
Whiteboard: abrt_hash:996aa7c24c6f79ae883ab12f2df7603f7e46f59b
Fixed In Version: httrack-3.48.19-2.fc20
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-08-15 20:35:48 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Description Flags
File: backtrace
File: cgroup
File: core_backtrace
File: dso_list
File: environ
File: limits
File: maps
File: open_fds
File: proc_pid_status
File: var_log_messages
Patch to fix the issue
Final patch for htssafe.h none

Description A.J. Bonnema 2014-07-27 06:58:12 EDT
Description of problem:
I was downloading the site https://www.randstad.nl/vacatures/1562519/medewerker-data-services
which is a job opening.

The command was "httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services" without options.

Version-Release number of selected component:

Additional info:
reporter:       libreport-2.2.3
backtrace_rating: 4
cmdline:        httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
crash_function: abortf_
executable:     /usr/bin/httrack
kernel:         3.15.5-200.fc20.x86_64
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (10 frames)
 #2 abortf_ at htssafe.h:100
 #3 fil_normalized at htslib.c:3458
 #4 key_adrfil_hashes_generic at htshash.c:121
 #5 key_adrfil_hashes at htshash.c:203
 #6 coucal_calc_hashes at coucal.c:482
 #7 coucal_fetch_value at coucal.c:1212
 #8 coucal_read_value at coucal.c:1218
 #9 coucal_read at coucal.c:1171
 #10 hash_read at htshash.c:304
 #11 hts_acceptlink_ at htswizard.c:153
Comment 1 A.J. Bonnema 2014-07-27 06:58:14 EDT
Created attachment 921437 [details]
File: backtrace
Comment 2 A.J. Bonnema 2014-07-27 06:58:15 EDT
Created attachment 921438 [details]
File: cgroup
Comment 3 A.J. Bonnema 2014-07-27 06:58:16 EDT
Created attachment 921439 [details]
File: core_backtrace
Comment 4 A.J. Bonnema 2014-07-27 06:58:17 EDT
Created attachment 921440 [details]
File: dso_list
Comment 5 A.J. Bonnema 2014-07-27 06:58:19 EDT
Created attachment 921441 [details]
File: environ
Comment 6 A.J. Bonnema 2014-07-27 06:58:20 EDT
Created attachment 921442 [details]
File: limits
Comment 7 A.J. Bonnema 2014-07-27 06:58:21 EDT
Created attachment 921443 [details]
File: maps
Comment 8 A.J. Bonnema 2014-07-27 06:58:22 EDT
Created attachment 921444 [details]
File: open_fds
Comment 9 A.J. Bonnema 2014-07-27 06:58:23 EDT
Created attachment 921445 [details]
File: proc_pid_status
Comment 10 A.J. Bonnema 2014-07-27 06:58:24 EDT
Created attachment 921446 [details]
File: var_log_messages
Comment 11 Xavier Roche 2014-07-28 13:55:51 EDT
Can't reproduce the issue with the given URL - do you have the crash each time you attempt to crawl this URL ?
Comment 12 A.J. Bonnema 2014-07-28 15:13:19 EDT
I just reproduced it. Copy the output here:

[gbonnema@mahatma 2-site-httrack]$ httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
Mirror launched on Mon, 28 Jul 2014 21:08:17 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK
Aborted (core dumped)
[gbonnema@mahatma 2-site-httrack]$ 

I include a copy of the hts-log.txt:

HTTrack3.48-17 launched on Mon, 28 Jul 2014 21:08:17 at https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services )

Information, Warnings and Errors reported for this mirror:
note:   the hts-log.txt file, and hts-cache folder, may contain sensitive information,
    such as username/password authentication for websites mirrored in this project
    do not share these files/folders if you want these information to remain private

21:08:18    Warning:    Note: due to https://www.randstad.nl remote robots.txt rules, links beginning with these path will be forbidden: /vacatures?*, /werknemers/intern/, /klm/, /mwp2/faces/confidential/inschrijven.jspx, /ldo/, *.pdf, *.doc, *.docx, *.xls, *.xlsx, *.ppt, *.pptx, /content-snippets/, /system/, /admin/, /roxen-files/, /ifar/, /werknemers/duurzame-inzetbaarheid-secure.html, /unilever (see in the options to disable this)

Let me know if this is enough info. I am perfectly willing to do other tests.

Kind regards, Guus.
Comment 13 A.J. Bonnema 2014-07-28 15:14:37 EDT
Could it be caused by the "forbidden: /vacatures?*, ..." part on the last line of the log?
Comment 14 A.J. Bonnema 2014-07-28 15:43:23 EDT
I am running fedora 20 (updated).

The output from gdb as requested:

(gdb) set args https://www.randstad.nl/vacatures/1562519/medewerker-data-services
(gdb) run
Starting program: /usr/bin/httrack https://www.randstad.nl/vacatures/1562519/medewerker-data-services
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Mirror launched on Mon, 28 Jul 2014 21:38:56 by HTTrack Website Copier/3.48-17 [XR&CO'2014]
mirroring https://www.randstad.nl/vacatures/1562519/medewerker-data-services with the wizard help..
strlen(copyBuff) == qLen failed at htslib.c:3458edewerker-data-services (22995 bytes) - OK

Program received signal SIGABRT, Aborted.
0x00007ffff6b2dc39 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.9-1.fc20.x86_64 krb5-libs-1.11.5-5.fc20.x86_64 libcom_err-1.42.8-3.fc20.x86_64 nss-mdns-0.10-13.fc20.x86_64 openssl-libs-1.0.1e-38.fc20.x86_64 zlib-1.2.8-3.fc20.x86_64
(gdb) up 3
#3  0x00007ffff7b83c18 in fil_normalized (source=source@entry=0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl", 
    dest=0x7ffffffe729f "/intent/tweet") at htslib.c:3458
3458	    assertf(strlen(copyBuff) == qLen);
(gdb) set print elements 8000
(gdb) p source
$1 = 0x7ffffffe00b0 "/intent/tweet?url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services&text=Medewerker+Data+Services&via=randstadnl"
(gdb) p copyBuff
$2 = 0x67d510 "?text=Medewerker+Data+Services&url=https%3A%2F%2Fwww.randstad.nl%2Fvac&via=randstadnl"
(gdb) p amps[0]
$3 = 0x7ffffffe72ff ""
(gdb) p amps[1]
$4 = 0x7ffffffe72ac ""
(gdb) p amps[2]
$5 = 0x7ffffffe731d ""

Hope this helps, kind regards, Guus.
Comment 15 A.J. Bonnema 2014-07-28 15:49:59 EDT
I still had the gdb session open, so np:

(gdb) p amps[0]+1
$6 = 0x7ffffffe7300 "text=Medewerker+Data+Services"
(gdb) p amps[1]+1
$7 = 0x7ffffffe72ad "url=https%3A%2F%2Fwww.randstad.nl%2Fvacatures%2F1562519%2Fmedewerker-data-services"
(gdb) p amps[2]+1
$8 = 0x7ffffffe731e "via=randstadnl"

Regards, Guus.
Comment 16 Xavier Roche 2014-07-28 15:54:27 EDT
Can you tell me what gcc version is in use ? It seems that the second string hasn't been copied correctly (!) and I'm still scratching my head to find out why :)
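
For readers following the gdb output above: copyBuff holds the query parameters of source in sorted order, and the failing assertion compares its length with the original query length. A minimal sketch of that kind of invariant follows. It is not HTTrack's fil_normalized(); sort_query and cmp_param are hypothetical names, and the sketch assumes the query starts with '?' and that parameters are separated by '&'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_param(const void *a, const void *b) {
  return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical sketch (not HTTrack code): rebuild `query`, for example
   "?b=2&a=1", with its parameters sorted. Returns a malloc()ed string
   that the caller must free(), or NULL on error. */
static char *sort_query(const char *query) {
  const size_t q_len = strlen(query);
  char *work, *out;
  char **params;
  size_t count = 0, i, pos = 0;

  if (q_len == 0 || query[0] != '?') {
    return NULL;
  }
  work = malloc(q_len + 1);
  out = malloc(q_len + 1);
  /* There cannot be more parameters than characters in the query. */
  params = malloc(q_len * sizeof *params);
  if (work == NULL || out == NULL || params == NULL) {
    free(work);
    free(out);
    free(params);
    return NULL;
  }
  memcpy(work, query, q_len + 1);

  /* Split in place: the byte after '?' or '&' starts a parameter. */
  params[count++] = work + 1;
  for (i = 1; i < q_len; i++) {
    if (work[i] == '&') {
      work[i] = '\0';
      params[count++] = work + i + 1;
    }
  }
  qsort(params, count, sizeof *params, cmp_param);

  /* Re-join with explicit lengths instead of strcat/strncat. */
  for (i = 0; i < count; i++) {
    const size_t len = strlen(params[i]);
    out[pos++] = (i == 0) ? '?' : '&';
    memcpy(out + pos, params[i], len);
    pos += len;
  }
  out[pos] = '\0';

  /* Reordering parameters must not change the total length. */
  assert(pos == q_len);

  free(work);
  free(params);
  return out;
}
```

Calling sort_query("?url=x&text=y&via=z") in this sketch yields "?text=y&url=x&via=z". In the session above the equivalent length check failed because the url parameter was cut short during the copy.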
Comment 17 A.J. Bonnema 2014-07-28 15:57:03 EDT
[gbonnema@mahatma ~]$ gcc --version
gcc (GCC) 4.8.3 20140624 (Red Hat 4.8.3-1)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO

[gbonnema@mahatma ~]$
Comment 18 A.J. Bonnema 2014-07-28 16:00:23 EDT
TBH I do not know what gcc version was used to generate the httrack of fedora 20, because I probably just got the binary version of httrack, as fedora packages it.
Comment 19 Xavier Roche 2014-07-28 16:02:37 EDT
Ahah! I can reproduce the issue with this GCC release!!!
Comment 20 A.J. Bonnema 2014-07-28 16:05:08 EDT
Good show! I will log out in a minute, unless you have more tests to do?
Comment 21 Xavier Roche 2014-07-28 16:07:31 EDT
No, thank you - I should be able to understand a bit more what's going on from now. Thanks again for your precious help!
Comment 22 Xavier Roche 2014-07-28 16:40:34 EDT
Issue spotted and fixed inside src/htssafe.h. strncat(A, B, (size_t) -1) is NOT safe at all, and does not appear to behave like strcat(A, B), because of the optimized version.
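
As background: the third argument of strncat limits how many characters are taken from B; it says nothing about the capacity of A, so (size_t) -1 gives no protection. A minimal sketch of one bounded alternative follows, assuming the capacity of the destination buffer is known. bounded_append is a hypothetical helper for illustration, not the code from the attached patch.

```c
#include <string.h>

/* Hypothetical helper (not the htssafe.h patch): append src to dest,
   where dest_size is the total capacity of dest in bytes.
   Returns 0 on success, -1 if the result would not fit; on failure
   dest is left unchanged. */
static int bounded_append(char *dest, size_t dest_size, const char *src) {
  const size_t dest_len = strlen(dest);
  const size_t src_len = strlen(src);

  /* Require room for both strings plus the terminating NUL. */
  if (dest_len >= dest_size || src_len >= dest_size - dest_len) {
    return -1;
  }
  /* Lengths are known, so copy exactly src_len bytes plus the NUL. */
  memcpy(dest + dest_len, src, src_len + 1);
  return 0;
}
```

With char buf[16] = "?text=a", the call bounded_append(buf, sizeof buf, "&url=b") succeeds and leaves "?text=a&url=b" in buf, while appending a string that does not fit returns -1 and leaves buf untouched. Using memcpy with computed lengths sidesteps the question of how a given strncat implementation treats a huge count.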
Comment 23 Xavier Roche 2014-07-28 16:42:03 EDT
Created attachment 921928 [details]
Patch to fix the issue
Comment 24 Xavier Roche 2014-07-28 17:12:27 EDT
Created attachment 921942 [details]
Final patch for htssafe.h
Comment 25 Xavier Roche 2014-07-28 17:27:22 EDT
Fixed in 3.48.19 (http://mirror.httrack.com/historical/httrack-3.48.19.tar.gz)
Comment 26 Fedora Update System 2014-08-06 04:09:50 EDT
httrack-3.48.19-1.fc20 has been submitted as an update for Fedora 20.
Comment 27 Fedora Update System 2014-08-07 11:29:44 EDT
Package httrack-3.48.19-1.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-1.fc20'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Comment 28 Fedora Update System 2014-08-09 03:32:59 EDT
Package httrack-3.48.19-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing httrack-3.48.19-2.fc20'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Comment 29 Fedora Update System 2014-08-15 20:35:48 EDT
httrack-3.48.19-2.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 30 Xavier Roche 2014-08-16 06:00:31 EDT
Probable upstream bug in the GLIBC reported as Bug 17279 (https://sourceware.org/bugzilla/show_bug.cgi?id=17279)