Bug 84611

Summary: rundig dumps core
Product: Red Hat Enterprise Linux 4
Reporter: Joe Orton <jorton>
Component: htdig
Assignee: Adam Tkac <atkac>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.0
CC: grdetil, jturner, jukka, mark, ovasik, toniw
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2008-0149
Doc Type: Bug Fix
Last Closed: 2008-02-27 16:27:53 UTC
Bug Depends On: 130528    
Bug Blocks: 160161    
Attachments:
Description Flags
patch to fix segfaults in htfuzzy none

Description Joe Orton 2003-02-19 16:14:05 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
    
Actual results:


Expected results:


Additional info:

Comment 1 Joe Orton 2003-02-19 16:15:02 UTC
Oops, sorry, hit enter a bit quick there.


Comment 2 Joe Orton 2003-02-19 16:17:40 UTC
Gah! And again.  Default Phoebe beta5 install, no config changes - I ran:

# htdig
# rundig fish
/usr/bin/rundig: line 81:  3747 Segmentation fault      /usr/bin/htfuzzy $verbose metaphone
/usr/bin/rundig: line 82:  3748 Segmentation fault      /usr/bin/htfuzzy $verbose soundex


Comment 3 Matt Wilson 2003-02-21 01:11:45 UTC
Works OK here - could you give more setup info?

+ /usr/bin/htfuzzy metaphone
+ /usr/bin/htfuzzy soundex


Comment 4 Joe Orton 2003-03-03 16:01:30 UTC
From a full Shrike install:

# service httpd start
# rundig
[root@blane root]# rundig
/usr/bin/rundig: line 81: 23133 Segmentation fault      /usr/bin/htfuzzy $verbose metaphone
/usr/bin/rundig: line 82: 23134 Segmentation fault      /usr/bin/htfuzzy $verbose soundex



Comment 5 Phil Knirsch 2003-05-15 10:20:05 UTC
*** Bug 90915 has been marked as a duplicate of this bug. ***

Comment 6 Toni Willberg 2003-05-23 11:58:53 UTC
Any workarounds available?

Comment 7 Phil Knirsch 2003-09-02 13:17:03 UTC
Worked and still works fine for me.

And just in case, Taroon Beta 2 contains a htdig-3.1.6 for those who want to use
a real stable version of htdig.

Read ya, Phil

Comment 8 Mark Dadgar 2004-01-23 16:29:17 UTC
I just did a clean install of RHEL ES 3 Update 1 and I have exactly the same problem.

[root@mail /]# htfuzzy metaphone
Segmentation fault
[root@mail /]# htfuzzy soundex
Segmentation fault
[root@mail /]# rpm -q htdig
htdig-3.1.6-3

I have checked all the permissions on the appropriate htdig directories and everything 
looks fine.

- Mark

Comment 9 Joe Orton 2004-11-25 15:41:50 UTC
Fails exactly like this for me in the RHEL4-b2 code, htdig-3.2.0b6-3,
on two separate i686 machines both with vanilla installations, no
config changes, straight out of the box.  Let me know if you want
access for debugging, Phil.

Comment 10 Phil Knirsch 2004-12-22 13:33:24 UTC
That would be good.

I've just tried to verify this bug on different current RHEL4-pre-rc
trees and didn't get the error at all.

Read ya, Phil

Comment 11 Jay Turner 2005-01-14 14:42:32 UTC
Moving back to assigned.

Comment 12 Joe Orton 2005-02-04 16:03:28 UTC
The trigger for this is actually simple: whether or not
http://localhost/ gives a 403 error, as it does by default:

[root@blane ~]# rundig
/usr/bin/rundig: line 101: 30994 Segmentation fault      /usr/bin/htfuzzy $verbose metaphone
/usr/bin/rundig: line 102: 30995 Segmentation fault      /usr/bin/htfuzzy $verbose soundex
[root@blane ~]# echo hello > /var/www/html/index.html
[root@blane ~]# rundig
[root@blane ~]# rm -f /var/www/html/index.html
[root@blane ~]# rundig
/usr/bin/rundig: line 101: 31024 Segmentation fault      /usr/bin/htfuzzy $verbose metaphone
/usr/bin/rundig: line 102: 31025 Segmentation fault      /usr/bin/htfuzzy $verbose soundex
[root@blane ~]#


Comment 13 Joe Orton 2005-04-27 14:56:41 UTC
*** Bug 156094 has been marked as a duplicate of this bug. ***

Comment 14 Phil Knirsch 2005-09-05 08:54:22 UTC
This sounds very much like #130528. Can someone with access to the machine where
this failed give it a shot and add the $opts to the last 2 lines of the rundig
script there and see if it fails?

Because that's my main suspicion at the moment.

Read ya, Phil

Comment 15 Joe Orton 2005-09-05 09:20:29 UTC
This is the "rundig dumps core if http://localhost/ gives a 403 error" bug.

[root@tango ~]# diff -u /usr/bin/rundig rundig
--- /usr/bin/rundig     2005-01-27 11:09:07.000000000 +0000
+++ rundig      2005-09-05 10:20:17.000000000 +0100
@@ -98,5 +98,5 @@
     $BINDIR/htfuzzy $opts synonyms
 fi

-/usr/bin/htfuzzy $verbose metaphone
-/usr/bin/htfuzzy $verbose soundex
+/usr/bin/htfuzzy $opts $verbose metaphone
+/usr/bin/htfuzzy $opts $verbose soundex
[root@tango ~]# service httpd status
httpd (pid 8691 8690 8689 8688 8687 8686 8685 8684 21507) is running...
[root@tango ~]# > /var/log/httpd/access_log
[root@tango ~]# service httpd start
Starting httpd:
[root@tango ~]# diff -u /usr/bin/rundig rundig
--- /usr/bin/rundig     2005-01-27 11:09:07.000000000 +0000
+++ rundig      2005-09-05 10:20:17.000000000 +0100
@@ -98,5 +98,5 @@
     $BINDIR/htfuzzy $opts synonyms
 fi

-/usr/bin/htfuzzy $verbose metaphone
-/usr/bin/htfuzzy $verbose soundex
+/usr/bin/htfuzzy $opts $verbose metaphone
+/usr/bin/htfuzzy $opts $verbose soundex
[root@tango ~]# > /var/log/httpd/access_log
[root@tango ~]# service httpd status
httpd (pid 8691 8690 8689 8688 8687 8686 8685 8684 21507) is running...
[root@tango ~]# ./rundig
./rundig: line 101: 18938 Segmentation fault      /usr/bin/htfuzzy $opts $verbose metaphone
./rundig: line 102: 18939 Segmentation fault      /usr/bin/htfuzzy $opts $verbose soundex
[root@tango ~]# tail /var/log/httpd/access_log
127.0.0.1 - - [05/Sep/2005:10:20:53 +0100] "HEAD /robots.txt HTTP/1.1" 404 - "-" "htdig"
127.0.0.1 - - [05/Sep/2005:10:20:53 +0100] "HEAD / HTTP/1.1" 403 - "-" "htdig"


Comment 17 Rick Buford 2005-10-14 21:20:52 UTC
(In reply to comment #14)
> This sounds very much like #130528. Can someone with access to the machine where
> this failed give it a shot and add the $opts to the last 2 lines of the rundig
> script there and see if it fails?
> 
> Because that's my main suspicion at the moment.
> 
> Read ya, Phil

[root@5vx1y51 htdig]# cat /etc/issue
Red Hat Enterprise Linux ES release 3 (Taroon Update 6)
Kernel \r on an \m

[root@5vx1y51 htdig]# rundig -a
DB2 problem...: missing or empty key value specified
/usr/bin/rundig: line 87:  9486 Segmentation fault      /usr/bin/htfuzzy $verbose $opts metaphone
/usr/bin/rundig: line 88:  9487 Segmentation fault      /usr/bin/htfuzzy $verbose $opts soundex

Additional information:
1. rundig (with or without -a) works the first time
2. the second and subsequent times it fails with this seg fault
3. uninstalling/reinstalling the package only fixes the problem if I don't use
the same config file (i.e., don't copy /etc/htdig.conf.rpmsave /etc/htdig.conf)
4. even after reinstall, rundig only works the first time after installation
--------------
[root@5vx1y51 root]# rundig -a
DB2 problem...: missing or empty key value specified
/usr/bin/rundig: line 86:  9556 Segmentation fault      /usr/bin/htfuzzy $verbose $opts metaphone
/usr/bin/rundig: line 87:  9557 Segmentation fault      /usr/bin/htfuzzy $verbose $opts soundex
[root@5vx1y51 root]# rpm -e htdig
warning: /etc/htdig.conf saved as /etc/htdig.conf.rpmsave
[root@5vx1y51 root]# up2date htdig

Fetching Obsoletes list for channel: rhel-i386-es-3...

Fetching Obsoletes list for channel: rhel-i386-es-3-extras...

Fetching rpm headers...
########################################

Name                                    Version        Rel
----------------------------------------------------------
htdig                                   3.1.6          3                 i386


Testing package set / solving RPM inter-dependencies...
########################################
htdig-3.1.6-3.i386.rpm:     ########################## Done.
Preparing              ########################################### [100%]

Installing...
   1:htdig                  ########################################### [100%]
[root@5vx1y51 root]# vi /etc/htdig.conf
[root@5vx1y51 root]# diff -u /etc/htdig.orig /etc/htdig.conf
--- /etc/htdig.orig     2005-10-14 15:41:08.000000000 -0500
+++ /etc/htdig.conf     2005-10-14 15:42:54.000000000 -0500
@@ -30,7 +30,7 @@
 # You could also index all the URLs in a file like so:
 # start_url:          `${common_dir}/start.url`
 #
-start_url:             http://localhost
+start_url:             http://wiki.svc.cfx/

 #
 # This attribute limits the scope of the indexing process.  The default is to
@@ -53,7 +53,7 @@
 # may not work on your web server.  Check the  path prefix used on your web
 # server.)
 #
-exclude_urls:          /cgi-bin/ .cgi
+exclude_urls:          /cgi-bin/ .cgi do=backlinks do=export_html do=export_raw do=index

 #
 # Since ht://Dig does not (and cannot) parse every document type, this
[root@5vx1y51 root]# rundig -a -v

New server: wiki.svc.cfx, 80
0:0:0:http://wiki.svc.cfx/doku.php?id=: *-++---+---+*********---**----*------ size = 16503
1:6:1:http://wiki.svc.cfx/doku.php?id=start: *-**---*---**********---**----*------ size = 16503
2:7:1:http://wiki.svc.cfx/doku.php?id=groups:operations_management: *-*+---*---+*+++++++-+++*------ size = 14312
3:8:1:http://wiki.svc.cfx/doku.php?id=groups:sysadmin: *-**---*---+*++++++++++++++++*------ size = 12965
4:9:1:http://wiki.svc.cfx/doku.php?id=groups:sysdba: *-**--- size = 20695
....and so forth...

Stopping the process and restarting it doesn't seem to cause any problems.

While rundig is running:
[root@5vx1y51 htdig]# pwd
/var/lib/htdig
[root@5vx1y51 htdig]# ll
total 1.7M
drwxr-xr-x    2 root     root         4.0K Oct 14 15:48 .
drwxr-xr-x   13 root     root         4.0K Oct 14 15:33 ..
-rw-r--r--    1 root     root         8.0K Oct 14 15:48 db.docdb
-rw-r--r--    1 root     root         566K Oct 14 15:51 db.docdb.work
-rw-r--r--    1 root     root         2.0K Oct 14 15:48 db.docs.index
-rw-r--r--    1 root     root         249K Oct 14 15:48 db.metaphone.db
-rw-r--r--    1 root     root         211K Oct 14 15:48 db.soundex.db
-rw-r--r--    1 root     root         3.7K Oct 14 15:48 db.wordlist
-rw-r--r--    1 root     root         601K Oct 14 15:51 db.wordlist.work
-rw-r--r--    1 root     root         8.0K Oct 14 15:48 db.words.db

After the initial run, rundig fails with the same segfault @ the htfuzzy commands.

Comment 18 Rick Buford 2005-10-17 12:58:58 UTC
I hate it when computers make me look silly... after the 6th iteration of
install/reinstall, the problem is no longer recurring.

Additionally, the exclude directive now appears to be working correctly as well
(previously it was not ignoring export_raw, export_html, or backlinks).

Comment 19 Adam Tkac 2006-11-01 13:01:32 UTC
Can you please tell me if this occurs in the latest htdig package (fc6, rhel5)?

Comment 20 Joe Orton 2006-12-11 12:08:20 UTC
[root@trash ~]# service httpd status
httpd (pid 17300 17299 17298 17297 17296 17294 17293 17292 27898) is running...
[root@trash ~]# ls /var/www/html/
[root@trash ~]# rundig
/usr/bin/rundig: line 101:  1001 Segmentation fault      /usr/bin/htfuzzy $verbose metaphone
/usr/bin/rundig: line 102:  1002 Segmentation fault      /usr/bin/htfuzzy $verbose soundex
[root@trash ~]# rpm -q htdig
htdig-3.2.0b6-6.4.2.2.1


Comment 21 Adam Tkac 2006-12-20 15:23:22 UTC
Yes, this is the same bug as bug #130528.
(In reply to comment #20)

Could you please tell me which arch you are using? I can't reproduce this bug on
i386, but I can on s390x.



Comment 23 Gilles Detillieux 2006-12-20 18:24:17 UTC
Adam, I saw your patch on the ht://Dig bug tracker.  I'm no longer involved with
maintaining this package, but I thought I'd try to help you with this bug fix
anyway.

Rather than simply commenting out the dict = 0; statement in the Fuzzy
constructor in htfuzzy/Fuzzy.cc, it seems the proper fix would be to replace it
with dict = new Dictionary; instead.  That way, when it gets to the writeDB()
method, which seems to assume that dict is already set, it actually will be even
if there were no words in the database.  I haven't actually tried this, but it
seems that should fix the problem properly.  This fix should be implemented in
both the 3.1.x and 3.2.x code bases of ht://Dig.

My understanding is that this problem only occurs if the word database is empty,
i.e. that htdig didn't actually index any words from any web pages, and that if
there are words htfuzzy works correctly.  I hadn't been able to reproduce the
problem before (it had been reported on the htdig-general list a long time ago)
but you pointed me in the right direction.
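
The idea above can be illustrated with a minimal sketch. This is not the ht://Dig source: `Dictionary` and `FuzzySketch` below are hypothetical, simplified stand-ins, and the only detail taken from this comment is that writeDB() assumes dict is already set. The sketch shows the proposed approach of allocating the dictionary in the constructor so that an empty word database no longer leaves dict null.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for ht://Dig's Dictionary; the real class differs.
struct Dictionary {
    std::map<std::string, std::string> entries;
};

// Simplified model of the Fuzzy base class as described in this comment.
class FuzzySketch {
public:
    // Proposed fix: allocate up front instead of `dict = 0;`.
    FuzzySketch() : dict(new Dictionary) {}
    ~FuzzySketch() { delete dict; }

    // The class owns a raw pointer, so copying is disabled in this sketch.
    FuzzySketch(const FuzzySketch&) = delete;
    FuzzySketch& operator=(const FuzzySketch&) = delete;

    // Record a word under its fuzzy key (e.g. a soundex or metaphone code).
    void addWord(const std::string& key, const std::string& word) {
        dict->entries[key] += word + " ";
    }

    // Returns the number of keys that would be written. With the
    // constructor above, dict is valid even when no words were indexed.
    std::size_t writeDB() const {
        assert(dict != nullptr);  // invariant established by the constructor
        return dict->entries.size();
    }

private:
    Dictionary* dict;
};
```

With this arrangement an object that never received a word reports zero keys from writeDB() instead of dereferencing a null pointer. Whether the real code allocates dict elsewhere as well would need checking before applying the same change there.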

Comment 24 Adam Tkac 2006-12-21 09:47:22 UTC
(In reply to comment #23)

> Rather than simply commenting out the dict = 0; statement in the Fuzzy
> constructor in htfuzzy/Fuzzy.cc, it seems the proper fix would be to replace
> it with dict = new Dictionary; instead.  That way, when it gets to the
> writeDB() method, which seems to assume that dict is already set, it actually
> will be even if there were no words in the database.  I haven't actually tried
> this, but it seems that should fix the problem properly.  This fix should be
> implemented in both the 3.1.x and 3.2.x code bases of ht://Dig.

I think dict is allocated correctly before Fuzzy's constructor. So if you
replace "dict = 0" with "dict = new Dictionary" instead of "//dict = 0", it could
cause a memory leak. But I could be wrong. Before applying your patch, please
check for potential memory leaks in htfuzzy.

Thanks Adam

Comment 27 Gilles Detillieux 2007-03-06 22:36:42 UTC
Created attachment 149397
patch to fix segfaults in htfuzzy

Hi, Adam.  I noticed that you dropped your segfault patch from the htdig rpm
for fc6, according to the latest update notice.  I had a feeling that simply
commenting out the assignment to dict wouldn't do it.

The problem is that dict is a pointer, not an actual Dictionary object, and the
object doesn't actually get created until there are words to add to it.  There
are several other places in the code with similar list pointers which may be
null when there are no words, which is the reason for all the segfaults when
htdig fails to index any words.  This patch fixes all the problems I could find
with null list pointers in htfuzzy.  I won't guarantee that it will get rid of
ALL segfaults, but in my testing it fixed the problem with segfaults on an
empty word database in both htfuzzy and htsearch, i.e. it should fix bug 84611,
and possibly bug 230931 as well.  I believe that for bug 230931 the hypothesis
that the problem is due to shared library conflicts is a bit of a red herring,
and the more likely cause is an empty word database.
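
The null-pointer pattern described above can be sketched as follows. This is an illustration only, not the contents of the attached patch: `WordList` and `LazyFuzzySketch` are hypothetical names, and the sketch assumes nothing beyond what this comment states, namely that the object is created only once there are words to add and that code using the pointer must cope with it being null.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one of the word lists held by pointer.
struct WordList {
    std::vector<std::string> words;
};

class LazyFuzzySketch {
public:
    LazyFuzzySketch() : list(nullptr) {}
    ~LazyFuzzySketch() { delete list; }

    // The class owns a raw pointer, so copying is disabled in this sketch.
    LazyFuzzySketch(const LazyFuzzySketch&) = delete;
    LazyFuzzySketch& operator=(const LazyFuzzySketch&) = delete;

    // The list is created only when the first word arrives.
    void addWord(const std::string& word) {
        if (list == nullptr) {
            list = new WordList;
        }
        assert(list != nullptr);
        list->words.push_back(word);
    }

    // Guarded write: an empty word database means there is nothing to
    // write, so return early instead of dereferencing a null pointer.
    std::size_t writeDB() const {
        if (list == nullptr) {
            return 0;
        }
        return list->words.size();
    }

    bool hasList() const { return list != nullptr; }

private:
    WordList* list;
};
```

Guarding each use keeps the lazy allocation and avoids allocating an object that is never needed, at the cost of one check at every place the pointer is read.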

By the way, the proper test for this is to edit /etc/htdig/htdig.conf and
change the value of start_url to a non-existent URL on a working web server. 
If htdig can't contact the web server, it doesn't create the word database, but
if it can contact it and gets a 404 error, it will create an empty word
database, which is what caused all the trouble in htfuzzy (on any
architecture).

Comment 28 Adam Tkac 2007-03-07 07:49:13 UTC
(In reply to comment #27)

I did a quick test with your patch and the problem looks fixed. I love upstreams
like you :) I'm going to put your patch into rawhide immediately and report any
potential problems to you. Perfect work, thanks.

Regards, Adam

Comment 36 errata-xmlrpc 2008-02-27 16:27:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0149.html