Bug 126482 - htdig does not understand robots.txt file
Product: Fedora
Classification: Fedora
Component: htdig
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Phil Knirsch
Reported: 2004-06-22 06:45 EDT by Jacek Piskozub
Modified: 2015-03-04 20:14 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-07-06 12:17:05 EDT
Description Jacek Piskozub 2004-06-22 06:45:01 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040619

Description of problem:
htdig-3.2.0b5 does not work if the website has _any_ robots.txt file.
It is a known bug of this version of htdig. A patch is available. See

The patch is a very simple one:

--- htdig/Server.cc.orig	2003-10-27 17:28:52.000000000 -0600
+++ htdig/Server.cc	2003-11-13 11:31:24.000000000 -0600
@@ -338,6 +338,8 @@
     String	fullpatt = "^[^:]*://[^/]*(";
     fullpatt << pattern << ')';
+    if (pattern.length() == 0)
+	fullpatt = "";
     _disallow.set(fullpatt, config->Boolean("case_sensitive"));

I have the same symptoms with FC2 (i386). Removing the (correct!)
robots.txt file makes htdig index my site again.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Set up a website
2. Create a robots.txt file (with at least a "User-agent: *" line)
3. Update to FC2
4. See that the word database is empty

Actual Results:  Searching the website gives an error about
db.words.db not found

Expected Results:  The website is searchable
Comment 1 Phil Knirsch 2004-07-06 12:17:05 EDT
OK, looks sane and logical.

Included in htdig-3.2.0b6-1 and later.

Read ya, Phil
