From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)

Description of problem:
Hardware: Pentium 133 with 64 MB RAM. RH 7.1 distribution obtained from the RH FTP site, running Squid 2.4stable1 with SquidGuard and DansGuardian.

On text-heavy pages DansGuardian processes extremely slowly, about 5 minutes for less than 200k of text. After discussion in the DG forum, the problem was pinpointed to glibc (see the excerpt at the bottom).

I downloaded glibc version 2.2.3-14 and replaced the installed one with it. rpm -Fvh did not work; it complained about dependencies. I then used rpm -Fvh --nodeps, and it installed OK. Unfortunately, a day later I see that it broke at least samba (mount -t smbfs -o username=xxx,password=yyy /mnt/zzz //winserver/share does not work anymore, and who knows what else). Anyway, the original problem (slowness of DG) was cured by this crude "upgrade".

Oh, yes, while waiting for RH 7.2, I'll appreciate any good advice on how to correct the broken dependencies... I am no Linux expert, having worked mostly with Windows so far, although I am by far not Linux-illiterate.

Best wishes.
Sergey.

Excerpt of the original discussion with Daniel Barron, author of DG, follows:

> On Wed, Jul 18, 2001 at 10:58:27PM +0000, Daniel Barron wrote:
> > Without any problems, I am able to completely replicate the slowness problem.
> >
> > It does not lie in nb++, or DansGuardian. The problem occurs with both DG1 and DG2. nb++ is not used in DG2.
> >
> > When the following function is called:
> > #include <regex.h>
> > int regexec(const regex_t *preg, const char *string,
> >             size_t nmatch, regmatch_t pmatch[], int flags);
> >
> > ...in either nb++(DG1) or DG2, it maxes the cpu out for ages. It does return, eventually, and it does return the correct answer.
> >
> > > I strongly suspect it is a bug in RH7.1 now. Can anyone help?
> > This function is defined in glibc, so you'll need a glibc upgrade to fix it.
How reproducible:
Always

Steps to Reproduce:
1. Install RH7.1, Squid 2.4stable1, DansGuardian 1.1.4

Actual Results:
Access a text-heavy website and observe the slowness. top shows dansguardian maxing out the CPU and tying up the machine.

Expected Results:
Reasonable processing time.

Additional info:
rpm -Fvh --nodeps glibc 2.2.3-14, reboot, repeat the web access, and observe that the slowness goes away. This bug does not exist in RH 6.2 (and, I heard, not in 7.0 either).
glibc in RHL 7.1 has multi-byte character set support in regex. The one in rawhide has most of the regex code compiled in twice: once for single-byte character sets, which is considerably faster, and once for multi-byte character sets. I don't know which dependencies rpm was complaining about, but it certainly matters a lot; --nodeps has to be used with care. Did you at least run rpm -Fvh on glibc and glibc-common at the same time?
> Were you rpm -Fvh glibc and glibc-common at the same time at least?

Yes, I took glibc, glibc-common and glibc-devel just to be sure.
The slowdown might indeed be caused by regex with multibyte char support. Start the process in the C locale if you don't need internationalization. In any case, this is old code. Try a recent version. I am closing this as not a bug, since the slowdown is expected and can be avoided where possible.
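
For anyone who wants to check the locale effect on their own glibc, a minimal sketch in C follows. It is not taken from DansGuardian or glibc: the helper name timed_regexec, the regcomp() flags and the return convention are all assumptions made for illustration. It selects a locale, compiles a pattern, and measures the CPU time of a single regexec() call, so the same pattern and text can be compared under the environment's locale and under "C".

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not part of DansGuardian or glibc).
 * Selects `locale_name`, compiles `pattern`, and runs regexec() once
 * against `text`.
 * Returns 1 on match, 0 on no match, -1 on any setup or regex error.
 * If `seconds` is non-NULL it receives the CPU time spent in regexec(). */
int timed_regexec(const char *locale_name, const char *pattern,
                  const char *text, double *seconds)
{
    regex_t preg;
    clock_t start, end;
    int rc;

    assert(locale_name != NULL && pattern != NULL && text != NULL);

    /* Assumption: the locale is selected before regcomp() so that the
     * compiled pattern uses that locale's character set handling.
     * "C" is a single-byte locale; "" means "take it from the environment". */
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    /* The flags are illustrative only, not what DansGuardian uses. */
    if (regcomp(&preg, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;

    start = clock();
    rc = regexec(&preg, text, 0, NULL, 0);
    end = clock();
    regfree(&preg);

    if (seconds != NULL)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;

    if (rc == 0)
        return 1;
    if (rc == REG_NOMATCH)
        return 0;
    return -1;
}
```

Calling it once with "" and once with "C" on the same large page text should show whether the multibyte path is what costs the time on a given system. To apply the workaround to a whole process without touching code, export LC_ALL=C in the environment that starts the daemon.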