From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0.1)
Description of problem:
I am having trouble with netatalk-1.6.3 (not the redhat rpm), available from
netatalk.sourceforge.net. The master process hangs at random during a call to
syslog. The system ran fine for a month, but then it hung 6 times within two
weeks. I don't think that it's an application issue because I've only heard of
this problem from other people running redhat 9.
Here is the process I followed to trace the problem.
User notifies us that they can't log in.
Found that the server is rejecting connections to port 548.
Used strace to show what the afpd master process is doing. The process was hung
on a call to futex.
Attached ddd/gdb debugger to the master process and forced a core dump for
later analysis.
Killed and restarted the afpd master process.
Using the debugger, found that the hang occurred in a call to syslog from
logger.c:566. The parameters that were passed to syslog seem to be fine.
I traced two hung master processes this way and both of them had hung in the
same place. This leads me to believe that the real problem is in glibc. Is that
a reasonable assumption?
In an attempt to work around the problem, I recompiled netatalk with DO_SYSLOG
undefined, but the system hasn't been up long enough to determine if it's stable.
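For readers unfamiliar with this kind of workaround, a compile-time switch like
DO_SYSLOG typically selects between syslog and some other sink. The following is
only a minimal sketch of that idea, not netatalk's actual logger.c: the
log_message name, the buffer size, and the stderr fallback are all assumptions
made for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#ifdef DO_SYSLOG
#include <syslog.h>
#endif

/* Hypothetical wrapper, NOT netatalk's logger.c.
 * Returns 1 if the message went to syslog, 0 if it went to stderr. */
int log_message(int priority, const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    /* Format once, so both sinks receive the same text. */
    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

#ifdef DO_SYSLOG
    syslog(priority, "%s", buf);
    return 1;
#else
    /* Built without DO_SYSLOG: bypass syslog() entirely. */
    (void)priority;
    fprintf(stderr, "%s\n", buf);
    return 0;
#endif
}
```

Built without -DDO_SYSLOG, a call such as log_message(6, "afpd started") never
enters syslog(), so it cannot block on syslog's internal lock.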
If anybody would like any additional information let me know. I have the source,
original executable, and core dumps that I used to research the problem if needed.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
The problem occurs randomly during execution. We've had it happen three
times in one day, and once each on several other days.
The problem appears to be associated with re-entrant calls to syslog. One person
who was having serious problems with this downgraded their system to redhat 8
and the problem went away.
The problem may not be with the redhat release, but it seems quite suspicious.
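If the re-entrant calls come from a signal handler (an assumption; the traces
above do not confirm it), one common way to avoid them is to have the handler
only set a flag and do the logging later from the main loop, because syslog() is
not async-signal-safe. A minimal sketch, with hypothetical names that are not
taken from netatalk:

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std modes */
#include <signal.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical flag: set by the handler, consumed by the main loop. */
static volatile sig_atomic_t child_exited = 0;

/* The handler does no logging; it only records that the event happened. */
static void on_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Call from the main loop, outside signal context.
 * Returns 1 if an event was logged, 0 if nothing was pending. */
int log_pending_events(void)
{
    if (!child_exited)
        return 0;
    child_exited = 0;
    syslog(LOG_INFO, "child process exited");
    return 1;
}
```

With this arrangement syslog() is never entered while an earlier syslog() call
in the same process is still holding its lock.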
I suspect problems with the use of syslog. It does locking, yes, but that
locking cannot deadlock by itself.
Does the program use threads? Does it use fork? If so, how is fork used? In
particular, which functions are called in the child after fork?
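To illustrate why the question matters: if a multi-threaded process forks while
another thread holds syslog's internal lock, the child inherits the lock in its
locked state with no thread left to release it, so a syslog() call in the child
can block forever. Whether netatalk does this is not established in this report.
A minimal sketch of the safer pattern, where the child restricts itself to
async-signal-safe calls (the function name is hypothetical):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that never touches syslog().
 * Returns the child's exit status, or -1 on error. */
int run_child_without_syslog(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls (write, _exit). */
        static const char msg[] = "child: started\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)n;
        _exit(0);
    }

    /* Parent: wait for the child, then log from normal context. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    syslog(LOG_INFO, "child %ld exited", (long)pid);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real server would normally exec or continue into its own code in the child;
the point is only that nothing lock-taking runs between fork and that step.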
No response in a month. I see no reason to believe that glibc has a
bug here. Closing.