Bug 103392

Summary: application hangs during call to syslog
Product: [Retired] Red Hat Linux
Reporter: Erik Horn <erik_horn>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 9
CC: fweimer
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-11-05 18:56:58 UTC

Description Erik Horn 2003-08-29 18:08:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0.1)
Gecko/20020823 Netscape/7.0

Description of problem:
I am having trouble with netatalk-1.6.3 (not the Red Hat RPM), available from
netatalk.sourceforge.net. The master process randomly hangs during a call to
syslog. The system ran fine for a month, but then it hung 6 times within two
weeks. I don't think that it's an application issue because I've only heard of
this problem from other people running Red Hat 9.



Here is the process I followed to trace the problem.

A user notified us that they couldn't log in.

Found that the server is rejecting connections to port 548.

Used strace to show what the afpd master process is doing. The process was hung
on a call to futex.

Attached ddd/gdb debugger to the master process and forced a core dump for
offline analysis.

Killed and restarted the afpd master process.

Using the debugger, found that the hang occurred in a call to syslog from
logger.c:566. The parameters that were passed to syslog seem to be fine.


I traced two hung master processes this way and both of them had hung in the
same place. This leads me to believe that the real problem is in glibc. Is that
a reasonable assumption?

In an attempt to work around the problem, I recompiled netatalk with DO_SYSLOG
undefined, but the system hasn't been up long enough to determine if it's stable.

If anybody would like any additional information let me know. I have the source,
original executable, and core dumps that I used to research the problem if needed.

Thanks,

Erik

Version-Release number of selected component (if applicable):
glibc-2.3.2-27.9

How reproducible:
Sometimes

Steps to Reproduce:
The problem will occur randomly during execution. We've had it happen three
times on one day, and once on several other random days.

Additional info:

Comment 1 Erik Horn 2003-09-17 20:29:08 UTC
The problem appears to be associated with re-entrant calls to syslog. One person
who was having serious problems with this downgraded their system to Red Hat 8,
and the problem went away.

The problem may not be with the Red Hat release, but it seems quite suspicious.

Comment 2 Ulrich Drepper 2003-10-02 09:03:17 UTC
I suspect problems with the use of syslog.  It does locking, yes, but that
locking cannot deadlock by itself.

Does the program use threads?  Does it use fork?  If so, how is fork used?
That is, which functions are called after fork?
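
As a rough illustration of why the functions called after fork matter, the
sketch below shows a child that avoids syslog entirely. It rests on an
assumption that is not established in this report: that the program is
multi-threaded and another thread may be inside syslog when fork is called. In
that case the child inherits a copy of the lock in its locked state, with no
thread left to release it, so a syslog call in the child may block. The helper
name fork_without_syslog is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that restricts itself to
   async-signal-safe calls (write, _exit) instead of syslog().
   Returns the child's exit status, or -1 on error. */
static int fork_without_syslog(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no syslog() here. Report progress with write(),
           which is async-signal-safe, then leave with _exit(). */
        static const char msg[] = "child: started\n";
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child and return its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real server the child would normally go on to exec or do its work; the
point of the sketch is only that logging through syslog is deferred until the
process is in a known state.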

Comment 3 Ulrich Drepper 2003-11-05 18:56:58 UTC
No response in a month.  I see no reason to believe that glibc has a
bug here.  Closing.