Bug 98740 - Rsync daemon SIGCHLD/SIG_IGN/wait bug
Summary: Rsync daemon SIGCHLD/SIG_IGN/wait bug
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: rsync
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jay Fenlason
QA Contact: Mike McLean
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-07-08 11:28 UTC by Jos Vos
Modified: 2014-08-31 23:25 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-08-04 19:03:21 UTC
Embargoed:


Attachments
signal.patch (766 bytes, patch)
2003-07-24 22:09 UTC, Hardy Merrill

Description Jos Vos 2003-07-08 11:28:28 UTC
Description of problem:
Use of "rsync --daemon" results in the following kernel log messages:

Jul  8 12:56:59 ftp kernel: application bug: rsync(1051) has SIGCHLD set to
SIG_IGN but calls wait().
Jul  8 12:56:59 ftp kernel: (see the NOTES section of 'man 2 wait'). Workaround
activated.

These messages seem to appear every time a new connection is made (not
verified).

Version-Release number of selected component (if applicable):
rsync-2.5.5-4 with kernel-bigmem-2.4.20-18.9

Comment 1 Hardy Merrill 2003-07-08 19:05:38 UTC
Does the rsync complete successfully in these cases, or does it crash?  And,
what is the size of the physical memory on that rsync server box?

Please post the rsync server's /etc/rsyncd.conf file, and an example of an rsync
command line that causes it to fail, or get the messages you reported.

FYI, I don't see these messages on a stock RHL9 system (rsync-2.5.5-4) on an SMP
box with up2date'd kernel 2.4.20-18.9smp.  So my thought is that it has
something to do with the bigmem kernel - I'll test that next.

Comment 2 Jos Vos 2003-07-08 19:30:57 UTC
Rsync server seems to work fine.  When I test it, it gives the error when there
is nothing to do (after "skipping directory" on the client side), but I'm not
sure if this is the only case (it's a public rsync server). In either case, it
seems to complete ok.

The server system has 8 GB of memory and runs the bigmem kernel.

Relevant part from /etc/rsyncd.conf:

[vol]
        comment = /vol hierarchy
        path = /var/ftp/vol/
        read only = true
        uid = ftp
        gid = ftp
        hosts allow = *

Command and result that causes the message to appear in the server kernel log:

[jos@test x]$ rsync rsync.server.name::vol/1/nilo /tmp/
skipping directory /1/nilo
client: nothing to do
[jos@test x]$

Note that vol/1/nilo is a directory, so this is meant to fail.  It doesn't seem
to give the error when I specify -avx, when the tree is actually retrieved.


Comment 3 Hardy Merrill 2003-07-08 20:27:42 UTC
Are there valid cases (not the case you supplied for rsync'ing a directory that
is meant to fail) where this is causing you a problem, or are these cases of
clients trying to rsync *incorrectly*?  I guess my question is what problem is
this really causing you?

Comment 4 Jos Vos 2003-07-08 21:04:09 UTC
I looked in more detail at the log files.  It seems that in more than 10% of all
rsync sessions we get the error. The error sometimes occurs when the rsyncd log
does not show actual transfers, but also sometimes when bytes are xferred. It
looks like a timing-related issue that occurs more often when no data is xferred...

Comment 5 Karsten Hopp 2003-07-16 11:08:11 UTC
These messages are a warning that rsync is not standards compliant with respect to its
handling of child processes. According to POSIX (3.3.1.3), it is unspecified what happens
when SIGCHLD is set to SIG_IGN.
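
For illustration only, here is a minimal sketch of the pattern the kernel is
warning about: SIGCHLD is set to SIG_IGN and the process then calls wait().
This is not rsync code; the function name wait_with_sigchld_ignored is made up
for this example. The result depends on the platform: per the NOTES section of
'man 2 wait', a 2.4 kernel lets wait() return the child as if SIGCHLD were not
ignored, while systems following POSIX.1-2001 make wait() fail with ECHILD.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from rsync): ignore SIGCHLD, fork a child
 * that exits immediately, then call wait() anyway.
 * Returns 0 if wait() reaped the child, otherwise the errno value.
 */
int wait_with_sigchld_ignored(void)
{
    int result = 0;
    int status;
    pid_t pid;

    if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
        return errno;

    pid = fork();
    if (pid < 0) {
        result = errno;
    } else if (pid == 0) {
        _exit(0);                 /* child: exit right away */
    } else if (wait(&status) == -1) {
        result = errno;           /* ECHILD where POSIX.1-2001 is followed */
    }

    signal(SIGCHLD, SIG_DFL);     /* restore the default disposition */
    return result;
}
```

Calling wait_with_sigchld_ignored() from a small test program and printing the
return value shows which of the two behaviours a given system has.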
 

Comment 6 Hardy Merrill 2003-07-24 22:09:08 UTC
Created attachment 93129
signal.patch

Wayne Davison, one of the rsync maintainers, proposed this signal handling
patch.

Comment 7 Hardy Merrill 2003-07-24 22:12:23 UTC
Here are Wayne's comments in the email to which he attached the patch:

I finally educated myself on this issue, and would like to propose a
patch.  Since there are reports that zombies can get created when using
SIG_IGN on FreeBSD as well as other unices, I think we should change the
code to catch the signal and clean up the zombies in the signal handler.
This would make the code similar to the handling in main.c.

Comment 8 Hardy Merrill 2003-07-24 22:14:55 UTC
JW Schultz, another rsync maintainer, responds to Wayne Davison's patch:

Something along these lines might be appropriate.  I did a
little more digging as a result of your message here and it
looks like this routine should either be setting up its own signal
handler or integrate with wait_process and the signal handler
in main().  Repeatedly setting SIGCHLD to SIG_IGN is dumb.

Comment 9 Hardy Merrill 2003-07-25 14:21:46 UTC
I've tested the patch and it works for me - no more warnings.  Please test the
patch (Jos Vos) and post your results here.  If it works for you, then I'll
contact the rsync maintainers via the rsync mailing list and let them know.

Comment 10 Jos Vos 2003-07-27 20:12:07 UTC
I have applied the patch, after modifying it (the patch does not apply cleanly
to the rsync-2.5.5-4 code as in RHL 9, due to the context).  Tomorrow morning I
can say if the warnings are now gone.

Comment 11 Jos Vos 2003-07-28 19:37:06 UTC
Seems to work for me, no kernel warnings anymore since the upgrade to the
patched version.

Comment 12 Hardy Merrill 2003-08-04 19:03:21 UTC
I posted a message to the rsync mailing list today saying
that the patch worked for myself and the person who posted
the bug, and asked that the patch be committed so that future
rsync versions contained the patch code.  Wayne Davison, one
of the rsync maintainers, responded:

I think it's at least better than what's currently there,
so I've committed it to CVS.

--------------------------------------------------------

The next release of rsync is expected to be 2.5.7, and since
that hasn't been released yet, it will not make it into
Red Hat's next release.  Since the patch has been committed
upstream, I am not planning to create a Red Hat specific patch.
Users wanting this patch functionality will need to
  1. apply the patch, or
  2. wait for the Red Hat release containing rsync 2.5.7
     (current release is RHL9 - rsync 2.5.7 will not be in
     the next release), or
  3. wait for rsync 2.5.7 to be released and download it from
     the rsync website http://samba.anu.edu.au/rsync/, or
  4. download the patched code now (8/4/2003) from the rsync
     website from CVS

Comment 13 Jos Vos 2003-08-04 19:11:38 UTC
Interesting statement: "Since the patch has been committed upstream, I am not
planning to create a Red Hat specific patch".  I would have thought the
opposite: if a patch is committed upstream, this is a good reason to temporarily
add a RH patch.

Comment 14 Hardy Merrill 2003-08-05 20:30:01 UTC
Ok, I've added the signal patch to the rpm - the new package can be found here:

  http://people.redhat.com/hmerrill/rsync-2.5.6-19.i386.rpm

and will appear in rawhide soon.  This package is intended for the next Red Hat
release, but should work on RHL9 also.

