Red Hat Bugzilla – Bug 208155
crash on syncing large directory trees
Last modified: 2010-03-05 04:46:09 EST
rsync on RHEL4 (version 2.6.3, protocol 28) crashes when syncing large directory
trees over ssh. The command I use is:
rsync -Pav --delete --progress /source/path/ user@remote:/dest/path/
The sync starts normally and then crashes after a while. The error message is:
phase "unknown" [sender]: connection reset by peer. The tree I was trying to
sync was 1198MB in size, with 1046 directories and 25516 files.
It seems related to this problem:
I've been able to work around the problem somewhat by adjusting the --bwlimit
and --timeout options, but those only seem to delay the crash. I have not really
looked into it beyond searching for an existing bug.
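For anyone trying to build a test case, a tree of roughly the reported shape can be generated synthetically. The following is only a sketch and is not taken from the report: make_tree is a hypothetical helper, the files are filled with zeros, and the layout is flat (one level of directories) because the report does not describe the nesting. The values in the usage comment approximate the reported 1198MB and 25516 files and may not match whatever property of the real tree triggers the crash.

```shell
#!/bin/sh
# make_tree ROOT NDIRS FILES_PER_DIR FILE_KB
# Hypothetical helper: creates NDIRS directories under ROOT, each holding
# FILES_PER_DIR zero-filled files of FILE_KB kilobytes.
make_tree() {
    root=$1; ndirs=$2; per_dir=$3; kb=$4
    d=1
    while [ "$d" -le "$ndirs" ]; do
        dir="$root/dir$d"
        mkdir -p "$dir" || return 1
        f=1
        while [ "$f" -le "$per_dir" ]; do
            # dd writes its transfer statistics to stderr; discard them.
            dd if=/dev/zero of="$dir/file$f" bs=1024 count="$kb" 2>/dev/null || return 1
            f=$((f + 1))
        done
        d=$((d + 1))
    done
}

# Example (assumed values): 1046 dirs x 24 files x 48 KB is about 1.15 GB
# in 25104 files, close to the tree described in the report.
# make_tree /source/path 1046 24 48
```

The resulting tree can then be synced with the rsync command given above.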
We are experiencing similar problems -- it looks like this bug was opened over 3
months ago and has no resolution yet. Has this been investigated?
Would like a fix for this to show up in a RHEL4 update. Setting flags.
Was this between 2 RHEL4 servers?
Or different machines?
Can you reproduce this at will? And if so how do you reproduce it?
Sort of. It was between a CentOS 4.4 server and a RHEL 4.4 server. I can
reproduce it using the description given in the first comment. It's pretty
simple to reproduce.
Checked with David, and it looks like this was happening between 2.6.3 on both
ends, and isn't reproducible when both ends are running 2.6.9.
I can't reproduce this with 1-2G data sets on RHEL4 <-> RHEL4.
If someone has more info on how to reliably reproduce it, that would make it
much easier to find what's wrong and patch it.
It seems that David can no longer reproduce the bug since he changed the test
environment, and I can't reproduce it either.
Matt, can you provide a reproducible test case?
It would really help to know the exact conditions needed to reproduce it so
that I can actually fix it.
We worked around the issue by changing our iptables rulesets. Originally we
used connection tracking to protect the system. When we removed any rules that
use connection tracking, we stopped experiencing the issues reported here.
As such, I cannot tell whether this is rsync related, but I would recommend
setting up connection tracking rules roughly as follows to see whether they
reproduce the problem:
iptables -I INPUT 1 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -I INPUT 2 -j DROP
iptables -I OUTPUT 1 -m state --state NEW -j ACCEPT
In our environment, with the above rules in place and gigabit Ethernet between
the hosts, a command roughly like the following would fail after several minutes:
rsync -avr --delete --progress --stats user@remote:/data /data
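To make the test easier to repeat and undo, the three rules above could be wrapped in a small script. This is a sketch under assumptions, not the setup we actually ran: run, add_rules, del_rules and the DRY_RUN variable are hypothetical names, and the script defaults to printing the commands rather than executing them. Note that once the rules are live, the DROP rule also blocks new inbound connections, so it is safer to apply them from a console or an already established session.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 and
# run as root to actually change the ruleset.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Insert the connection tracking rules quoted above.
add_rules() {
    run iptables -I INPUT 1 -m state --state ESTABLISHED,RELATED -j ACCEPT
    run iptables -I INPUT 2 -j DROP
    run iptables -I OUTPUT 1 -m state --state NEW -j ACCEPT
}

# Delete the same rules by specification once the test is finished.
del_rules() {
    run iptables -D INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    run iptables -D INPUT -j DROP
    run iptables -D OUTPUT -m state --state NEW -j ACCEPT
}
```

Calling add_rules, running the rsync command, and then calling del_rules gives a before/after comparison on the same hosts.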
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Since the workaround implies that this issue might not be rsync related, I'm closing this bug.