rsync on RHEL4 (version 2.6.3, protocol 28) crashes when syncing large directory trees over ssh. The command I use is:

    rsync -Pav --delete --progress /source/path/ user@remote:/dest/path/

The sync starts to run and crashes after a while. The message given is: phase "unknown" [sender]: connection reset by peer. The tree I was trying to sync was 1198MB in size, with 1046 directories and 25516 files.

It seems related to this problem: https://bugzilla.samba.org/show_bug.cgi?id=2208

I've been able to work around the problem a bit by playing with the --bwlimit and --timeout options, but those just seem to delay the crash. I have not really looked into it beyond searching for an existing bug.
We are experiencing similar problems -- it looks like this bug was opened over 3 months ago and has no resolution yet. Has this been investigated?
Would like a fix for this to show up in a RHEL4 update. Setting flags.
Was this between 2 RHEL4 servers? Or different machines? Can you reproduce this at will? And if so how do you reproduce it?
Sort of. It was between a CentOS 4.4 server and a RHEL 4.4 server. I can reproduce it using the description given in the first comment. It's pretty simple to reproduce.
Checked with David, and it looks like this was happening between 2.6.3 on both ends, and isn't reproducible when both ends are running 2.6.9.
I can't reproduce this with 1-2G data sets on RHEL4 <-> RHEL4. If someone has more info on how to reliably reproduce it, that would make it a lot easier to find what's wrong and patch it. Thanks.
It seems that David can't reproduce the bug since he changed the test environment, and I can't reproduce it either. Matt, can you provide a reproducible test case? It would really help to know the exact conditions that trigger it so that I can actually fix it. Thanks.
We worked around the issue by changing our iptables rulesets. Originally we used connection tracking to protect the system. When we removed all rules that use connection tracking, we stopped experiencing the issues reported here. As such, I cannot tell whether this is rsync related or not, but I would recommend setting up connection tracking rules roughly as follows to see if this reproduces the problem:

    iptables -F
    iptables -I INPUT 1 -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -I INPUT 2 -j DROP
    iptables -I OUTPUT 1 -m state --state NEW -j ACCEPT

In our environment, with the above rules in place and gigabit ethernet between the hosts, the following rough command would fail after several minutes:

    rsync -avr --delete --progress --stats user@remote:/data /data
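
For anyone trying to reproduce this, a test tree of roughly the shape given in the original report (1046 directories, 25516 files, about 1198MB) could be generated with something like the sketch below. The helper name make_tree, the flat directory layout, and the use of /dev/urandom for file contents are assumptions made for illustration; the original tree's actual layout and contents are not known.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original report.
# Usage: make_tree DEST NDIRS NFILES KB_PER_FILE
# Creates NDIRS flat subdirectories under DEST and spreads NFILES files of
# KB_PER_FILE kilobytes each across them, round-robin.
make_tree() {
    dest=$1; ndirs=$2; nfiles=$3; kb=$4
    mkdir -p "$dest" || return 1

    d=1
    while [ "$d" -le "$ndirs" ]; do
        mkdir -p "$dest/dir$d" || return 1
        d=$((d + 1))
    done

    f=1
    while [ "$f" -le "$nfiles" ]; do
        # Pick the target directory for this file (1..ndirs, round-robin).
        d=$(( (f - 1) % ndirs + 1 ))
        # Random contents, assumed here so the data does not compress away.
        dd if=/dev/urandom of="$dest/dir$d/file$f" bs=1024 count="$kb" \
            2>/dev/null || return 1
        f=$((f + 1))
    done
}
```

Running make_tree /source/path 1046 25516 48 gives about 1196MB in total (25516 files of 48KB each), after which the rsync command from the first comment can be run against it. Whether a synthetic tree like this actually triggers the failure is untested.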
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Since the workaround implies that this issue might not be rsync related, I'm closing this bug.