Bug 2229654 - rsync - buffer overflow detected
Summary: rsync - buffer overflow detected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: rsync
Version: 39
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Michal Ruprich
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedFreezeException
Depends On:
Blocks: F39BetaFreezeException
Reported: 2023-08-07 08:43 UTC by dpawlik
Modified: 2023-08-29 17:44 UTC (History)

Fixed In Version: rsync-3.2.7-5.fc39
Clone Of:
Environment:
Last Closed: 2023-08-29 17:44:55 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | WayneD rsync issues 511 | 0 | None | open | rsync crashes with "*** buffer overflow detected ***: terminated" | 2023-08-17 12:36:18 UTC
Red Hat Issue Tracker | FC-949 | 0 | None | None | None | 2023-08-22 19:29:40 UTC

Description dpawlik 2023-08-07 08:43:21 UTC
Description of problem:

On executing an rsync command, I receive an error:

*** buffer overflow detected ***: terminated
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [sender=3.1.3]

on executing command: 

/usr/bin/rsync --delay-updates -F --compress --delete-after --archive --no-owner --no-group --rsh='/usr/bin/ssh -S none -o Port=22 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null' /tmp/toolbox zuul-worker:~/src/


Version-Release number of selected component (if applicable):

On host, where the rsync is executed: rsync-3.1.3-19.el8_7.1.x86_64
Remote host (Fedora Rawhide): rsync-3.2.7-4.fc39.x86_64

Steps to Reproduce:

* By using shell:

1. git clone https://github.com/containers/toolbox /tmp/toolbox
2. /usr/bin/rsync --delay-updates -F --compress --delete-after --archive --no-owner --no-group --rsh='/usr/bin/ssh -S none -o Port=22 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null' /tmp/toolbox zuul-worker:~/src/

* By using Ansible (2.9.27):

1. git clone https://github.com/containers/toolbox /tmp/toolbox
2. Create playbook + inventory:

cat << EOF > test.yaml
---
- name: test
  hosts: test.dev
  gather_facts: false
  tasks:
  - name: Synchronize src repos to workspace directory.
    synchronize:
      delete: true
      dest: "~/src/"
      recursive: true
      src: "/tmp/toolbox"
      owner: no
      group: no
EOF

cat << EOF > inventory.yaml
---
all:
  hosts:
    test.dev:
      ansible_port: 22
      ansible_host: <Fedora Rawhide ip address>
      ansible_user: zuul-worker
EOF

3. ansible-playbook -i inventory.yaml test.yaml


Actual results:

rsync: connection unexpectedly closed (15 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [sender=3.1.3]
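
The exit codes quoted in the messages above (20 and 12) can be looked up in the EXIT VALUES section of rsync(1). A minimal sketch of such a lookup follows. The function name `explain_rsync_exit` is hypothetical, and only the codes that appear in this description are mapped, using the wording from the man page. Code 12 on the sender usually only says that the connection to the other side went away, which fits a receiver that was terminated.

```shell
#!/bin/sh
# Hypothetical helper: map an rsync exit code to its documented meaning.
# Only codes seen in this report are listed; rsync(1) has the full table.
explain_rsync_exit() {
    case "$1" in
        0)  echo "Success" ;;
        12) echo "Error in rsync protocol data stream" ;;
        20) echo "Received SIGUSR1 or SIGINT" ;;
        *)  echo "Not mapped here; see EXIT VALUES in rsync(1)" ;;
    esac
}
```

For example, `explain_rsync_exit 12` prints the man page text for the code reported under "Actual results".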

Expected results:

Everything is synced.

Comment 1 Fabien Boucher 2023-08-14 08:12:20 UTC
The affected product was set to RHEL 8; however, the issue was likely introduced with the latest rsync versions in Fedora Rawhide.

Comment 2 Michal Ruprich 2023-08-15 12:53:44 UTC
Hi,

So you say that this started with version 3.2.7, but this version has been in Fedora for over 10 months. Was this really the first time it happened? This is definitely something in the new version: it crashes even with 3.2.7 on both sides.

Can you compare the directory structures on both sides after the crash? It seems to me that everything is actually transferred and the crash happens after the transfer.

Regards,
Michal

Comment 3 Michal Ruprich 2023-08-16 10:05:13 UTC
Correction: not everything is sent; I was probably looking at the wrong output. May I ask why you are using the -F option? Do you need it to filter something? Looking at the definition:

-F     The -F option is a shorthand for adding two --filter rules to your command.  The first time it is used is a shorthand for this rule:

                  --filter='dir-merge /.rsync-filter'

       This tells rsync to look for per-directory .rsync-filter files that have been sprinkled through the hierarchy and use their rules to filter the files in the transfer.

I don't see any such files in the source location, and without -F there is no crash. There is definitely a bug here, but dropping -F might serve as a temporary workaround.
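
Whether a source tree contains any per-directory filter files can be checked before deciding to drop -F. A minimal sketch, assuming the tree is available as a local path; both function names are hypothetical, and the check covers only the tree it is given (with a delete option, filter files on the receiving side could matter too).

```shell
#!/bin/sh
# Hypothetical helper: list the per-directory filter files under a tree.
# These are the files that -F (--filter='dir-merge /.rsync-filter')
# asks rsync to look for.
list_rsync_filter_files() {
    find "$1" -type f -name '.rsync-filter'
}

# Hypothetical helper: succeed if at least one such file exists.
has_rsync_filter_files() {
    [ -n "$(list_rsync_filter_files "$1")" ]
}
```

For example, `has_rsync_filter_files /tmp/toolbox || echo "no .rsync-filter files found"` reports the situation described in this comment.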

Comment 4 Fabien Boucher 2023-08-16 10:08:53 UTC
Hi,

So it seems that even with the same rsync version, the base system introduces a behavior change on Rawhide when the "--delete-after" option is used.

Here is the log of a new investigation:

Sender node: rsync-3.1.2-12.el7_9.x86_64
Receiver node: rsync-3.2.7-4.fc39.x86_64

Running:

git clone https://src.fedoraproject.org/rpms/python-gear
/usr/bin/rsync --delay-updates -F --compress --delete-after --archive --no-owner --no-group python-gear zuul-worker.83.xxx:/tmp/test-1
*** buffer overflow detected ***: terminated
^CKilled by signal 2.
rsync error: unexplained error (code 255) at rsync.c(638) [sender=3.1.2]

On both sides, the output of "find python-gear | wc -l" is the same (48), so it seems the transfer was complete.

Also note that:

/usr/bin/rsync -v --delay-updates -F --compress --archive --no-owner --no-group python-gear zuul-worker.83.xxx:/tmp/test-4
When the same command is run without the "--delete-after" option, rsync completes successfully.


Running the same rsync command with a different receiver (same sender):
$ cat /etc/fedora-release 
Fedora release 38 (Thirty Eight)
$ rpm -qa | grep rsync
rsync-3.2.7-2.fc38.x86_64

/usr/bin/rsync --delay-updates -F --compress --delete-after --archive --no-owner --no-group python-gear zuul-worker.83.yyy:/tmp/test-1
The command runs successfully.
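
A matching count from "find python-gear | wc -l" only shows that both sides hold the same number of entries. A slightly stricter check compares the sorted lists of relative paths. A minimal sketch, assuming both trees are reachable as local paths (for a remote receiver the listing would have to be produced over ssh instead); the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the sorted list of paths below a directory,
# relative to that directory.
list_tree() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# Hypothetical helper: succeed only if both trees contain the same paths.
# Compares names only, not file contents, sizes or timestamps.
same_tree() {
    a=$(list_tree "$1") || return 2
    b=$(list_tree "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_tree python-gear /path/to/copy/python-gear && echo "same paths"`, where the second argument is a placeholder for wherever the copy is mounted or fetched.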

Comment 5 Fabien Boucher 2023-08-16 10:15:35 UTC
Yes, removing "-F" or "--delete-after" avoids the overflow issue.

All these options are set in the test command because they are set by the Ansible synchronize module as used by our CI: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/prepare-workspace/tasks/main.yaml
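
Because dropping either option avoids the crash, the four combinations of "-F" and "--delete-after" can be enumerated to narrow the problem down. A minimal sketch that only prints the candidate command lines and executes nothing; the function name is hypothetical, and the source and destination are placeholders supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: print the four rsync command lines obtained by
# toggling -F and --delete-after. Nothing is executed.
print_option_matrix() {
    src=$1
    dest=$2
    base="/usr/bin/rsync --delay-updates --compress --archive --no-owner --no-group"
    for f in "" "-F"; do
        for d in "" "--delete-after"; do
            # Unquoted expansion on purpose: empty options vanish and the
            # base string is split back into separate words.
            echo $base $f $d "$src" "$dest"
        done
    done
}
```

For example, `print_option_matrix python-gear zuul-worker.83.xxx:/tmp/test-1` lists the variants; according to the comments in this bug, only the one carrying both options crashed against the Rawhide receiver.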

Comment 6 Michal Ruprich 2023-08-16 10:22:23 UTC
This is quite interesting. There is absolutely no difference in code between rsync-3.2.7-2 and rsync-3.2.7-4, which makes this even more interesting.

Comment 7 Michal Ruprich 2023-08-17 12:36:18 UTC
Most likely this is the same bug that is already filed upstream: https://github.com/WayneD/rsync/issues/511

Comment 8 dpawlik 2023-08-21 07:08:27 UTC
Hi,
Thanks, Michal, for checking. I guess Fabien added enough information to this bug.

Let me know if you need some more details.

Dan

Comment 9 Fedora Update System 2023-08-22 19:31:55 UTC
FEDORA-2023-563d5c4a26 has been submitted as an update to Fedora 39. https://bodhi.fedoraproject.org/updates/FEDORA-2023-563d5c4a26

Comment 10 Michal Ruprich 2023-08-22 19:35:52 UTC
Sometimes it takes a while for fixes to be accepted in rsync Upstream so I went ahead and pushed this.

Comment 11 Debarshi Ray 2023-08-22 20:09:13 UTC
Thanks for all the hard work on this, Michal!  :)

Comment 12 Adam Williamson 2023-08-22 22:23:03 UTC
Proposing this as an FE for F39 Beta, as the Beta freeze is in effect. I think it makes sense to give this an FE to avoid problems in Fedora CI tests.

Comment 13 Fedora Update System 2023-08-23 02:09:14 UTC
FEDORA-2023-563d5c4a26 has been pushed to the Fedora 39 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-563d5c4a26`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-563d5c4a26

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
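
After upgrading, the installed package can be compared against the "Fixed In Version" of this bug (rsync-3.2.7-5.fc39). A minimal sketch, assuming GNU `sort -V` is available; the function name is hypothetical, and version sort only approximates RPM's own version comparison.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version-release $1 is at least $2.
# Uses GNU `sort -V`, which only approximates RPM version ordering.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

For example, `version_at_least "$(rpm -q --qf '%{VERSION}-%{RELEASE}' rsync)" 3.2.7-5.fc39 && echo "fix installed"`.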

Comment 14 Debarshi Ray 2023-08-23 13:27:27 UTC
This also affects upstream Toolbx CI running on Fedora 39 and Rawhide.  These pull requests were where the problem first showed up:
https://github.com/containers/toolbox/pull/1344
https://github.com/containers/toolbox/pull/1331

Comment 15 Fabien Boucher 2023-08-24 12:28:25 UTC
Since the patched package landed in the Rawhide repository, our CI jobs are working as expected [1]. Thanks for the fix!

[1]. https://fedora.softwarefactory-project.io/zuul/builds?job_name=rpm-install-test&branch=rawhide&skip=0&limit=100

Comment 16 Debarshi Ray 2023-08-24 14:39:00 UTC
Yes, it works!  See how the tests running on Fedora Rawhide nodes actually get run again, instead of hitting RETRY_LIMIT:
https://github.com/containers/toolbox/pull/1344

Some of the tests still fail on Fedora Rawhide because of other changes in Rawhide, but that's not related to this bug.

Comment 17 Adam Williamson 2023-08-27 16:35:41 UTC
+5 in https://pagure.io/fedora-qa/blocker-review/issue/1182 , marking accepted.

Comment 18 Fedora Update System 2023-08-29 17:44:55 UTC
FEDORA-2023-563d5c4a26 has been pushed to the Fedora 39 stable repository.
If the problem still persists, please make a note of it in this bug report.

