Bug 1415295
Summary: | lsyncd.service: Main process exited, code=exited, status=255/n/a | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Charles R. Anderson <cra> | ||||||||||
Component: | lsyncd | Assignee: | Jason Taylor <jtfas90> | ||||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | unspecified | Docs Contact: | |||||||||||
Priority: | unspecified | ||||||||||||
Version: | 24 | CC: | ferdnyc, filip, jtfas90, lkundrak, martin, pwouters, scenek, troxor0 | ||||||||||
Target Milestone: | --- | ||||||||||||
Target Release: | --- | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | lsyncd-2.2.2-1.fc24 lsyncd-2.2.2-1.fc25 | Doc Type: | If docs needed, set a value | ||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: | ||||||||||||
Last Closed: | 2017-02-27 22:48:40 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Attachments: |
|
Description
Charles R. Anderson
2017-01-20 19:34:45 UTC
This may be the same bug as #1196710 Failure with a list width exitcode = 2. I just checked the lsync log and it has this at the end: /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/5C7DD8CEFC3DA1517138DD8C7AF8AE57862EBBEE /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/12727A8AA6DB17B6F46A446B9D783D7FBB122D5E /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/55AD675C4FC6E4658DC1B97AF24DF464B7D8A8A8 /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/AE800BE3D43F2B1829D865A1DA506B118F33C9D8 /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/D433919A2D95FDB7D61AEA753F86DE7CAE72386F /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/E44B7E34C6A535D8D7ED04F1CA15E16BFBD297C4 /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/E1B92D58525CDAB8AA41123359BD4C86F65C6081 /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/CE31F1A0EBA42BC45F6C5D4C360A21D0F74050F5 /home/cra/.cache/mozilla/firefox/7u6riz94.default/cache2/entries/B504B08C1DDC0FCF571A7FDF3824FBD67732B458 Fri Jan 20 14:35:11 2017 Error: Failure with a list with exitcode = 2 Fri Jan 20 14:35:11 2017 Error: Critical exitcode. Created attachment 1242971 [details]
lsyncd.conf
Created attachment 1242972 [details]
lsyncd.log
Log showing entire lsyncd run from initial startup to failure with a list with exitcode = 2.
Here is more detail from the journal. It looks like the problem is cause by "ABORTING due to invalid path from sender: home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-backup/base-track-d". Jan 20 14:07:36 hostname lsyncd[1064]: file has vanished: "/home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-to_delete" Jan 20 14:07:36 hostname lsyncd[1064]: file has vanished: "/home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-backup" Jan 20 14:07:36 hostname lsyncd[1064]: file has vanished: "/home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-to_delete" Jan 20 14:07:36 hostname lsyncd[1064]: file has vanished: "/home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-backup" Jan 20 14:07:36 hostname lsyncd[1064]: file has vanished: "/home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-to_delete" Jan 20 14:07:36 hostname lsyncd[1064]: ABORTING due to invalid path from sender: home/cra/.cache/mozilla/firefox/7u6riz94.default/safebrowsing-backup/base-track-d Jan 20 14:07:36 hostname lsyncd[1064]: rsync error: protocol incompatibility (code 2) at generator.c(1271) [generator=3.1.2] Jan 20 14:07:36 hostname lsyncd[1064]: 14:07:36 Error: Failure with a list with exitcode = 2 Jan 20 14:07:36 hostname lsyncd[1064]: 14:07:36 Error: Critical exitcode. Jan 20 14:07:36 hostname lsyncd[1064]: .default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-wal Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-shm Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-shm Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-shm Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-wal Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-wal Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/sessionstore-backups/recovery.js.tmp Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/sessionstore-backups/recovery.js.tmp Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/sessionstore-backups/recovery.js Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/sessionstore-backups/recovery.bak Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/sessionstore-backups/recovery.js.tmp Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-shm Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-shm Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-wal Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite-wal Jan 20 14:07:36 hostname lsyncd[1064]: /cra/.mozilla/firefox/7u6riz94.default/storage/permanent/chrome/idb/2918063365piupsah.sqlite Jan 20 14:07:36 hostname lsyncd[1064]: 14:07:36 Normal: Finished a list after exitcode: 0 Jan 20 14:07:36 hostname systemd[1]: lsyncd.service: Main process exited, code=exited, status=255/n/a Jan 20 14:07:36 hostname systemd[1]: lsyncd.service: Unit entered failed state. Jan 20 14:07:36 hostname systemd[1]: lsyncd.service: Failed with result 'exit-code'. Hi Charles, Thanks for the filing the report. Are you able to add an exclude entry for the transient files? If so, does the lsyncd service stay running? I'm struggling with the proper exclude syntax. I can't get it to exlude these files. Here is what I just tried, but it is still going into /home/cra/.cache sync { default.rsync, source = "/home/", target = "/mnt/hddbackup/home", exclude = { "/home/cra/.cache", "/home/cra/.cache/*", "/cra/.cache", "/cra/.cache/*", "/.cache", "/.cache/*", "/*/mnt", "/*/.ccache", "/*/.buildroot-ccache", "/mock/var/cache", "/mock/var/lib/mock/*/root" }, rsync = { one_file_system = true, owner = true, group = true, perms = true, hard_links = true, acls = true, xattrs = true, sparse = true, -- delete_excluded = true - unsupported option } } It looks like it might be getting picked up in the source "/" sync definition? (In reply to Jason Taylor from comment #7) > It looks like it might be getting picked up in the source "/" sync > definition? If that's true then one_file_system = true is broken in this version. I'll do some more testing and see if I can recreate what you are seeing. Created attachment 1243162 [details]
lsyncd config test
Hi Charles, I attached the config that I used for testing. I was unable to reproduce the exit/errors you are seeing with this configuration. The excludes worked as expected, I was able to restart lsyncd a number of times and run lsyncd for varied times up to and exceeding 60 minutes. Let me know if there any additional details you see in the logs that may be of help if you continue to run into the problem. Thanks JT This appears to be a regression in 2.2.1 — I'm seeing the same issue in the test update, whereas 2.1.5 doesn't exhibit the same behavior. I also notice that 2.2.1 is listing a lot of duplicate files in the sync logs. For instance, where 2.1.5 might log... Sun Jan 22 02:24:44 2017 Normal: Rsyncing list /path/to/dir1 /path/to/dir1/file1 /path/to/dir1/file2 /path/to/dir2 /path/to/dir2/file1 /path/to/file3 2.2.1 often logs... Sun Jan 22 02:24:44 2017 Normal: Rsyncing list /path/to/dir1 /path/to/dir1/file1 /path/to/dir1/file1 /path/to/dir1/file2 /path/to/dir1/file2 /path/to/dir2 /path/to/dir2/file1 /path/to/file3 /path/to/file3 /path/to/file3 This makes me SUSPECT that the issue might be a race condition between multiple local rsync processes being fired off, so that when deletions are enabled they're essentially pulling files out from under each other. I've seen references to a "max processes" config file directive, in lsync's github Issues area... I'm going to look into that and test whether it has any effect on this issue with 2.2.1. Alas, adding "maxProcesses = 1" to my config file's settings {} stanza has no effect, lsyncd-2.2.1 still crashes when attempting to sync deletions. Created attachment 1243306 [details] testcase: lsynctest.conf — lsync config file used to reproduce bug Re: comment #7, I can provide steps to reproduce this error. (Ignoring the exclusion possibility, as I'm seeing this with regular file trees — local offline archives of HTML pages — that do need to be synced, so exclusion is not an option.) 1. Store the attached lsyncd config file as /var/tmp/lsynctest.conf 2. `mkdir /var/tmp/lsynctest` on both source and destination machines 3. `lsyncd --nodaemon /var/tmp/lsynctest.conf` 4. Once lsyncd has started up, on the source machine run: `cp -R /etc/systemd /var/tmp/lsynctest/` 5. Wait until lsyncd syncs all of the files in /var/tmp/lsynctest/ 6. `rm -r /var/tmp/lsynctest/systemd/system/multi-user.target.wants` 7. Wait for lsyncd to attempt to sync the deletion, and... boom. file has vanished: "/var/tmp/lsynctest/systemd/system/multi-user.target.wants" ABORTING due to invalid path from sender: systemd/system/multi-user.target.wants/ModemManager.service rsync error: protocol incompatibility (code 2) at generator.c(1271) [generator=3.1.2] 05:05:09 Error: Failure (list): 2 05:05:09 Error: Critical exitcode. (NB: In step 4, /etc/systemd could be any other directory tree with a little depth to it. We need a subdirectory populated with multiple files, so we can delete it in step 6.) Whoops, I left my destination machine's .local name in the config file as the host="" parameter. You should edit that accordingly, of course. I've just run this by the upstream developers as well, see: https://github.com/axkibe/lsyncd/issues/422 Thanks guys for testing this and reporting the issues. Thanks Frank, I saw your report in my github feed. I'll follow up with Axel as well if we don't hear anything in the near future. JT Frank or Charles, Any updates on this? Axel responded and it looks like lsyncd is functioning as expected and the error being seen is potentially related to .bashrc[0]. [0]https://github.com/axkibe/lsyncd/issues/422 Yeah, I saw that response... I can't deny that environmental factors are a POSSIBILITY, but nevertheless, 2.1.5 exhibits no problems at all with the same rsync version, in the exact same user environment. So, my response to "must be a bad config" would be, "config worked fine with 2.1.5". But I did want to run some experiments under newly-created, minimal user accounts, to explore/eliminate that possibility. I have not had the chance to do so yet, I'm afraid. I'll see if I can fit it in tomorrow (Sunday) or Monday. (In reply to "FeRD" (Frank Dana) from comment #19) > Yeah, I saw that response... I can't deny that environmental factors are a > POSSIBILITY, but nevertheless, 2.1.5 exhibits no problems at all with the > same rsync version, in the exact same user environment. So, my response to > "must be a bad config" would be, "config worked fine with 2.1.5". Same here. (In reply to Jason Taylor from comment #18) > Axel responded and it looks like lsyncd is functioning as expected and the > error being seen is potentially related to .bashrc[0]. That makes no sense--there is no remote rsyncing happening in my case, and the local rsync runs as root. I have not modified root's bashrc, but the default one does source /etc/bashrc, so it would seem rsync could be at the mercy of Fedora's default bashrc configuration. If this truly is causing the problem, it is a very fragile setup and lsyncd should be fixed to not rely on the bashrc configuration. I have a hunch that this problem has always existed, even with 2.1.5, but that I worked around it by excluding rapidly-changing directories like the mozilla cache and other caches, and that 2.2.1 somehow broke exclusion or one_file_system as I said in comment #6 and comment #7. lsyncd-2.2.2-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-4085621c29 Charles and Frank, Axel released version 2.2.2 which should resolve the issues described above. The 2.2.2 packages have been pushed to testing. Please test and post the outcomes when you all get a chance. Thanks in advance! JT lsyncd-2.2.2-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0214bf1405 lsyncd-2.2.2-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-4085621c29 lsyncd-2.2.2-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-0214bf1405 (In reply to Jason Taylor from comment #22) > Axel released version 2.2.2 which should resolve the issues described above. > The 2.2.2 packages have been pushed to testing. Looks much better..I have been running it for the last 5 days and it hasn't crashed. Thanks! lsyncd-2.2.2-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report. lsyncd-2.2.2-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report. |