Bug 380791
| Field | Value |
|---|---|
| Summary | Does beagle write to files being indexed? |
| Product | [Fedora] Fedora |
| Component | beagle |
| Version | 8 |
| Status | CLOSED WONTFIX |
| Severity | high |
| Priority | low |
| Hardware | x86_64 |
| OS | Linux |
| Reporter | Linus Torvalds <torvalds> |
| Assignee | David Nielsen <gnomeuser> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | wwoods |
| Doc Type | Bug Fix |
| Last Closed | 2009-01-09 05:11:47 UTC |
Description
Linus Torvalds
2007-11-13 19:42:40 UTC
Side note: the way I verified that it really *is* beagle, was to run one of the tests that fails most easily in a loop:

    while ./t3404-rebase-interactive.sh -i; do echo ok; done

With beagle indexing disabled with beagle-settings, it can do this for what appears to be forever, even if I also have something like a "find" going on in the background (just to emulate at least something looking at the tree at the same time).

If I enable beagle indexing, it will take a while (usually several minutes, and several tens of successful test-runs), but eventually, it will fail on subtest 17 (most of the git tests won't ever even care about the stat information, so the fact that it always tends to fail on a specific test just means that that particular test happens to care - it doesn't really tell anything else).

In other words, I don't know exactly what it is in beagle that triggers the failure, but I'm pretty sure that it really is beagle that causes it.

One more comment: running the git "filter-branch" test suite seems to be even better at catching this, ie replacing t3404-rebase-interactive.sh in the above loop with t7003-filter-branch.sh, together with telling beagle to just index the test subdirectory, seems to trigger it more quickly.

If the issue really is that beagle tries to reset the access times (or something similar), then I would suggest that rather than doing that (which is race-prone *and* changes st_ctime, not to mention wastes disk IO), beagle open all files it indexes with O_NOATIME, which should avoid the atime update from the indexing in the first place.

But I still have no idea what beagle actually does to trigger this, so the above is just a total guess..

In F8, beagle writes xattrs to the files it indexes. (IIRC this is a change in behavior since F7.) Could that be the cause here?

Yes, the problem does seem to be the fact that beagle uses xattrs.
That may seem like a clever idea, but it has a couple of downsides:

- xattrs obviously will never work on a read-only filesystem (and don't work on many other filesystems either, for that matter)
- it does actually change the filesystem it indexes. In this case, the reason git doesn't like beagle seems to be the fact that while git doesn't care about the extended attributes themselves, just the act of writing them does actually change the inode: it sets the "ctime" of the inode to the time of the write.
- xattrs are slow and fairly inefficient. You cannot read-ahead on xattrs across many files, so reading them all in after startup is prohibitively expensive. Much more so than having a separate beagle-internal flat index file.

That said, git doesn't really deeply care about the ctime value of an inode. We could easily make git just ignore changes to ctime, but git tries to be very good at finding if a file has changed, and the ctime check is there *exactly* so that git will see and notice a file modification even if the program that modified it tried to hide it by setting the length and mtime back to before it got modified. That's exactly what ctime gives you, since you cannot reset mtime without updating ctime.

So I think I now understand what beagle does, and why git doesn't like it (and why probably very few other programs will ever care - although I would not be at all surprised if the same beagle thing basically ends up messing with some backup applications too, for the same reason).

And right now we'll probably document that you shouldn't use beagle with git. And if people *really* want to use it, we can make git ignore the ctime. But I think beagle is doing something fairly intrusive and moderately stupid, and would likely be *better* off just maintaining a special database of its own rather than relying on xattrs.

Beagle falls back to using its sqlite database for readonly filesystems and other places you can't use xattrs (e.g. NFS/CIFS homedirs).
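
For readers who want to check the two filesystem behaviours discussed so far, here is a minimal Python sketch, assuming a Linux system and a file owned by the current user. It is not code from beagle or git, and the function names are hypothetical. The first function modifies a file, restores its length and mtime, and reports whether ctime still moved; the second reads a file with O_NOATIME, as suggested earlier in the thread, falling back to a plain open where the flag is unavailable or refused.

```python
import os
import tempfile
import time


def ctime_after_mtime_reset(path):
    """Modify a file, put its mtime back, and report which stat fields changed.

    Hypothetical helper for illustration only.
    """
    before = os.stat(path)
    time.sleep(0.05)  # step past coarse timestamp granularity (assumed < 50 ms)
    with open(path, "r+b") as f:
        data = f.read()
        f.seek(0)
        f.write(data[::-1])  # same length, (usually) different content
    # Try to hide the modification by restoring atime and mtime exactly.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    after = os.stat(path)
    return {
        "size_same": after.st_size == before.st_size,
        "mtime_same": after.st_mtime_ns == before.st_mtime_ns,
        "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
    }


def read_without_atime_update(path):
    """Read a file, asking the kernel not to update its atime.

    Hypothetical helper for illustration only. O_NOATIME is Linux-specific
    and needs file ownership (or CAP_FOWNER); otherwise use a plain open.
    """
    noatime = getattr(os, "O_NOATIME", 0)  # 0 on platforms without the flag
    try:
        fd = os.open(path, os.O_RDONLY | noatime)
    except PermissionError:
        fd = os.open(path, os.O_RDONLY)
    with os.fdopen(fd, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello beagle")
    try:
        print(ctime_after_mtime_reset(tmp.name))
        print(read_without_atime_update(tmp.name))
    finally:
        os.unlink(tmp.name)
```

On a typical Linux filesystem the first function should report an unchanged size and mtime but a changed ctime, which is the property git's ctime check relies on. The same ctime update is what the thread attributes to xattr writes, although this sketch does not write xattrs itself.
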
You can also disable its use of xattrs altogether. According to the FAQ (pulled from google's cache since http://beagle-project.org/ is dead right now, grr):

"If you want to run with extended attributes disabled, set the environment variable BEAGLE_DISABLE_XATTR. Keep in mind that beagle will run slower with extended attribute disabled."

We should probably consider setting that variable by default if we decide to start installing beagle by default again.

Good pointers. And yes, I'm not at all surprised that xattrs can perform better than some sqlite database. Most databases have all the same problems xattrs have, and then some (ie bad access patterns and totally synchronous accesses with no effective readahead).

The only way to get good performance from any disk reads (with current rotating media) is to have access patterns that are amenable to readahead and big accesses. That's why git itself uses a single linear file for its "attribute cache" (index). Databases are horrid, but so are filesystem metadata.

This message is a reminder that Fedora 8 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 8. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 8 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version.
If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 8 changed to end-of-life (EOL) status on 2009-01-07. Fedora 8 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.