Bug 380791

Summary: Does beagle write to files being indexed?
Product: [Fedora] Fedora
Reporter: Linus Torvalds <torvalds>
Component: beagle
Assignee: David Nielsen <gnomeuser>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 8
CC: wwoods
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-09 05:11:47 UTC

Description Linus Torvalds 2007-11-13 19:42:40 UTC
Description of problem:
When trying out beagle in the current Fedora 8 on my desktop, git occasionally
gets unhappy.

What seems to happen is that if beagle is busy indexing the tree just as git is
creating *its* indexes of cached stat data, the git index doesn't end up
matching the working tree, and a file ends up being considered "dirty" in git,
which then disables some things (eg you cannot merge into such a dirty file).

Git is fairly unusual in that it cares *deeply* about the file 'lstat()' data,
since it uses that data to match its index up in order to avoid having to
re-read the file data itself. So if some tool changes things like inode ctimes
or mtimes, git gets very unhappy.

Version-Release number of selected component (if applicable):
Current up-to-date Fedora 8, beagle-0.2.18-1.fc8


How reproducible:
Very hard to reproduce. It seems to be very timing-dependent, and beagle has to
be indexing the files *just* as git creates the index. But it has happened
several times to me now in the last day, and I seem to be able to make it
trigger more by running the git test-suite while having told beagle to only
index that one subdirectory.

Steps to Reproduce:
1. Run beagle
2. Do a lot of git operations
3. .. be very unlucky
  
Actual results:
Very occasional dirty git index file (which means that the index file that git
maintains has 'stat()' data that doesn't match what you get when actually
stat'ing the file in the working tree).

Expected results:
When git does a 'lstat()' on a file, it doesn't expect the stat data to then
later change unless the user actually changes the file! But it seems that beagle
changes some of the stat data..

Additional info:
git cares about the following stat fields being stable:
 - st_mode
 - st_mtime and st_ctime
 - st_uid/st_gid
 - st_ino
 - st_size
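
As an illustration of the check described above, here is a minimal Python sketch that snapshots exactly those lstat() fields and reports which ones differ later. The names `StatSnapshot`, `snapshot` and `changed_fields` are hypothetical helpers made up for this example; this is not git's actual code, only the same idea in miniature.

```python
import os
import tempfile
from typing import List, NamedTuple


class StatSnapshot(NamedTuple):
    """The lstat() fields from the list above (hypothetical helper type)."""
    mode: int
    mtime_ns: int
    ctime_ns: int
    uid: int
    gid: int
    ino: int
    size: int


def snapshot(path: str) -> StatSnapshot:
    """Record the stat fields of interest for one path, without following symlinks."""
    st = os.lstat(path)
    return StatSnapshot(st.st_mode, st.st_mtime_ns, st.st_ctime_ns,
                        st.st_uid, st.st_gid, st.st_ino, st.st_size)


def changed_fields(old: StatSnapshot, new: StatSnapshot) -> List[str]:
    """Return the names of the fields that differ between two snapshots."""
    return [name for name in old._fields
            if getattr(old, name) != getattr(new, name)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.txt")
        with open(path, "wb") as f:
            f.write(b"hello")
        cached = snapshot(path)
        # Any non-empty result here would make the file look "dirty".
        print(changed_fields(cached, snapshot(path)))
```

A tool that only reads the file should leave every one of these fields alone; anything that touches the inode shows up as a non-empty list.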

Does beagle do anything to the file that might change the inode? Like touch the
st_ctime by setting some file attribute or re-setting the access times, or
other details?

I'm sorry for not having wonderful debug data, but it took me a while to even
realize that triggering this odd git behaviour seems to be related to beagle.

Comment 1 Linus Torvalds 2007-11-13 20:13:19 UTC
Side note: the way I verified that it really *is* beagle, was to run one of the
tests that fails most easily in a loop:

      while ./t3404-rebase-interactive.sh -i; do echo ok; done

With beagle indexing disabled with beagle-settings, it can do this for what
appears to be forever, even if I also have something like a "find" going on in
the background (just to emulate at least something looking at the tree at the
same time).

If I enable beagle indexing, it will take a while (usually several minutes, and
several tens of successful test-runs), but eventually, it will fail on subtest
17 (most of the git tests won't ever even care about the stat information, so
the fact that it always tends to fail on a specific test just means that that
particular test happens to care - it doesn't really tell anything else).

In other words, I don't know exactly what it is in beagle that triggers the
failure, but I'm pretty sure that it really is beagle that causes it.

Comment 2 Linus Torvalds 2007-11-13 21:01:36 UTC
One more comment: running the git "filter-branch" test suite seems to be even
better at catching this, ie replacing t3404-rebase-interactive.sh in the above
loop with t7003-filter-branch.sh, together with telling beagle to just index the
test subdirectory, seems to trigger it more quickly. 

If the issue really is that beagle tries to reset the access times (or something
similar), then I would suggest that rather than doing that (which is race-prone
*and* changes st_ctime, not to mention wastes disk IO), beagle open all files it
indexes with O_NOATIME, which should avoid the atime update from the indexing in
the first place.
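
The O_NOATIME suggestion could look roughly like the following Python sketch. `read_without_atime_update` is a hypothetical helper name, and nothing here is taken from beagle's code. O_NOATIME is Linux-specific and the kernel only permits it for the file's owner or a privileged process, so the sketch falls back to a plain read-only open when it is unavailable or refused.

```python
import os
import tempfile


def read_without_atime_update(path: str) -> bytes:
    """Read a whole file while asking the kernel not to update its atime."""
    # os.O_NOATIME only exists on Linux; use 0 (no extra flag) elsewhere.
    noatime = getattr(os, "O_NOATIME", 0)
    try:
        fd = os.open(path, os.O_RDONLY | noatime)
    except PermissionError:
        # Not the owner of the file: fall back to an ordinary open.
        fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "doc.txt")
        with open(path, "wb") as f:
            f.write(b"some text to index")
        before = os.lstat(path)
        data = read_without_atime_update(path)
        after = os.lstat(path)
        # Reading never needs to touch mtime or ctime.
        print(len(data), before.st_ctime_ns == after.st_ctime_ns)
```

Because the access time is never updated in the first place, there is nothing to reset afterwards, and so no write to the inode at all.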

But I still have no idea what beagle actually does to trigger this, so the above
is just a total guess..

Comment 3 Will Woods 2007-11-13 22:02:08 UTC
In F8, beagle writes xattrs to the files it indexes. (IIRC this is a change in
behavior since F7.) Could that be the cause here?

Comment 4 Linus Torvalds 2007-11-13 22:39:04 UTC
Yes, the problem does seem to be the fact that beagle uses xattrs. 

That may seem like a clever idea, but it has a couple of downsides:

 - xattrs obviously will never work on a read-only filesystem (and don't work
on many other filesystems either, for that matter)

 - it does actually change the filesystem it indexes. In this case, the reason
git doesn't like beagle seems to be the fact that while git doesn't care about
the extended attributes themselves, just the act of writing them does actually
change the inode: it sets the "ctime" of the inode to the time of the write.

 - xattrs are slow and fairly inefficient. You cannot read-ahead on xattrs
across many files, so reading them all in after startup is prohibitively
expensive. Much more so than having a separate beagle-internal flat index file.
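
The second point above can be observed with a short Python sketch. It assumes a Linux system and a filesystem that accepts `user.*` extended attributes; the attribute name `user.demo` and the helper `stat_around_setxattr` are made up for this example and are not what beagle uses. On typical Linux filesystems such as ext4 the ctime should advance while the mtime and size stay the same, but that is filesystem behaviour, not something this sketch guarantees.

```python
import os
import tempfile
import time


def stat_around_setxattr(path, name="user.demo", value=b"1"):
    """Return (before, after) lstat results around one xattr write.

    Returns None when extended attributes are unavailable on this
    platform or filesystem.
    """
    before = os.lstat(path)
    # Wait a little so a ctime update is visible despite coarse timestamps.
    time.sleep(0.05)
    try:
        os.setxattr(path, name, value)
    except (AttributeError, OSError):
        # os.setxattr is Linux-only, and the filesystem may refuse xattrs.
        return None
    return before, os.lstat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "indexed.txt")
        with open(path, "wb") as f:
            f.write(b"file contents")
        result = stat_around_setxattr(path)
        if result is None:
            print("xattrs not supported here")
        else:
            before, after = result
            print("ctime changed:", after.st_ctime_ns != before.st_ctime_ns)
            print("mtime changed:", after.st_mtime_ns != before.st_mtime_ns)
```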

That said, git doesn't really deeply care about the ctime value of an inode. We
could easily make git just ignore changes to ctime, but git tries to be very
good at finding if a file has changed, and the ctime check is there *exactly* so
that git will see and notice a file modification even if the program that
modified it tried to hide it by setting the length and mtime back to before it
got modified. That's exactly what ctime gives you, since you cannot reset mtime
without updating ctime.
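
A small Python sketch of that scenario, under the assumption of a POSIX filesystem with reasonably fine-grained timestamps: a program overwrites a file with data of the same length and then puts the old mtime back. `modify_and_hide` is a hypothetical name used only for this illustration.

```python
import os
import tempfile
import time


def modify_and_hide(path: str, new_data: bytes) -> None:
    """Overwrite a file with same-length data, then restore its old atime/mtime."""
    old = os.lstat(path)
    if len(new_data) != old.st_size:
        raise ValueError("new data must keep the file length unchanged")
    with open(path, "r+b") as f:
        f.write(new_data)
    # Restoring the timestamps is itself an inode change, so ctime moves anyway.
    os.utime(path, ns=(old.st_atime_ns, old.st_mtime_ns))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "source.c")
        with open(path, "wb") as f:
            f.write(b"aaaa")
        before = os.lstat(path)
        time.sleep(0.05)  # make the timestamp difference observable
        modify_and_hide(path, b"bbbb")
        after = os.lstat(path)
        print("size same: ", after.st_size == before.st_size)
        print("mtime same:", after.st_mtime_ns == before.st_mtime_ns)
        print("ctime same:", after.st_ctime_ns == before.st_ctime_ns)
```

A checker that compares only size and mtime would be fooled here; one that also compares ctime is not, and that is the same check an xattr write trips.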

So I think I now understand what beagle does, and why git doesn't like it (and
why probably very few other programs will ever care - although I would not be at
all surprised if the same beagle thing basically ends up messing with some
backup applications too, for the same reason).

And right now we'll probably document that you shouldn't use beagle with git.
And if people *really* want to use it, we can make git ignore the ctime. But I
think beagle is doing something fairly intrusive and moderately stupid, and
would likely be *better* off just maintaining a special database of its own
rather than relying on xattrs.

Comment 5 Will Woods 2007-11-13 23:21:52 UTC
Beagle falls back to using its sqlite database for readonly filesystems and
other places you can't use xattrs (e.g. NFS/CIFS homedirs).

You can also disable its use of xattrs altogether. According to the FAQ (pulled
from google's cache since http://beagle-project.org/ is dead right now, grr):

"If you want to run with extended attributes disabled, set the environment
variable BEAGLE_DISABLE_XATTR. Keep in mind that beagle will run slower with
extended attribute disabled."

We should probably consider setting that variable by default if we decide to
start installing beagle by default again.

Comment 6 Linus Torvalds 2007-11-13 23:42:57 UTC
Good pointers.

And yes, I'm not at all surprised that xattrs can perform better than some
sqlite database. Most databases have all the same problems xattrs have, and then
some (ie bad access patterns and totally synchronous accesses with no effective
readahead).

The only way to get good performance from any disk reads (with current rotating
media) is to have access patterns that are amenable to readahead and big
accesses.  That's why git itself uses a single linear file for its "attribute
cache" (index). Databases are horrid, but so is filesystem metadata.

Comment 7 Bug Zapper 2008-11-26 08:24:26 UTC
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Bug Zapper 2009-01-09 05:11:47 UTC
Fedora 8 changed to end-of-life (EOL) status on 2009-01-07. Fedora 8 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.