From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021218

Description of problem:
I've enabled dir_index and run fsck -fD on a large filesystem of one of my boxes, then did some rsync -acnv testing against backup copies on three other hosts in my network. It turned out that rsync, without -n, would have removed the contents of some directories in the remote machines, which means rsync running on the local machine missed them. It missed exactly 3 directories, one per machine, from a list of 660k files in 38k directories, totalling 17GB.

I found 3 occurrences of the following message in /var/log/messages: kernel: VFS: brelse: Trying to free free buffer. Their timestamps seem to match the relative position of the directories in the entire tree, so I'm inclined to believe they're related. I haven't tried to disable dir_index and repeat the test, but I've seen these messages in the console of the other machines (some of which are running the beta too, and also underwent dir_index filesystem conversion), and it seems to always happen when the machine is under high disk load.

It doesn't seem like it's actually losing any files in the source tree: if I look again, the files and directories are still there, they just weren't (apparently) reported to rsync.

In case it matters, these workloads were all on ext3 filesystems on logical volumes of volume groups built out of 2 or 4 physical volumes, depending on the machine. The logical volumes were striped by hand, i.e., consecutive 4MB logical extents live in different physical volumes, in disks.

BTW, with this hand-striped LVM, the kernel 2.4.18 wouldn't be very efficient when reading large files, since it wouldn't read from all stripes in parallel, whereas the kernel seems to do it. Now I have top performance while copying large files without sacrificing the most common workloads that benefit from large extents (as opposed to small RAID0 blocks).
Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. Enable the dir_index option in an ext3 filesystem, and fsck -fD it
2. Have an exact copy elsewhere of a relatively large tree with lots of directories
3. Run rsync -arvHSpltcn --delete to try to get the trees in sync, redirecting the output to a file

Actual Results:
You may get some `deleting ...' messages, that are quite unexpected. Watch out for `brelse: Trying to free free buffer' messages in the console too.

Expected Results:
Since the trees were originally identical, no files should be going away.

Additional info:
I'm not sure -c is actually needed to make the disk load high enough to trigger the bug (if it's actually load-related). My nightly backups will probably tell. It's possible that something as simple as du -ks or find can trigger the bug. I'm investigating. I've seen the message on both i686 and athlon uniprocessor boxes, each one running the default kernel. Would we get more info from these messages using a debug kernel?
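For anyone who wants to check for the same symptom without rsync, the comparison can be approximated locally. The following is only a rough sketch in Python, not what rsync does internally: it walks two trees and lists the paths that exist in the backup but were not seen while walking the source, which is roughly what the unexpected `deleting ...' lines correspond to. The function names list_tree and unexpected_deletions are made up for this illustration, and both trees are assumed to be reachable as local paths.

```python
import os

def list_tree(root):
    """Return the set of paths under root, relative to root."""
    seen = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            seen.add(os.path.relpath(os.path.join(dirpath, name), root))
    return seen

def unexpected_deletions(source_root, backup_root):
    """Paths present in the backup that the source traversal did not report.

    With trees that are supposed to be identical, this list should be empty.
    """
    return sorted(list_tree(backup_root) - list_tree(source_root))
```

If the two trees are known to be identical, any path returned by unexpected_deletions() points at a directory entry that the source traversal failed to report. Note that os.walk silently skips directories it cannot read unless an onerror callback is passed, so permission problems would show up the same way.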
Ok, I've been able to duplicate the problem with something as simple as several `find . | wc -l' running in parallel, and it looks more and more like the kernel messages are related to the files or directories missing from a traversal, since the end result only differs when the messages actually show up in /var/log/messages. It seems to be critical to have several processes concurrently bringing in directory information from disk. If the inodes and directory contents all fit in memory, I can't seem to duplicate the problem.
As suspected, after disabling the dir_index option in the ext3 filesystem, I couldn't duplicate the problem any longer. It appears that the errors logged by the kernel, or related problems, may leave a filesystem impossible to unmount. I had to reboot the machine in order to disable the dir_index option in the filesystems, because umount would claim it was busy, but neither lsof nor /proc/mounts mentioned anything that might prevent the filesystem from being unmounted.
Alexandre, what exactly was your recipe for reproducing this? There's one possible problem I'm aware of which might result in things missing from readdir, but that can only happen on a filesystem which is being actively modified (specifically, one in which we're growing a directory from non-indexed to indexed while it is being read.)
Running several `find . | wc -l' in parallel in a relatively large directory tree (i.e., one that wouldn't entirely fit in the cache) in an otherwise inactive filesystem would give me the problem.
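In case a scripted version of this recipe is useful, here is a minimal sketch in Python. It is an approximation: it uses threads inside one process as a stand-in for the separate find processes, and the names count_entries, parallel_counts and counts_agree are invented for this example. Each worker counts every entry under the same root, the way `find . | wc -l' would, and the results are then compared.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def count_entries(root):
    """Count every file and directory under root, like `find . | wc -l`."""
    total = 1  # find also prints the starting directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total

def parallel_counts(root, workers=4):
    """Run several traversals of the same tree concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_entries, [root] * workers))

def counts_agree(counts):
    """True when every traversal reported the same number of entries."""
    return len(set(counts)) == 1
```

On an otherwise inactive filesystem every traversal should report the same number, so counts_agree(parallel_counts("/some/large/tree")) returning False would indicate the symptom (the path here is just a placeholder). As noted above, the tree has to be too large to fit in the cache, otherwise the traversals never go to disk concurrently.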
I'm seeing something which might be related. If I do lots of stupid recursive disk access

# find / -exec file {} \;

or similar, I get

Jan 14 16:26:40 station6 kernel: VFS: brelse: Trying to free free buffer

fairly regularly.
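To line these messages up against the time of a traversal, the log can be filtered for them. This is a small sketch in Python that assumes the traditional syslog line layout shown above (month, day, time, host, then the kernel message); the pattern BRELSE_RE and the function brelse_events are names made up for this example.

```python
import re

# Matches lines shaped like:
#   Jan 14 16:26:40 station6 kernel: VFS: brelse: Trying to free free buffer
BRELSE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: "
    r"VFS: brelse: Trying to free free buffer"
)

def brelse_events(lines):
    """Return (timestamp, host) pairs for each brelse warning in lines."""
    events = []
    for line in lines:
        match = BRELSE_RE.match(line)
        if match:
            events.append((match.group("stamp"), match.group("host")))
    return events
```

Passing an open /var/log/messages file object to brelse_events() gives the timestamps to compare with when each traversal ran.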
htree has proven insufficiently stable for now; it has been backed out of our current test kernels.
I have been hearing you say this a fair amount lately. Where does this put beta users who have ext3 htree filesystems? Are there any problems with using a kernel without the htree patch on an htree-enabled filesystem?
No, there should be no problems at all. The htree indexes just look like empty space to non-htree kernels, and if a non-htree kernel modifies a directory it will clear any htree flags on the dir, so booting to an htree kernel later on won't cause problems either.