Description of problem:
When anaconda formats an ext4 filesystem, it sets the Maximum mount count to -1, disabling the filesystem check that would otherwise run after the maximum mount count has been reached. This can be dangerous, especially on a journalling filesystem, which really never gets marked as being dirty. While journal recovery can prevent a lot of filesystem errors, errors still do occur, and require periodic checking to find them.

Version-Release number of selected component (if applicable):
This is happening on F18, but I also notice my F17 drives formatted by anaconda are the same way.

How reproducible:
Format an ext4 filesystem with anaconda. It sets the Maximum mount count to -1.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:
It should set the Maximum mount count, so the filesystem will be periodically checked. It should especially set the Maximum mount count on the root (/) and other system filesystems, since they can't be easily checked manually once the system is booted.

Additional info:
tune2fs -l output of the root (/) filesystem after an F18 install where anaconda formatted it.
-----
[root@tower20 schemas]# tune2fs -l /dev/sdb4
tune2fs 1.42.5 (29-Jul-2012)
Filesystem volume name:   <none>
Last mounted on:          /
Filesystem UUID:          1eb839b4-d6d3-4433-bc7f-b59af7012ba8
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              3276800
Block count:              13107200
Reserved block count:     655360
Free blocks:              10672473
Free inodes:              2955437
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1020
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Mon Nov 12 20:24:50 2012
Last mount time:          Thu Nov 22 07:13:48 2012
Last write time:          Thu Nov 22 07:13:44 2012
Mount count:              40
Maximum mount count:      -1
Last checked:             Mon Nov 12 20:24:50 2012
Check interval:           0 (<none>)
Lifetime writes:          27 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      d887244d-19e3-49d1-a457-63dad58131c6
Journal backup:           inode blocks
-----
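The fields that matter here are buried in a long dump; they can be pulled out with a short grep. Below is a sketch using the values from the dump above as canned input (on a live system, you would pipe `tune2fs -l /dev/sdb4` into the filter instead of the heredoc):

```shell
# Filter a tune2fs -l dump down to the periodic-check settings.
check_fields() {
  grep -E '^(Mount count|Maximum mount count|Check interval):'
}

# Canned input taken from the dump above; -1 and 0 mean both the
# mount-count check and the time-interval check are disabled.
check_fields <<'EOF'
Mount count:              40
Maximum mount count:      -1
Check interval:           0 (<none>)
EOF
```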
IIUC, anaconda is using the defaults for extN file systems. The subject was discussed by the experts in this bug, which was closed 2011-02-15:

Bug 649089 - anaconda should not disable automatic filesystem checks on journaled ext3/4

See also:

$ grep enable_periodic_fsck /etc/mke2fs.conf
enable_periodic_fsck = 0
$ rpm -qf /etc/mke2fs.conf
e2fsprogs-1.42.3-3.fc17.x86_64
$ man mke2fs.conf

In a test[1], setting enable_periodic_fsck to 1 on the Live CD before installing resulted in a positive value for both "Maximum mount count" and "Check interval".

It would be nice if there were a better way to manage those settings. The gnome-disk-utility package might be the place to manage file system checks:
http://git.gnome.org/browse/gnome-disk-utility

[1] Tested with:
$ qemu-kvm -m 2048 -hda f18-test-3.img -cdrom ~/xfr/fedora/F18/F18-Beta/RC1/Fedora-18-Beta-x86_64-Live-Desktop.iso -usb -vga qxl -boot menu=on -usbdevice mouse
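For reference, the change described in the test above amounts to flipping one boolean in /etc/mke2fs.conf before the filesystem is created. A sketch of the relevant stanza (the surrounding defaults are abbreviated; this only affects filesystems created after the edit, not existing ones):

```
# /etc/mke2fs.conf (excerpt)
[defaults]
	# 1 restores the pre-1.42 behavior: mke2fs sets a positive
	# Maximum mount count and Check interval on new filesystems.
	enable_periodic_fsck = 1
```

Existing filesystems are unaffected by mke2fs.conf; they would need tune2fs -c / -i, as discussed later in this bug.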
That is correct - anaconda just follows the default settings here, and if those are not optimal, that is where they need to be changed. We're simply not the best place to store that knowledge and chase down what ought to be done filesystem-wise.
Then where are the defaults changed?

I know that filesystems I create (not through anaconda) have the Maximum mount count set to a number from 20-40 (approximately; I never really noted the exact values) without my having to specify it, so once Fedora is installed, something in the defaults sets it to a value other than -1.
Upstream has disabled periodic fsck for over a year and a half, or since v1.42:

commit 3daf592646b668133079e2200c1e776085f2ffaf
Author: Eric Sandeen <sandeen>
Date:   Thu Feb 17 15:55:15 2011 -0600

    e2fsprogs: turn off enforced fsck intervals by default

    The forced fsck often comes at unexpected and inopportune moments,
    and even enterprise customers are often caught by surprise when
    this happens.

    Because a filesystem with an error condition will be marked as
    requiring fsck anyway, I submit that the time-based and mount-based
    checks are not particularly useful, and that administrators can
    schedule fscks on their own time, or tune2fs the enforced intervals
    if they so choose.

    This patch disables the intervals by default, and I've added a new
    mkfs.conf option to turn on the old behavior of random, unexpected,
    time-consuming fscks at boot time. ;)

    Signed-off-by: Eric Sandeen <sandeen>
    Signed-off-by: Theodore Ts'o <tytso>

and a short while after that, the default max mount count was changed from 0 to -1 for housekeeping reasons:

commit 14b283ae565930144ef5ace12483d602cc3e7539
Author: Theodore Ts'o <tytso>
Date:   Wed Sep 28 22:45:12 2011 -0400

    mke2fs: set s_max_mnt_count to -1 by default

    If the enable_periodic_fsck option is false in /etc/mke2fs.conf
    (which is also the default), s_max_mnt_count needs to be set to -1,
    instead of 0.  Kernels newer than 3.0 will interpret 0 to disable
    periodic checks, but older kernels will print a warning message on
    each mount, which will annoy users.

    Addresses-Debian-Bug: #632637

    Signed-off-by: "Theodore Ts'o" <tytso>

You can re-enable the periodic check in mke2fs.conf if you really like it:

    enable_periodic_fsck
        This boolean relation specifies whether periodic filesystem
        checks should be enforced at boot time.  If set to true, checks
        will be forced every 180 days, or after a random number of
        mounts.  These values may be changed later via the -i and -c
        command-line options to tune2fs(8).
(In reply to comment #3)
> Then where are the defaults changed?
>
> I know that filesystems I create (not through anaconda) have the Maximum
> mount count set to a number from 20-40 (approx., I never really noticed the
> exact values) without having to specify it, so it is in the defaults once
> Fedora is installed to set it to something other than a -1.

Can you show me an example (you can mkfs a file, if you have no spare devices), along with the e2fsprogs version and the contents of mke2fs.conf?

Thanks,
-Eric
I just checked F17, and a fresh mkfs creates a filesystem with a 0 check interval and a -1 max mount count . . .
Created attachment 651049 [details]
F16 output from "tune2fs -l /dev/sda2" after clean, default, minimal install

With F16, after a clean, default, minimal install, the file system check settings are positive numbers:

[Snippet from attached F16 tune2fs output]
$ egrep 'Maximum mount count|Check interval' f16-tune2fs-l-sda2.txt
Maximum mount count:      22
Check interval:           15552000 (6 months)

Did you do an upgrade from F16? What is the output from this?

$ ls -l /etc/mke2fs*

Tested with:
$ qemu-img create f16-test-1.img 12G
$ qemu-kvm -m 2048 -hda f16-test-1.img -cdrom ~/xfr/fedora/F16/Fedora-16-x86_64-DVD.iso -usb -vga qxl -boot menu=on -usbdevice mouse
Well, as I said, the changes were made in V1.42. Your output is from V1.41, when it did default to the semi-random-e2fsck-on-boot behavior. I'm going to close this NOTABUG; it was a design decision a year or so ago. You can always tune your filesystems to force a fsck if you wish.
(In reply to comment #7)
> Well, as I said, the changes were made in V1.42. Your output is from V1.41,
> when it did default to the semi-random-e2fsck-on-boot behavior.

Eric: I am Steve, not Daniel. I was reporting the results of an F16 test install in Comment 6 to help Daniel diagnose the problem.

> I'm going to close this NOTABUG; it was a design decision a year or so ago.
> You can always tune your filesystems to force a fsck if you wish.

Daniel: I would suggest opening an RFE for gnome-disk-utility asking for file system check support. In particular, users should be able to schedule a file system check at some future time.
(In reply to comment #8)
...
> Daniel: I would suggest opening an RFE for gnome-disk-utility asking for
> file system check support. In particular, users should be able to schedule
> a file system check at some future time.

Or on the next reboot ...
(In reply to comment #7)
> ... a design decision a year or so ago ...

While I fully support the design decision, the documentation is completely inconsistent with it. The scary wording in the tune2fs man page is enough to make a reader wonder if the e2fsprogs maintainers and developers have their heads screwed on ...

$ man tune2fs
...
       -c max-mount-counts
              ...
              You should strongly consider the consequences of disabling
              mount-count-dependent checking entirely.  Bad disk drives,
              cables, memory, and kernel bugs could all corrupt a filesystem
              without marking the filesystem dirty or in error.  If you are
              using journaling on your filesystem, your filesystem will never
              be marked dirty, so it will not normally be checked.  A
              filesystem error detected by the kernel will still force an
              fsck on the next reboot, but it may already be too late to
              prevent data loss at that point.
...

e2fsprogs-1.42.3-3.fc17.x86_64
Steve: Sorry ;) As for scary stuff in the man page, I guess that could be fixed. I've just always thought the user/admin should fsck on their schedule, not e2fsck's schedule. You can always touch /forcefsck to force a full fsck on the next boot.
As the man page for tune2fs states, a journalling filesystem will never be marked dirty, so the reasoning used to disable the periodic filesystem checks is flawed. They state that a filesystem with errors will be marked as dirty and a filesystem check will be forced on boot anyway, but that is not so if you are using a journalling filesystem (like ext4, which is the default in Fedora).

If you know you have a filesystem error on your root (/) filesystem, then doing a touch /forcefsck is a bad thing to be doing. You don't want to write to a known corrupted filesystem until after it's checked. You are much better off using the kernel parameter fsck.mode=force.

I know I have had several issues with my filesystems due to a periodic check not being forced, and have now changed all of my drives to set a maximum mount count.

Even though this issue is actually not a bug per se in Fedora, I do feel it is still a problem, and opening an RFE in gnome-disk-utility won't help me one iota since I have no desire to run Gnome anymore and won't even put it on my system here.
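For a one-off check, fsck.mode=force is normally added to the kernel command line for a single boot by editing the entry from the GRUB menu. It could also be made persistent via the GRUB defaults, though that forces a check on every boot, which is rarely what you want. A sketch, assuming a Fedora system using GRUB2; the "rhgb quiet" options shown are placeholders for whatever is already on the line:

```
# /etc/default/grub (excerpt)
# fsck.mode=force tells systemd-fsck to check every filesystem at boot.
# Making it persistent here means a full check on EVERY boot; the usual
# approach is a one-time edit of the kernel line from the GRUB menu.
GRUB_CMDLINE_LINUX="rhgb quiet fsck.mode=force"
```

After editing, the GRUB configuration would need to be regenerated (e.g. with grub2-mkconfig) for the change to take effect.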
Oh, and to answer Eric's question above: my other filesystems were created under Fedora 16, so they used the older e2fsprogs before the change was implemented. Filesystems created under Fedora 17 and Fedora 18 do set the maximum mount count to -1.
(In reply to comment #12)
...
> I know I have had several issues with my filesystems due to a periodic check
> not being forced, and have now changed all of my drives set a maximum mount
> count.
...

Were you able to determine what was causing the issues with your file systems? Running fsck may be a good idea, but it does not solve the problem that is causing file system corruption.
One time it was a bad shutdown. Had a power outage. On boot, the journal was recovered, but there was filesystem corruption that wasn't caught for several boot cycles. If I had not manually forced a filesystem check, it may have gone quite a while without being detected since the periodic checks are disabled, and possibly caused a really big problem. As it was, there wasn't anything that it couldn't fix.

The only thing that causes me a problem is that when I force a filesystem check on boot, it checks all of my drives, so I have to wait until over 30TB of drive space is checked. It would be much more convenient if it periodically checked them based on the mount count, so it won't check so much at one time. It's rare that more than one filesystem hits the maximum mount count on the same boot. Or have a way to force filesystem checks on just specific filesystems instead of an "all or nothing" forced check.
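The staggering described above can be approximated by giving each filesystem a different maximum mount count, so they rarely come due on the same boot. A sketch that only prints the tune2fs invocations it would run; the device names are hypothetical, and the starting count and step are arbitrary choices (drop the echo to actually apply them as root):

```shell
# Print staggered tune2fs invocations, one per filesystem, so the
# mount-count-triggered checks rarely coincide on a single boot.
stagger_fsck_counts() {
  count=25
  for dev in "$@"; do
    # echo instead of executing: this is a dry run.
    echo tune2fs -c "$count" "$dev"
    count=$((count + 5))   # next filesystem comes due a few mounts later
  done
}

stagger_fsck_counts /dev/sdb4 /dev/sdc1 /dev/sdd1
```

With the echo removed and run as root, each filesystem would get its own -c value, so on any given boot at most one of them is likely to hit its limit.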
(In reply to comment #12)
> As the man page for tune2fs states, a journalling filesystem will never be
> marked dirty, so the reasoning to disable the periodic filesystem checks is
> flawed. They state that a filesystem with errors will be marked as dirty and
> filesystem check will be forced on boot anyway, but that is not so if you
> are using a journalling filesystem (like ext4 is, and is the default in
> Fedora)

That is not correct. If a filesystem encounters a runtime consistency error, the error flag is set on the fs, and a full fsck will be done on the next boot. "Error state" is not the same thing as dirty (i.e. not-cleanly-unmounted).

All of the runtime overhead of journaling is there to *remove* the need for boot-time fsck. Doing it at random times anyway, just out of paranoia, by default, is simply not warranted.

> If you know you have a filesystem error on your root (/) filesystem, then
> doing a touch /forcefsck is a bad thing to be doing. You don't want to write
> to a known corrupted filesystem until after it's checked.
>
> You are much better off using the kernel parameter fsck.mode=force.

Fair enough; there's more than one way to do it.

> I know I have had several issues with my filesystems due to a periodic check
> not being forced, and have now changed all of my drives set a maximum mount
> count.

Yes, you have that choice as a system administrator. Do you have the bug numbers handy for the filesystem problems you encountered? I'd like to be sure they are getting the attention they need.

> Even though this issue is actually not a bug per-se in Fedora, I do feel it
> is still a problem, and opening a RFE in gnome-disk-utility won't help me
> one iota since I have no desire to run Gnome anymore and won't even put it
> on my system here.

You are welcome to re-open the discussion upstream, if you think the mke2fs defaults need to be changed.
For what it's worth, even a power loss should not cause corruption or necessitate a boot-time fsck - that is what the journal is for, after all. (Unless you possibly are running with -o nobarrier / -o barrier=0 which could break things for you again). So it may not have been the power loss which caused the problems; you might have just been looking more carefully after that?
The tune2fs man page says all of these attributes can be changed, so it might be possible to implement a variety of file system check policies by varying them:

-c max-mount-counts
-C mount-count
-i interval-between-checks[d|m|w]
-T time-last-checked

ISTM, there should be an immutable lifetime-mount-count attribute too ...