Bug 963283 - e2fsck.conf makes e2fsck ignore check intervals
Summary: e2fsck.conf makes e2fsck ignore check intervals
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1167202
Depends On:
Blocks:
Reported: 2013-05-15 14:58 UTC by Till Maas
Modified: 2015-03-14 07:14 UTC

Fixed In Version: e2fsprogs-1.42.12-3.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-21 04:24:28 UTC
Type: Bug
Embargoed:


Description Till Maas 2013-05-15 14:58:16 UTC
Description of problem:
/etc/e2fsck.conf contains:
[options] 
# This will prevent e2fsck from stopping boot just because the clock is wrong
broken_system_clock = 1

This makes e2fsck ignore check intervals defined with "tune2fs -i".

Version-Release number of selected component (if applicable):
e2fsprogs-1.42.3-3.fc17

How reproducible:
always

Steps to Reproduce:
1. Use tune2fs -c 0 to disable mount-count-based checks (e.g. on a desktop system, to avoid overly frequent fsck runs)
  
Actual results:
Filesystem is never checked

Expected results:
Filesystem should be checked according to check interval

Additional info:

Comment 1 Eric Sandeen 2013-05-15 15:12:03 UTC
broken_system_clock = 1 should not in and of itself disable interval checks (barring a bug, of course)

mke2fs does now, however, disable interval checks by default.

for the fs you're looking at, what does 

# dumpe2fs -h | grep interval

say?

How have you arrived at the conclusion that broken_system_clock breaks this?

It does look like check_if_skip() might have broken logic but just wanted to double check the items above as well.

Thanks,
-Eric

Comment 2 Eric Sandeen 2013-05-15 15:16:30 UTC
FWIW, this was added because:

+* Fri Apr 20 2012 Eric Sandeen <sandeen@redhat.com> 1.42.2-5
+- Add broken system clock config to e2fsck.conf to let boot
+  continue even if system clock very wrong.

Gah, I always hated this time based check stuff.  Guess I'll have to go back & revisit it all.  Especially since I didn't put a bug number on the above change :(

Comment 3 Till Maas 2013-05-15 15:23:19 UTC
(In reply to comment #1)
> broken_system_clock = 1 should not in and of itself disable interval checks
> (barring a bug, of course)
> 
> mke2fs does now, however, disable interval checks by default.

Why is this? I always found time based checks more useful than count based checks for desktop systems.

> for the fs you're looking at, what does 
> 
> # dumpe2fs -h | grep interval
> 
> say?

# dumpe2fs -h /dev/mapper/_dev_sda1 | grep interval 
dumpe2fs 1.42.3 (14-May-2012)
Check interval:           15552000 (6 months)

> How have you arrived at the conclusion that broken_system_clock breaks this?

I looked at the source of e2fsck/unix.c:
        } else if (!broken_system_clock && fs->super->s_checkinterval &&
                   ((ctx->now - lastcheck) >=
                    ((time_t) fs->super->s_checkinterval))) {
                reason = _(" has gone %u days without being checked");
                reason_arg = (ctx->now - fs->super->s_lastcheck)/(3600*24);
                if (batt && ((ctx->now - fs->super->s_lastcheck) <
                             fs->super->s_checkinterval*2))
                        reason = 0;
        }

 
I also tried to check the filesystem manually with fsck, and it only worked after I changed the config file.
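
To make the effect of that condition concrete, here is a minimal sketch of the same test as a standalone function. It is a simplified model under stated assumptions, not the e2fsck source: the function name interval_check_due() and its parameters are hypothetical, and the battery special case is left out.

```c
#include <time.h>

/*
 * Simplified, hypothetical model of the interval test quoted above.
 * It is not the e2fsck source: interval_check_due() and its parameters
 * are illustrative names only.
 */
int interval_check_due(int broken_system_clock, unsigned int checkinterval,
                       time_t now, time_t lastcheck)
{
    /* The !broken_system_clock term short-circuits the whole test. */
    return !broken_system_clock && checkinterval != 0 &&
           (now - lastcheck) >= (time_t) checkinterval;
}
```

With the 15552000-second interval from the dumpe2fs output (180 days) and a last check 200 days in the past, this returns 1 when broken_system_clock is 0 and 0 when it is 1, which matches the behaviour reported here.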

Comment 4 Till Maas 2013-05-15 15:26:24 UTC
(In reply to comment #2)
> FWIW, this was added because:
> 
> +* Fri Apr 20 2012 Eric Sandeen <sandeen@redhat.com> 1.42.2-5
> +- Add broken system clock config to e2fsck.conf to let boot
> +  continue even if system clock very wrong.
> 
> Gah, I always hated this time based check stuff.  Guess I'll have to go back
> & revisit it all.  Especially since I didn't put a bug number on the above
> change :(

Yes, I missed a bug number as well. Launchpad also shows how to mask time-related errors specifically:
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/563618/comments/4

This might avoid the boot problems without breaking the check scheduling.

Comment 5 Eric Sandeen 2013-05-15 15:36:30 UTC
Yeah, it seems like this is too big a hammer.  I'll have to try to remember why boot was actually failing (dropping to a shell IIRC) when the clock was wrong.

the time / mount-count based checks are off by default for a couple reasons:

a) ext[34] are journaling filesystems which, barring bugs, misconfiguration, and hardware problems, don't require routine full fscks.

b) sysadmins generally prefer to schedule downtime, and not have it set randomly at mkfs time.  Unexpected, long boot times at random intervals is much less preferable than intentionally scheduled maintenance windows, if periodic checks are desired.

Comment 6 Fedora End Of Life 2013-07-03 22:22:44 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Till Maas 2013-07-04 03:00:51 UTC
Still broken in git 2cc6bb2850cde9e77d9676252b5007bb60d58565, therefore I assume it is also still broken in earlier Fedora releases.

Comment 8 Fedora End Of Life 2015-01-09 18:07:37 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 19 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Eric Sandeen 2015-02-07 19:49:41 UTC
*** Bug 1167202 has been marked as a duplicate of this bug. ***

Comment 10 Till Maas 2015-02-08 06:54:46 UTC
Still broken in commit f21542d0bf5b7a3dc97d3fe519682e239b4dc072

It is now 1.5 years since I reported this bug.

(In reply to Eric Sandeen from comment #5)

> b) sysadmins generally prefer to schedule downtime, and not have it set
> randomly at mkfs time.  Unexpected, long boot times at random intervals is
> much less preferable than intentionally scheduled maintenance windows, if
> periodic checks are desired.

I add this to my e2fsck.conf files:

allow_cancellation = true

This allows the checks to be cancelled, so they can be aborted if needed (maybe include this in the Fedora default config?). Also, if sysadmins do not want filesystems to be checked, they can disable it at the filesystem level or manually add the entry to the config file. IMHO it is still not a sane default.

Comment 11 Eric Sandeen 2015-02-09 16:01:20 UTC
I apologize for this bug staying open for so long.

Upstream, new filesystems are no longer created with a periodic check enabled, so it hasn't been at the top of my list.  And the problems which started this before (forced fsck / boot failure when the clock was wrong) were pretty unfortunate.

I need to go back through and re-remember all the various hacks which exist to try to work around that problem, and pick the least-worst behavior when checks are enabled and the clock is wrong.

-Eric

Comment 12 Fedora Update System 2015-02-17 21:24:31 UTC
e2fsprogs-1.42.12-2.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/e2fsprogs-1.42.12-2.fc21

Comment 13 Fedora Update System 2015-02-17 21:30:39 UTC
e2fsprogs-1.42.12-2.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/e2fsprogs-1.42.12-2.fc20

Comment 14 Fedora Update System 2015-02-19 02:57:28 UTC
Package e2fsprogs-1.42.12-2.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing e2fsprogs-1.42.12-2.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-2241/e2fsprogs-1.42.12-2.fc21
then log in and leave karma (feedback).

Comment 15 Fedora Update System 2015-02-21 04:24:28 UTC
e2fsprogs-1.42.12-2.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Fedora Update System 2015-02-24 17:55:51 UTC
e2fsprogs-1.42.12-3.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/e2fsprogs-1.42.12-3.fc21

Comment 17 Fedora Update System 2015-03-04 10:23:01 UTC
e2fsprogs-1.42.12-3.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Jonathan S 2015-03-12 20:23:29 UTC
This update seems to cause a serious problem.

I have two Fedora 21 computers - one dual-boots to Windows and so uses local-time, the other uses UTC.

The computer using UTC is fine.
However, since this update, the dual-booter using local-time reports at every boot "Superblock last write time is in the future" and therefore proceeds to perform a fsck at *every* boot. This has happened every time (10 times) since this update was applied. This error *never* happened before.

I'm in the Netherlands (currently UTC+1), so I assume the check is made under UTC looking at a superblock written using UTC+1. If this assumption is correct, all local-time machines in timezones ahead of UTC will get fsck'd at every boot.

I notice:
http://forums.fedoraforum.org/showthread.php?p=1726736
The first poster is in Madrid (also UTC+1), using Fedora 21, and note he says everything was fine until March 5th, which accords exactly with the release of this update.

Comment 19 Eric Sandeen 2015-03-12 20:29:59 UTC
> This update seems to cause a serious problem.

Hohum, that's what the testing repo is for :(

Now I'm between a rock and a hard place; the old commit tried to fix this problem, but broke all time-based checking.  With it reverted, you get checks on every boot.

Well, please file a bug, and we'll try to give it one more round, though I'm not sure there's any real hope of a fix for time-based checks if we can't trust the clock.

In the meantime, you can work around it with:

broken_system_clock = 1

in /etc/e2fsck.conf

That may just be what you need to do in your case...

-Eric

Comment 20 Jonathan S 2015-03-12 20:41:38 UTC
FURTHER INFO

Seems that the ultimate cause may be the changing of system-time when switching root at startup. The first fsck of the root drive occurs before switching root (does it update the superblock last write time??). Then, after switching root, all drives are checked again - at which time fsck reports "Superblock last write time is in the future."

Extract of 'journalctl -t systemd-fsck' for ONE boot. (PC BIOS time = 21:10, which is really Netherlands local-time - that is, what my watch says!):

-- Reboot --
Mar 12 22:10:36 -REDACTED- systemd-fsck[262]: My_Passport: clean, 195374/60661760 files, 11677452/242646016 blocks
Mar 12 21:10:43 -REDACTED- systemd-fsck[372]: My_Passport: Superblock last write time is in the future.
Mar 12 21:10:43 -REDACTED- systemd-fsck[372]: (by less than a day, probably due to the hardware clock being incorrectly set).  FIXED.
Mar 12 21:11:24 -REDACTED- systemd-fsck[372]: My_Passport: 195374/60661760 files (1.0% non-contiguous), 11677452/242646016 blocks

Note the first time is one hour ahead of PC BIOS time - the correct time only gets applied when switching root (when the journal is stopped and restarted with the correct time).
Looks like Fedora assumes it's using UTC (and applying one hour ahead to get 22:10) before it rights itself later in the boot process.
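
The arithmetic behind this can be sketched as follows. This is only a model under assumptions: that the hardware clock holds local time, that the early boot stage reads it as UTC, and that the superblock is written during that early stage (the open question above). All function names are hypothetical.

```c
#include <time.h>

/*
 * Hypothetical model of the two fsck passes described above. The RTC holds
 * local time, but the early (initramfs) stage is assumed to read it as UTC.
 */

/* Clock seen by the early stage: the RTC value taken as UTC, unadjusted. */
time_t early_stage_now(time_t rtc_reading)
{
    return rtc_reading;
}

/* Clock after correction: the RTC value minus the local UTC offset. */
time_t corrected_now(time_t rtc_reading, long utc_offset_seconds)
{
    return rtc_reading - (time_t) utc_offset_seconds;
}

/*
 * Seconds by which a superblock written in the early stage appears to lie
 * in the future once the clock is corrected. Positive means "in the future".
 */
long apparent_future_skew(time_t rtc_reading, long utc_offset_seconds)
{
    time_t written = early_stage_now(rtc_reading);
    time_t now = corrected_now(rtc_reading, utc_offset_seconds);
    return (long) (written - now);
}
```

For UTC+1 (offset 3600) the skew is +3600 seconds, i.e. one hour in the future, which fits the journal timestamps above. For a timezone behind UTC the skew is negative, so under this model only local-time machines ahead of UTC would be affected.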

Comment 21 Till Maas 2015-03-12 21:10:34 UTC
(In reply to Eric Sandeen from comment #19)

> Well, please file a bug, and we'll try to give it one more round, though I'm
> not sure there's any real hope of a fix for time-based checks if we can't
> trust the clock.

IMHO the real problem here is that fsck runs twice, once in the initramfs and once afterwards. Maybe this can just be fixed? Also, the initramfs could be fixed to have the proper time instead of just assuming BIOS time is UTC.

Comment 22 Eric Sandeen 2015-03-13 14:16:17 UTC
Perhaps.  The other problem is that e2fsprogs is supposed to ignore deltas within 24h, but it seems to only go one way (i.e. if you are up to 24h ahead, it's ok, but not if you are behind, or is it vice versa).

Anyway, if you open a new bug I'd appreciate it, and we'll work this out there, since this bug is now closed.

Thanks,
-Eric
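
For reference, a tolerance that ignores deltas within 24h in both directions could be sketched like this. It is a hypothetical helper for illustration, not e2fsprogs code; within_fudge() and FUDGE_SECONDS are made-up names.

```c
#include <time.h>

#define FUDGE_SECONDS (24L * 3600L)  /* the 24h window mentioned above */

/*
 * Hypothetical symmetric check, not e2fsprogs code: returns 1 when "stamp"
 * is within FUDGE_SECONDS of "now", whether it lies ahead of or behind it.
 */
int within_fudge(time_t stamp, time_t now)
{
    time_t delta = stamp > now ? stamp - now : now - stamp;
    return delta <= (time_t) FUDGE_SECONDS;
}
```

A one-hour difference is accepted whether the timestamp is ahead of or behind the clock, while a difference of more than 24h is rejected in either direction.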

Comment 23 Till Maas 2015-03-14 07:14:37 UTC
Bug about dracut using the wrong time:
https://bugzilla.redhat.com/show_bug.cgi?id=1201978

