522969 – ext3 superblock in future

Summary: ext3 superblock in future

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	e2fsprogs
Sub Component:
Version:	rawhide
Hardware:	i686
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Eric Sandeen
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:	https://fedoraproject.org/wiki/Common...
Duplicates (2):	523378 529647
Depends On:
Blocks:	F12Blocker, F12FinalBlocker

Reported:	2009-09-12 20:57 UTC by shmuel siegel
Modified:	2010-04-04 14:27 UTC
CC List:	12 users
Fixed In Version:
Clone Of:	441107
Environment:
Last Closed:	2009-10-21 17:15:39 UTC
Type:	---
Embargoed:
Dependent Products:

Description shmuel siegel 2009-09-12 20:57:01 UTC

+++ This bug was initially created as a clone of Bug #441107 +++

The problem went away in fc9 but is now back. Each reboot complains that the superblock is in the future and forces an fsck - very bad for a laptop. The problem is probably related to my system clock being on local time, but the boot sequence and file maintenance should be using the same definition of time.

Comment 1 Eric Sandeen 2009-09-14 17:12:50 UTC

How far off is it? Is it the delta between local time and UTC?

Just to be sure about what you mean when you say system clock - do you mean the hardware clock?  Is it on UTC or local time?

May be an initscripts problem ...

-Eric

Comment 2 shmuel siegel 2009-09-14 20:47:51 UTC

system-config-date says that I am located in UTC+2. With daylight time that makes it UTC+3. The dialog box says that the system clock is NOT using UTC. That makes a 3 hour difference, the same as my problem. The filesystem check gets "now" right according to my local time but thinks that the superblock is 3 hours in the future. It is as if it thinks that the time in the superblock is UTC, so it converts it to local time.
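
As a rough illustration of the arithmetic in this comment, here is a minimal sketch in C. It is not taken from e2fsck or initscripts; the function names are made up for the example, and it assumes that daylight time adds exactly one hour and that the hardware clock holds local wall time which is read as if it were UTC.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: local-time offset from UTC in seconds.
 * Assumption: daylight time adds exactly one hour. */
long utc_offset_seconds(int zone_hours, bool dst_active)
{
    return (zone_hours + (dst_active ? 1 : 0)) * 3600L;
}

/* Hypothetical helper: what the system clock reads when a hardware
 * clock ticking local time is taken at face value as UTC. */
time_t misread_system_time(time_t true_utc, long offset_seconds)
{
    return true_utc + (time_t)offset_seconds;
}

/* Positive result: the stamp appears that many seconds in the future
 * relative to "now"; negative: it appears in the past. */
long seconds_in_future(time_t stamp, time_t now)
{
    return (long)(stamp - now);
}
```

For UTC+2 with daylight time in effect, `utc_offset_seconds(2, true)` gives 10800 seconds, which is the three hours reported here. A timestamp written with the misread clock and later compared against the corrected clock appears ahead by the same amount.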

Comment 3 Eric Sandeen 2009-09-14 21:10:22 UTC

If you can reproduce this, maybe you can edit /etc/rc.d/rc.sysinit and, around where it says "Checking filesystems", have it echo the date?

Maybe something like:

+       echo -n "Date: "; date
+       echo -n "UTC Date: "; date -u
        STRING=$"Checking filesystems"
        echo $STRING
        fsck -T -t noopts=_netdev -A $fsckoptions

There's a big ol' comment in e2fsck code:

        /*
         * Some buggy distributions (such as Ubuntu) have init scripts
         * and/or installers which fail to correctly set the system
         * clock before running e2fsck and/or formatting the
         * filesystem initially.  Normally this happens because the
         * hardware clock is ticking localtime, instead of the more
         * proper and less error-prone UTC time.  So while the kernel
         * is booting, the system time (which in Linux systems always
         * ticks in UTC time) is set from the hardware clock, but
         * since the hardware clock is ticking localtime, the system
         * time is incorrect.  Unfortunately, some buggy distributions
         * do not correct this before running e2fsck.  If this option
         * is set to a boolean value of true, we attempt to work
         * around this situation by allowing the superblock last write
         * time, last mount time, and last check time to be in the
         * future by up to 24 hours.
         */

and maybe we've regressed a bit in that respect.  I'll see if I can replicate it here; which versions of e2fsprogs and initscripts are you running?

-eric

Comment 4 Eric Sandeen 2009-09-14 21:19:59 UTC

Hmm actually I think this thread is relevant:

http://www.spinics.net/lists/linux-ext4/msg15354.html

and this sums it up:

http://www.spinics.net/lists/linux-ext4/msg15374.html

Can you set your system clock to UTC?  I think that would solve it ...

-Eric

Comment 5 shmuel siegel 2009-09-14 23:26:30 UTC

I think the above thread is correct. At least the logic explains the system.

Step 1 - Set the system clock from the hardware clock, i.e. local time
Step 2 - Run journal repair, which updates the superblock
Step 3 - Set the system clock to UTC
Step 4 - Run fsck

Step 3 should not be between steps 2 and 4. Best is if it is before step 2. If it is after step 4, you would run into the problem that you mentioned in comment 3, but that wouldn't affect me, only the people west of UTC.
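
The effect of the ordering in steps 1 through 4 can be sketched with a small model in C. This is only an illustration under stated assumptions: `struct boot_clock` and `fsck_future_delta` are hypothetical names, no time is assumed to pass between the steps, and nothing here reflects the actual initscripts or kernel code.

```c
#include <stdbool.h>
#include <time.h>

/* Illustrative model of the clock during boot; not real initscripts state. */
struct boot_clock {
    time_t true_utc;    /* the actual UTC time */
    long   rtc_offset;  /* local time minus UTC, in seconds */
    bool   corrected;   /* has the step 3 correction run yet? */
};

/* What the system clock currently reads in this model. */
static time_t boot_clock_now(const struct boot_clock *c)
{
    return c->corrected ? c->true_utc : c->true_utc + (time_t)c->rtc_offset;
}

/*
 * Returns how many seconds the superblock write time appears to lie in
 * the future when fsck runs. With fix_before_replay false, the clock is
 * corrected between journal replay and fsck (the order in steps 1-4);
 * with it true, the correction is moved ahead of journal replay.
 */
long fsck_future_delta(time_t true_utc, long rtc_offset, bool fix_before_replay)
{
    struct boot_clock c = { true_utc, rtc_offset, false };  /* step 1 */
    time_t s_wtime;

    if (fix_before_replay)
        c.corrected = true;        /* step 3 moved ahead of step 2 */
    s_wtime = boot_clock_now(&c);  /* step 2: replay stamps the superblock */
    c.corrected = true;            /* step 3 in its original position */
    return (long)(s_wtime - boot_clock_now(&c));  /* step 4: fsck compares */
}
```

In this model a machine three hours east of UTC sees the superblock 10800 seconds ahead with the original ordering and 0 seconds ahead when the correction comes first. A machine west of UTC gets a negative delta, so the stamp looks old rather than in the future.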

Comment 6 Eric Sandeen 2009-09-15 15:58:42 UTC

Bill, can you shed any light on the initscript sequences of time-setting vs. when fsck runs?

Thanks,
-Eric

Comment 7 Bill Nottingham 2009-09-15 16:32:57 UTC

Time setting is done via udev, should be before fsck, but after (root, r/o) mount.

Comment 8 Eric Sandeen 2009-09-16 15:33:55 UTC

Ted just sent a patch for this same problem, to handle it in kernelspace:

[PATCH] ext3: Don't update superblock write time when filesystem is read-only

This avoids updating the superblock write time when we are mounting
the root file system read/only but we need to replay the journal; at
that point, for people who are east of GMT and who make their clock
tick in localtime for Windows bug-for-bug compatibility, the clock is
set in the future, and this will cause e2fsck to complain and force a
full file system check.

Signed-off-by: "Theodore Ts'o" <tytso>
---

 fs/ext3/super.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index a8d80a7..62c86af 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -2321,7 +2321,18 @@ static int ext3_commit_super(struct super_block *sb,
 
 	if (!sbh)
 		return error;
-	es->s_wtime = cpu_to_le32(get_seconds());
+	/*
+	 * If the file system is mounted read-only, don't update the
+	 * superblock write time.  This avoids updating the superblock
+	 * write time when we are mounting the root file system
+	 * read/only but we need to replay the journal; at that point,
+	 * for people who are east of GMT and who make their clock
+	 * tick in localtime for Windows bug-for-bug compatibility,
+	 * the clock is set in the future, and this will cause e2fsck
+	 * to complain and force a full file system check.
+	 */
+	if (!(sb->s_flags & MS_RDONLY))
+		es->s_wtime = cpu_to_le32(get_seconds());
 	es->s_free_blocks_count = cpu_to_le32(ext3_count_free_blocks(sb));
 	es->s_free_inodes_count = cpu_to_le32(ext3_count_free_inodes(sb));
 	BUFFER_TRACE(sbh, "marking dirty");

Seems like the simplest approach, I can push that to Fedora after a bit of review, and after it makes it upstream.

-Eric

Comment 9 shmuel siegel 2009-09-16 22:47:51 UTC

I hope that this is considered a temporary patch since it seems like a bad strategy. The basic problem, as I see it, is that the clock changes between journal recovery and fsck time. Every internal mechanism on the system should be using the same definition of time. This patch is trying to circumvent a design problem but leave the problem intact. Is that really a good idea? Is it proper to not update the modification time of the superblock? Can the system determine the proper setting of the clock before applying the journal?

Comment 10 Eric Sandeen 2009-09-17 03:01:48 UTC

To be honest, I have some reservations about the patch as well, but I've not got a ton of time to dig into the issue... 

As for figuring out the timezone prior to root fs journal replay...  I'm not sure.  I suppose that the initrd would need this information... perhaps that is possible.

Maybe it's worth bringing up on the linux-ext4 list ...

Ted may say that the real solution is to set the hardware clock to UTC, and I might tend to agree with that.  I don't know what all is involved w/ setting up the proper time, and how much would need to be stuffed into the initrd...

Comment 11 shmuel siegel 2009-09-17 05:04:05 UTC

Setting the hardware clock to UTC might be a solution to this problem but it introduces a host of others, mostly revolving around removing a user choice that has been around for a long time. But there is one thing for sure, you can't offer an option which doesn't work. That is what we have now.

Comment 12 Yanko Kaneti 2009-10-04 22:44:31 UTC

I am in UTC+3 and all the hardware clocks around me are running on localtime. I've been doing routine yum dist upgrades of all these machines without many issues since FC6. With this new e2fsck/initscripts setup, every unclean shutdown would result in a boot failure, for no good reason. Gigantic fail.

Comment 13 Yanko Kaneti 2009-10-04 23:01:05 UTC

Unless I am mistaken this will also affect everyone in UTC+ with hardware clock on localtime that is upgrading via the supported methods. This needs to be a F12Blocker.

Comment 14 Eric Sandeen 2009-10-05 03:57:41 UTC

Do you know what changed to suddenly make this a problem - initscripts or e2fsprogs?

Guess I need to take some time to try to sort it out but if you know more or less when things started going bad, I'd appreciate the tip ;)

Thanks,
-Eric

Comment 15 Yanko Kaneti 2009-10-05 06:08:13 UTC

It's a change between e2fsprogs 1.41.8 and 1.41.9. In Fedora package terms, between
e2fsprogs-1.41.8-5.fc12 (+ libs) and e2fsprogs-1.41.9-1.fc12 (+ libs). Everything else is latest rawhide.
If you press the reset button with 1.41.8 you get something like
/dev/mapper/test-root: Superblock last write time is in the future.  FIXED.
If you do the same with 1.41.9, the superblock write time diff becomes fatal and you are dropped into an emergency shell.

Comment 16 Eric Sandeen 2009-10-05 17:03:57 UTC

Ok, thanks for the confirmation that it's in e2fsprogs; thanks too for grabbing my attention - I'll go see what changed.  :)

-Eric

Comment 17 Eric Sandeen 2009-10-05 19:13:20 UTC

As a temporary workaround Ted reminded me that there is an e2fsck.conf which can be placed in /etc - putting this in:

[problems]
        0x000031 = {
                preen_ok = true
        }
        0x000032 = {
                preen_ok = true
        }

may make e2fsck less picky about times in the future ...

-Eric

Comment 18 Robert Laverick 2009-10-05 22:16:24 UTC

FWIW I'm seeing this issue too, with the timezone set to London and the clock set to UTC; since it's still summer time here, I'm running at +1. I've been working around the problem by keeping my clock in local time (which I think Windows expects anyway when I'm dual booting).

Comment 19 Eric Sandeen 2009-10-06 02:29:48 UTC

I talked w/ Ted today and I think we have a plan to just make e2fsck silently eat up to 24h of mismatch.

Are you guys seeing it drop to a shell for manual intervention, or is it just doing a full automatic check?

thanks,
-Eric
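
The plan to tolerate up to 24 hours of mismatch could look roughly like the following C sketch. It is not e2fsck's actual code: `classify_superblock_time`, `enum time_verdict` and `TIME_FUDGE_SECONDS` are hypothetical names chosen for the example, and what a real checker does in each case (stay quiet, fix the stamp, force a full check) is left out.

```c
#include <time.h>

/* Assumed tolerance: 24 hours, as discussed in this comment. */
#define TIME_FUDGE_SECONDS (24LL * 60 * 60)

enum time_verdict {
    TIME_OK,         /* stamp is not in the future */
    TIME_TOLERATED,  /* in the future, but by no more than the tolerance */
    TIME_IN_FUTURE   /* in the future by more than the tolerance */
};

/* Compare a superblock timestamp against the current time. */
enum time_verdict classify_superblock_time(time_t stamp, time_t now)
{
    long long ahead = (long long)stamp - (long long)now;

    if (ahead <= 0)
        return TIME_OK;
    if (ahead <= TIME_FUDGE_SECONDS)
        return TIME_TOLERATED;
    return TIME_IN_FUTURE;
}
```

With a tolerance of this size, the three-hour skew reported earlier in this bug lands in `TIME_TOLERATED`, while a stamp several days ahead is still classified as `TIME_IN_FUTURE`.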

Comment 20 Yanko Kaneti 2009-10-06 03:25:53 UTC

(In reply to comment #19)
> Are you guys seeing it drop to a shell for manual intervention, or is it just
> doing a full automatic check?

It's dropping into a shell. If I then just reboot again without doing anything, the next boot automatically goes to the forced check.

Comment 21 shmuel siegel 2009-10-06 07:11:46 UTC

(In reply to comment #19)
> Are you guys seeing it drop to a shell for manual intervention, or is it just
> doing a full automatic check?

(From comment #20)
> It's dropping into a shell. If I then just reboot again without doing anything,
> the next boot automatically goes to the forced check.
 
Just to make sure that you know that it is not just one machine, I see the same symptoms.

Comment 22 Eric Sandeen 2009-10-06 15:25:09 UTC

Blocking F12Blocker; we really need to fix this pre-f12 (could make it block the beta but I can't guarantee I'll get it done prior to that w/ my upcoming personal schedule, sorry).  The change shouldn't be invasive.

Comment 23 Yanko Kaneti 2009-10-19 10:33:18 UTC

*** Bug 529647 has been marked as a duplicate of this bug. ***

Comment 24 Eric Sandeen 2009-10-19 16:00:44 UTC

Ted ultimately decided on the patch he wanted to merge to fix this and sent it to the list; unfortunately it's not yet pushed to the upstream repo - I may just need to pull the patch in on the assumption that it -will- get merged upstream.

Thanks,
-Eric

Comment 25 Eric Sandeen 2009-10-19 18:01:41 UTC

Ok, the patch is checked in & built for F-12 and rawhide.

Testing (either from rawhide or the koji build from F12) would be great!

Thanks,

-Eric

Comment 26 Yanko Kaneti 2009-10-20 05:00:05 UTC

/dev/mapper/test-root: Superblock last mount time is in the future.
	(by less than a day, probably due to the hardware clock being incorrectly set)  FIXED.
/dev/mapper/test-root: clean, 48442/655360 files, 311503/2621440 blocks
...<normal boot>

Works for me. A bit presumptuous...

Comment 27 Eric Sandeen 2009-10-20 05:07:15 UTC

Thanks for the testing.  I'm ignoring the presumptuous part - an awful lot of comments & warnings & verbiage etc, all for one small, reasonable change in behavior, but oh well ;)

-Eric

Comment 28 Adam Williamson 2009-10-21 17:15:39 UTC

has been tagged into F12 final. Closing bug. Shmuel, please re-open if you still have issues with e2fsprogs-1.41.9-5.fc12 or later.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 29 Joachim Frieben 2009-10-22 16:03:57 UTC

*** Bug 523378 has been marked as a duplicate of this bug. ***

Comment 30 shmuel siegel 2009-10-25 06:45:23 UTC

(In reply to comment #28)
> has been tagged into F12 final. Closing bug. Shmuel, please re-open if you
> still have issues with e2fsprogs-1.41.9-5.fc12 or later.
> 
This bug was opened against a user experience and the patch solves the problem so I am not going to reopen the bug. But I still contend that from a software development point of view it is the wrong patch. The system clock should not be reset between accesses. The clock should be set to local time before the journal is run. It is unacceptable that two different processes see a different definition of time.

Comment 31 Eric Sandeen 2009-10-26 14:51:59 UTC

Shmuel, I don't necessarily disagree with you ... in some ways this was just the most expedient path to resolution for F12.

If you want to open a new bug in rawhide against the component you think is the most likely root cause, that'd be helpful.

Thanks,
-Eric
