Bug 758690 - File system check - fsck - handling & behaviour with ext3 & ext4
Status: CLOSED CURRENTRELEASE
Alias: None
Deadline: 2011-12-19
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: doc-Storage_Admin_Guide
Version: 6.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Jacquelynn East
QA Contact: ecs-bugs
Reported: 2011-11-30 13:07 UTC by Andreas Vollmer
Modified: 2015-07-26 22:10 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-22 00:12:43 UTC

Description Andreas Vollmer 2011-11-30 13:07:35 UTC
Description of problem: 
This is an excellent resource for covering storage issues. Many thanks for it.

Unfortunately, the important file system check (fsck) is not covered; it is not even referenced in the index!

Under ext3, a volume must be unmounted in order to perform an fsck. Does this also apply to ext4?
Is it possible to run fsck on a mounted volume without causing corruption?
On high-availability systems, unmounting volumes is not possible because it would cause a service interruption. What would you suggest for us?
The larger the volume, the longer fsck takes, which is another reason to avoid running it: the unmount means service downtime.


Version-Release number of selected component (if applicable):
Edition 0

Actual results:


Expected results:
Please update the referenced document :-) and give the reader some hints (practical guidance) on using fsck on production systems.
If applicable, please forward this issue to the developer group for a rework of fsck.

Additional info:

Comment 3 David Howells 2011-12-07 11:59:26 UTC
Yes, it applies to ext4 for the same reason: fsck may change the on-disk filesystem image behind the kernel's back and it doesn't inform the kernel when it does so.

That said, you can do "fsck -n" on a live filesystem.  It won't make any changes, but it may report spurious errors if it encounters partially-written metadata.
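As a rough illustration of the read-only check described above, the following Python sketch builds the "fsck -n" command and decodes the exit status using the bit values listed in the fsck(8) manual page. The function names and the device path in the usage note are hypothetical, and on a mounted filesystem a non-zero status may simply reflect the spurious results mentioned above.

```python
import subprocess

# Exit-status bits documented in fsck(8); the status returned is their sum.
FSCK_STATUS_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def read_only_check_command(device):
    """Build a check command that makes no changes to the filesystem (-n)."""
    return ["fsck", "-n", device]


def decode_fsck_status(status):
    """Translate an fsck exit status into the conditions it encodes."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_STATUS_BITS.items() if status & bit]


def run_read_only_check(device):
    """Run the check (needs root) and return the decoded exit status."""
    result = subprocess.run(read_only_check_command(device))
    return decode_fsck_status(result.returncode)
```

For example, run_read_only_check("/dev/sdb1") would run the check against a hypothetical device; only the two helper functions can be exercised without root.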

Comment 4 Lukáš Czerner 2011-12-07 12:26:46 UTC
Another approach would be to take an LVM snapshot of the file system and check that instead of the live file system. Then, of course, remove the snapshot. However, that is only possible if LVM is used in the stack.
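The snapshot approach could be sketched roughly as follows. This Python helper only builds the command sequence (create the snapshot, check it, remove it) and does not run anything. The snapshot name and size are hypothetical defaults, and "e2fsck -f -n" is an assumption chosen so that the snapshot is fully checked but never modified.

```python
def snapshot_check_plan(vg, lv, snap_name="fsck_snap", snap_size="1G"):
    """Return the commands, in order, for checking an LVM snapshot.

    snap_name and snap_size are hypothetical defaults; the snapshot must be
    large enough to absorb writes made to the origin while the check runs.
    """
    snapshot_device = f"/dev/{vg}/{snap_name}"
    return [
        # Create a copy-on-write snapshot of the origin logical volume.
        ["lvcreate", "--snapshot", "--name", snap_name,
         "--size", snap_size, f"{vg}/{lv}"],
        # Force a full check of the snapshot without modifying it.
        ["e2fsck", "-f", "-n", snapshot_device],
        # Remove the snapshot afterwards so it cannot fill up.
        ["lvremove", "--force", f"{vg}/{snap_name}"],
    ]


if __name__ == "__main__":
    # Print the plan for a hypothetical volume group "vg0" and volume "data".
    for command in snapshot_check_plan("vg0", "data"):
        print(" ".join(command))
```

Printing the plan before running it makes it easy to review the volume names; errors found on the snapshot would still have to be repaired on the unmounted origin volume.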

Another "workaround" would be to remount the file system read-only. Since all pending metadata updates (and writes) are forced to disk prior to the remount, the file system should be in a consistent state (if there is no corruption, of course). Then you can do the fsck, but I would suggest doing fsck -n as well (just to be sure).

Sadly, extN does not have any online fsck tool so far, and there are no plans to write one in the near future AFAIK. On the other hand, upstream ext4 is working on a metadata checksumming feature, which should give you some assurance that your metadata are valid when working with a live file system; otherwise the file system will scream at you, possibly unmounting or remounting read-only. This should help prevent silent metadata corruption.

Thanks!
-Lukas

Comment 5 David Howells 2011-12-07 12:58:33 UTC
(In reply to comment #4)
> Another "workaround" would be to remount the file system read-only. Since all
> pending metadata updates (and writes) are forced to disk prior to the remount,
> the file system should be in a consistent state (if there is no corruption, of
> course). Then you can do the fsck, but I would suggest doing fsck -n as well
> (just to be sure).

You would still have to be very careful doing "fsck" rather than "fsck -n" on a R/O mounted filesystem.  If fsck changes things, the kernel won't know.  This may lead to the kernel returning EIO and getting confused - and if someone remounts R/W after doing the fsck then you're asking for corruption to occur in your filesystem.
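To make the caveat concrete, here is a minimal Python sketch of the remount workaround that refuses to build a plan unless fsck is run with -n. The function name, device, and mount point are hypothetical. The helper only returns the commands and does not execute them, and the final remount read-write is meant for the case where the check reports no errors.

```python
def remount_ro_check_plan(device, mountpoint, fsck_options=("-n",)):
    """Return the commands for a read-only remount followed by 'fsck -n'.

    Raises ValueError when -n is missing, because a repairing fsck would
    change the disk behind the kernel's back while the filesystem is mounted.
    """
    if "-n" not in fsck_options:
        raise ValueError("only 'fsck -n' is safe on a mounted filesystem")
    return [
        # Pending writes are flushed to disk before the remount completes.
        ["mount", "-o", "remount,ro", mountpoint],
        # Check without making any changes.
        ["fsck", *fsck_options, device],
        # Restore read-write access; intended only after a clean check.
        ["mount", "-o", "remount,rw", mountpoint],
    ]
```

Note that the read-only remount itself fails while any file on the filesystem is open for writing, so this still implies a short interruption for services that write to the volume.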

