Bug 488499 - [RFE] GFS: New mount option: -o errors=continue|remount-ro
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs-utils
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: 5.5
Assignee: Robert Peterson
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 461065 509233
Depends On: 517145
Blocks: 5.4, TechnicalNotes 515348
Reported: 2009-03-04 17:30 UTC by Corey Marthaler
Modified: 2016-04-26 15:56 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
A new mount option (-o errors=continue) will be added to the GFS file system for Red Hat Enterprise Linux version 5.5. It will not be available in prior releases. The option controls how GFS behaves in the unlikely event that a file system error occurs. The normal behaviour is to withdraw from the file system and make it inaccessible until the next reboot. If -o errors=continue is specified, the file system will report the error as a kernel error but the error will be otherwise ignored. This mount option is intended for file system developers and quality testers only and is not intended for general use.
Clone Of:
Clones: 515348 517145
Environment:
Last Closed: 2011-01-21 19:40:41 UTC
Target Upstream Version:
Embargoed:


Attachments
First crack at a patch (9.91 KB, patch)
2009-07-31 21:48 UTC, Robert Peterson
Second prototype patch (11.80 KB, patch)
2009-08-03 17:25 UTC, Robert Peterson

Description Corey Marthaler 2009-03-04 17:30:11 UTC
Description of problem:
This ability would allow QA to continue running revolver with the debug mount option (in order to acquire better debugging in case of real failures) while at the same time cause I/O errors on the filesystem as one of the failure injection methods. 

Version-Release number of selected component (if applicable):
gfs-utils-0.1.18-1.el5
kmod-gfs-0.1.31-3.el5

Comment 1 Robert Peterson 2009-03-04 18:12:27 UTC
What "better debugging" are you looking for?  Today, the "-o debug"
mount option for GFS doesn't provide additional GFS messages.  It only
causes GFS to call BUG() and dump the call stack when it does a
withdraw.  Are you saying that you don't want the subsequent kernel
panic()?

Comment 2 Corey Marthaler 2009-03-04 19:23:14 UTC
We definitely need the call stack for when legitimate corruption is detected. So if that's the only way to grab that now, then it would be nice if it didn't panic afterwards. I wonder if anyone besides QA mounts with the debug flag and what their thoughts would be about just removing the panic altogether.

Comment 3 Robert Peterson 2009-05-05 19:17:56 UTC
I need to dig into this a little deeper.  I know that RHEL4 handles
withdrawing on errors differently from RHEL5.  RHEL4 did a BUG_ON
but RHEL5 is a little more graceful.

Adding Dave T and Steve W to the cc list to get a historical
perspective on the topic.  I wasn't part of the design process so
I don't know the impact of changing this.  This seems more like a
design issue to me, although I can see its value in debugging.
Is this something that we should do for RHEL6 rather than RHEL5.x?

Comment 4 Steve Whitehouse 2009-05-06 09:24:16 UTC
The error handling in gfs is pretty poor really. I don't really expect that it will make a lot of difference what is done. As Bob says, the debug option doesn't change any of the messages which are printed, and if the fs withdraws then it usually means something pretty serious has gone wrong and there is not anything which can be done to recover from it.

We hope to improve on this in gfs2 by adding an errors= mount option along the lines of ext2/3/4 and being able to handle more of the possible fs errors by returning I/O errors to the user rather than panicking.

I'm not at all sure that there is anything that we can reasonably change in gfs though at this stage.

Comment 5 David Teigland 2009-05-06 15:19:07 UTC
There are two issues here:
1. mount option to enable extra messages (unfortunately there are very few)
2. mount option to enable panic instead of withdraw (on i/o errors)

Both make sense, and we need both (I've seen customer requests for each).
The "problem" is quite trivial: they don't have distinct mount options.  We should let "debug" mean one of them (probably 1), and add a new option for the other.

Comment 6 Robert Peterson 2009-07-30 16:07:19 UTC
*** Bug 509233 has been marked as a duplicate of this bug. ***

Comment 7 Robert Peterson 2009-07-30 16:11:17 UTC
There are really two general classes of GFS errors: (1) withdraw
problems due to file system inconsistency, and (2) run-time errors
(for example, memory corruption) that cause an assertion error.

So really there should be two mount options: ar_debug, which means
to BUG on withdraw, and -o panic_on_assert (versus BUG() on assert).

Comment 8 Robert Peterson 2009-07-31 21:48:20 UTC
Created attachment 355866 [details]
First crack at a patch

This is a first-stab and is in no way complete.  At the very least
I need to change the man pages.  Just thought I'd post what I have
so far.

This implements "errors=panic|continue|remount-ro" similar to ext3.
If none of the three are specified, the default behavior remains
unchanged from how it is today.  That can be restored with
errors=default.
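
For illustration only, the option semantics described above could be modelled as follows. This is not the attached patch (which is kernel C); the function name parse_errors_option and the rule that the last errors= occurrence wins are assumptions made for this sketch.

```python
# Hypothetical model of the errors= semantics described in this comment.
# Not taken from the attached patch.
VALID_POLICIES = {"panic", "continue", "remount-ro", "default"}


def parse_errors_option(mount_opts):
    """Return the error policy named in a comma-separated -o string.

    If no errors= option is present, "default" is returned, meaning the
    behavior is unchanged from today. Assumption: the last errors=
    occurrence wins when the option is given more than once.
    """
    policy = "default"
    for opt in mount_opts.split(","):
        opt = opt.strip()
        if opt.startswith("errors="):
            value = opt[len("errors="):]
            if value not in VALID_POLICIES:
                raise ValueError("unknown errors= value: %r" % value)
            policy = value
    return policy
```

For example, parse_errors_option("noatime,errors=panic") yields "panic", while a string with no errors= entry yields "default".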

Comment 9 Robert Peterson 2009-08-03 17:25:20 UTC
Created attachment 356067 [details]
Second prototype patch

This patch adds the missing man page elements and fixes a bug I
spotted.

Comment 10 Robert Peterson 2009-08-05 13:56:30 UTC
*** Bug 461065 has been marked as a duplicate of this bug. ***

Comment 11 Vinny Valdez 2009-08-05 14:09:46 UTC
As to comment #2, I used the debug mount option by default in all Dell cluster deployments.  The reason is that if GFS only withdraws, then the system didn't get fenced, so any running services simply stopped working, instead of failing over.  The panic resolved this by causing it to stop responding to the cluster, and get fenced.

Comment 12 Guil Barros 2009-08-05 18:25:42 UTC
Patch tested and works fine, thanks!

Comment 13 Steve Whitehouse 2009-08-11 13:16:40 UTC
I think that we need to be a bit careful here....

There are some issues which we need to resolve before we can go ahead with this in order to avoid shooting ourselves in the foot with gfs2 compatibility.

The most important is that there is a failure to distinguish between two important classes of error. One class is those errors caused by reading something on disk which is known to be wrong. Another class of error relates to the internal state being incorrect for some reason.

In the case of the first class, it is OK to do things like "continue" or "remount-ro", for example. We know that the information that GFS2 is referring to is in the main correct and can be used to make sensible decisions as to how to recover from the error.

In the second case, we must not rely on the apparent state of the filesystem since it may well be wrong. In specific cases it might be possible to recover from errors in certain ways, but we cannot really place these under any generic error handling scheme and we must at least withdraw/BUG() to inform the user. Failure to stop the execution path in these cases can lead to losing all the data on an otherwise correct filesystem.

So the main task here is not in adding the options to the kernel command line, but in auditing each and every caller in order to put them in the correct class.
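
A rough sketch of the distinction drawn above, assuming each call site has already been audited and tagged with one of the two classes. The names ErrorClass and action_for, and the policy strings (including "withdraw" for the current default), are hypothetical and exist only for this illustration.

```python
from enum import Enum


class ErrorClass(Enum):
    ON_DISK = "something read from disk is known to be wrong"
    INTERNAL_STATE = "the in-memory state is incorrect"


# Hypothetical policy names for this sketch.
POLICIES = ("withdraw", "continue", "remount-ro", "panic")


def action_for(error_class, policy):
    """Pick the action for an error, given the mount-time policy.

    Only on-disk errors honor the policy. Internal-state errors always
    stop the execution path (withdraw at minimum), because the apparent
    state of the filesystem cannot be trusted.
    """
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % policy)
    if error_class is ErrorClass.INTERNAL_STATE:
        return "panic" if policy == "panic" else "withdraw"
    return policy
```

With this shape, errors=continue would never apply to an internal-state error, which is the case where carrying on could damage an otherwise correct filesystem.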

Also, I'd prefer to call the "default" option "withdraw" since that describes what it does.

I'm also rather concerned at doing this development in gfs1 rather than gfs2. It really ought to be done in gfs2 first and back-ported if required.

Comment 16 David Teigland 2009-08-11 16:58:42 UTC
Panic on errors instead of withdraw is an essential feature, and a lot of
people depend on it.  It's the only way to get reliable recovery when errors
originate in the fs.  And it's the way gfs worked for ages before the
ill-advised withdraw "feature" came along.  Hiding this old behavior under an
option named "debug" was an unfortunate choice; it should have had its own
option with a better name.  (I didn't realize the panic behavior had been
removed altogether at some point, that's a major regression.)

I'd urge not adding the other forms of error handling to gfs1, they will turn
into big cans of worms when people try to use them.  Just add back the simple
panic behavior under -o debug or an option with a better name.

People should also be encouraged to use this panic behavior since it results
in much more reliable failure handling.  (This is all independent of the
fence_scsi issues.)

Comment 20 Steve Whitehouse 2009-08-12 09:03:03 UTC
I have to agree (comment #16) that withdraw is not an easy thing to do correctly and the current implementation doesn't really work for many (any?) cases. Also I'd be quite happy to be rid of it at some future stage if there is no pressing need to retain that feature.

If we do keep withdraw, then I'd prefer to change the way it operates so that the fs would internally ensure that it would no longer send any I/O before sending the uevent so that the current system of using dm to turn off I/O through the device would not be needed.

Ideally, if the cause of the withdraw was not related to the journal, the fs would write a final record into the journal describing what went wrong, and flushing any I/O which could still be processed correctly.

I suspect that (comment #19) errors=remount-ro will not work correctly in gfs1 because gfs_controld assumes that it can know the ro/rw state of the filesystem by catching the state changes via mount.gfs. Obviously if this change happens internally to the fs, it will no longer know whether it's OK to ask the node to recover another node's journal.

Recently in gfs2 I have added an ONLINE uevent which could, potentially, be used by gfs_controld to monitor that state change. Currently none of the userland tools make use of it though.

As I mentioned in the earlier comment (comment #13) we do need to review all of the error handling carefully and be sure to distinguish the different types of error from each other. Some can be recovered from, and others cannot and we will have to look at them on a case by case basis. Applying a blanket policy is unlikely to have the desired effect.

Returning to the issue which sparked this off, it appears from the parallel email exchange that there may be other issues to consider too (wrt fence_scsi) and that the originally proposed patch will not fix the whole problem anyway.

Comment 23 David Teigland 2009-08-12 14:30:09 UTC
I've never configured a machine to automatically reboot after a panic myself; it looks like you have to set the kernel.panic sysctl to get that behavior.
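
As a small aid for checking that setting, the sketch below reads kernel.panic from /proc and describes what the value means. The helper names are made up for this example, and the meaning of negative values follows current kernel documentation; it has not been verified against the RHEL 5 kernel.

```python
def describe_panic_timeout(value):
    """Explain a kernel.panic value (seconds to wait before rebooting)."""
    if value == 0:
        return "no automatic reboot after a panic"
    if value > 0:
        return "reboot %d seconds after a panic" % value
    # Assumption: negative means "reboot immediately", as documented for
    # current kernels; older kernels may behave differently.
    return "reboot immediately after a panic"


def read_panic_timeout(path="/proc/sys/kernel/panic"):
    """Return the current setting, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

A value of 0 means a panicked node stays down until something else (for example, fencing) resets it.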

A withdraw will definitely not reboot -- the whole idea of withdraw is to leave the machine running and in the cluster, but disable the specific fs/storage causing errors.

Comment 24 Steve Whitehouse 2009-08-12 16:30:46 UTC
One thought - withdraws generate a uevent. It would be trivial to write a userland program to watch for those and call reboot if one occurs. Would that fix the issue?

It also has the advantage of not needing a hot fix to the kernel.
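
A minimal sketch of such a watcher, assuming a withdraw is reported as an "offline" uevent from a subsystem named "gfs". That action and subsystem name are assumptions and should be checked against real uevents on the target system before relying on this. What to do on a match (reboot, call a fence agent) is left to the on_withdraw callback.

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # value from <linux/netlink.h>


def parse_uevent(data):
    """Split a raw kernel uevent into (action, devpath, properties)."""
    fields = data.split(b"\0")
    header = fields[0].decode("ascii", "replace")
    action, _, devpath = header.partition("@")
    props = {}
    for field in fields[1:]:
        key, sep, value = field.decode("ascii", "replace").partition("=")
        if sep:
            props[key] = value
    return action, devpath, props


def is_gfs_withdraw(action, props):
    # Assumption: a withdraw shows up as action "offline" with
    # SUBSYSTEM=gfs. Verify this on the target system.
    return action == "offline" and props.get("SUBSYSTEM") == "gfs"


def watch(on_withdraw):
    """Block forever, calling on_withdraw(devpath) for each match.

    Linux only; typically needs root to receive kernel uevents.
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, 1))  # pid 0: kernel assigns an id; group 1: kernel uevents
    while True:
        action, devpath, props = parse_uevent(sock.recv(65536))
        if is_gfs_withdraw(action, props):
            on_withdraw(devpath)
```

Keeping the parsing and matching separate from the socket loop means the matching rule can be tested against captured uevents without root access.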

Comment 25 David Teigland 2009-08-12 16:42:16 UTC
We need a gfs mount option that will result in panic on i/o errors instead of a withdraw.  Like -o debug did; it's a simple regression on its own.  There is nothing more needed AFAIK.

Comment 26 Vinny Valdez 2009-08-12 16:52:08 UTC
Agree with Comment #25.  Although, as per Comment #24, if the uevent could trigger a fence of the node, that would also suffice.  The main issue I saw on the withdraw is that any service using that storage would not relocate to another node on I/O errors.  Fencing/rebooting/panic would provide that failover.

Comment 31 Robert Peterson 2009-08-18 21:33:29 UTC
Adding Nate Straz to the cc list as per this morning's gfs meeting.

Comment 35 Robert Peterson 2009-08-19 15:59:42 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
A new mount option (-o errors=continue) will be added to the GFS file system for Red Hat Enterprise Linux version 5.5.  It will not be available in prior releases.  The option controls how GFS behaves in the unlikely event that a file system error occurs. The normal behaviour is to withdraw from the file system and make it inaccessible until the next reboot.  If -o errors=continue is specified, the file system will report the error as a kernel error but the error will be otherwise ignored.  This mount option is intended for file system developers and quality testers only and is not intended for general use.

Comment 37 Robert Peterson 2009-08-20 13:04:03 UTC
The new mount options introduced in the name of this bugzilla record
won't be available until 5.5.  Therefore, we may need a release note
for 5.5.  Since the options are not available in 5.4 I don't think
it warrants a release note for 5.4.

Comment 44 Robert Peterson 2010-07-01 22:06:58 UTC
I don't think we can get this done by the 5.6 cutoff.  We have
a prototype patch but there are too many potential pitfalls, so
it requires a lot of testing.  Punting it to 5.7.

Comment 46 RHEL Program Management 2011-01-11 20:09:16 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 47 RHEL Program Management 2011-01-11 22:35:47 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 48 Robert Peterson 2011-01-21 19:40:41 UTC
Although we've got a working patch for this, there really isn't
a demand for this change that warrants the amount of work.
I spoke with Corey and Nate about it and they agreed we could
close it, at least for GFS1.  We'll keep the options open for
GFS2.  Closing as WONTFIX until there's a customer need.

