Bug 484913 - Change verbiage of e2fsck questioning to make the "-y" switch more useful
Summary: Change verbiage of e2fsck questioning to make the "-y" switch more useful
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 516996
Reported: 2009-02-10 17:55 UTC by Chris Marcantonio
Modified: 2010-12-05 07:01 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-05 07:01:23 UTC
Type: ---
Embargoed:



Description Chris Marcantonio 2009-02-10 17:55:17 UTC
Description of problem:
With the current potential line of questioning in e2fsck, a very frustrating situation can arise where the user is resigned to either not running the check, or mashing 'y' countless times if they want to proceed.  Normally this is a case where the -y switch would be useful, but answering 'yes' to the first question of "Abort?" causes the fsck to, obviously, abort.


Version-Release number of selected component (if applicable):

# rpm -q e2fsprogs
e2fsprogs-1.41.3-2.fc10.i386

# e2fsck -V
e2fsck 1.41.3 (12-Oct-2008)
        Using EXT2FS Library version 1.41.3, 12-Oct-2008



How reproducible:
Every time.


Steps to Reproduce:
1. Find a filesystem where the size reported in the superblock is larger than the size of the device that filesystem is on (likely due to some kind of filesystem corruption).
2. Run e2fsck -y to try to repair the filesystem and answer yes to all questions...assuming this will allow the fsck to run and simply answer 'yes' when it prompts to fix each individual error.
3. Watch e2fsck -y answer 'yes' to the first question, which is actually "Abort?" and never run.
4. Run e2fsck without the -y switch, manually answer 'no' to "Abort?" and then press the 'y' key over and over until a time roughly approaching the heat death of the universe as you are prompted whether or not to fix each individual error found.


Actual results:

sh-3.2# e2fsck -y /dev/VolGroup00/VolVol02
e2fsck 1.41.3 (12-Oct-2008)
The filesystem size (according to the superblock) is 14090240 blocks
The physical size of the device is 8847360 blocks
Either the superblock or the partition table is likely to be corrupt!

Abort? yes

sh-3.2#


Expected results:

Have the -y switch actually be useful in this case so that I don't have to hold down the 'y' key for months on end for every question that follows.  The easiest way would seem to be to change the first question to something more like "Are you sure you want to continue?" rather than "Abort?" so that answering 'yes' will continue through and not break the intended function of the '-y' switch.


Additional info:

Using an alternate superblock might be a valid suggestion in this specific case, but I don't think that discounts the spirit of the bug in general; i.e. if a user *does* want to actually run e2fsck in this situation and have it fix all errors it finds/prompts about, normally the -y switch is a Godsend.  It seems to me that this would function better if we were able to flip the first question around so that answering 'yes' actually allowed the fsck to continue.  There may be other situations out there that also prompt to "Abort?" first which would also need to be fixed.
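
For readers following along, the behaviour reported here can be modelled in a few lines. This is only a toy sketch of the prompt logic as described in this report, not e2fsck's actual code; the function names and the "Fix?" and "Continue?" prompt strings are made up for illustration.

```python
from typing import List, Optional, Tuple


def auto_answer(question: str, assume_yes: bool) -> Optional[str]:
    """Return the automatic answer for a prompt, or None if the user must be asked."""
    if assume_yes:
        # -y gives the same answer to every prompt, whatever the prompt asks.
        return "yes"
    return None


def run_check(prompts: List[str], assume_yes: bool) -> List[Tuple[str, Optional[str]]]:
    """Walk the prompts in order and stop once an 'Abort?' prompt is answered yes."""
    log = []
    for question in prompts:
        reply = auto_answer(question, assume_yes)
        log.append((question, reply))
        if question == "Abort?" and reply == "yes":
            break  # the check ends before any repair prompt is reached
    return log


# Current wording: the first "yes" ends the run.
print(run_check(["Abort?", "Fix?"], assume_yes=True))
# Reworded first prompt (the suggestion above): "yes" carries on to the repairs.
print(run_check(["Continue?", "Fix?"], assume_yes=True))
```

With the current wording the run stops after the first prompt; with the reworded prompt the same blanket "yes" reaches the repair questions.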

Comment 1 Eric Sandeen 2009-04-18 14:07:16 UTC
Sorry for the late reply, I agree that "-y" answering yes to "Abort?" is not the best behavior... 

I'm not sure if changing the wording & logic of the abort test would be the safest choice; I'd probably rather have it continue to come up and ask Abort?, you can say "n" to that, and the -y switch takes over from there.

I'll see if I can whip up a patch and run it by upstream.

-Eric

Comment 2 Bug Zapper 2009-11-18 11:04:45 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 3 Bug Zapper 2009-12-18 07:52:47 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 4 Chris Marcantonio 2009-12-18 16:44:18 UTC
Arg, I haven't had the time to try to reproduce this and see if it's still a problem on F12.  Did you ever get anywhere with regards to a patch?  Is it best if I tried to reproduce just to confirm?

I see this is flagged as one of the "important" BaseOS bugs we'd like to correct for RHEL6, so that also leads me to believe we might want to re-open it...

Comment 5 Eric Sandeen 2009-12-18 19:06:07 UTC
Sorry it hasn't been very high on the list ... if this is deemed important for rhel6 can you please open a rhel6 bug?  That'll keep it higher on the todo list ;)

I did start a little discussion upstream but it didn't really get anywhere; I can revisit it.

thanks,
-Eric

Comment 6 JamesIsIn 2010-04-03 19:39:36 UTC
I can confirm that this problem exists in Ubuntu 9.10.  I can also confirm that it is a pain.  The drive I am attempting to recover is 1.5 TB.  It lists around 200 million inodes.  Holding down the y key scrolls through about 10 per second.  This works out to around 266 days.
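
A quick sanity check of that estimate, using only the rounded figures quoted above and assuming the worst case of one prompt per inode. The function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_of_keypresses(prompts: int, answers_per_second: float) -> float:
    """Days needed to answer `prompts` questions at a steady answer rate."""
    return prompts / answers_per_second / SECONDS_PER_DAY


# Rounded figures from the comment: ~200 million inodes, ~10 answers per second.
estimate = days_of_keypresses(200_000_000, 10)
print(f"{estimate:.0f} days")  # prints "231 days"
```

The rounded inputs give roughly 231 days, the same order of magnitude as the 266 days quoted; the difference presumably comes from the unrounded inode count or key-repeat rate, which this report does not give.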

Possible ways to solve this:

1) alter the question so that y means continue
2) exclude that particular question from the -y switch (this will allow the user to decide but keep y for all the other questions)
3) include a different switch specifically for including the abort question (-Y maybe)

Thanks.

Comment 7 Eric Sandeen 2010-04-03 21:09:04 UTC
I sent a slightly hacky patch for this upstream, see:

http://marc.info/?l=linux-ext4&m=127032861320426&w=2

so that e2fsck -y -y will answer "no" to "Abort?"

In retrospect maybe your suggestion of -Y is better, let's see how it flies upstream.

-Eric

Comment 8 JamesIsIn 2010-04-03 21:50:45 UTC
Cool.

I rather like the idea of excluding it from -y because the nature of this particular question seems rather different from the others.  Then it would be up to the user to answer that one question (or then to use the -Y to knowingly force that question as well).  (In other words, my ideal fix would involve 2 & 3 from my previous comment.)

Did you link upstream back to this report?  At least then folks would have the opportunity to view this discussion.

Comment 9 Eric Sandeen 2010-04-03 21:57:28 UTC
(In reply to comment #8)
> Cool.
> 
> I rather like the idea of excluding it from -y because the nature of this
> particular question seems rather different from the others.  Then it would be
> up to the user to answer that one question (or then to use the -Y to knowingly
> force that question as well).  (In other words, my ideal fix would involve 2 &
> 3 from my previous comment.)

I think what I've done is akin to that, except using -y -y rather than -Y

hm, well... with my patch e2fsck -y says "yes" to abort; e2fsck -y -y says "no" and it carries on answering "yes" to non-Abort? questions.

So we still don't have a "soft -y" sort of thing where we get -y for everything but Abort? still stops and waits...
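
To make the semantics concrete, here is a small model of the -y versus -y -y behaviour as described in this comment. It is a sketch based on the description only, not the patch itself; the function name and the "Fix?" prompt string are placeholders.

```python
from typing import Optional


def patched_answer(question: str, yes_count: int) -> Optional[str]:
    """Answer a prompt given how many times -y was passed.

    Returns None when the user has to be asked interactively.
    """
    if yes_count == 0:
        return None   # no -y: every prompt waits for the user
    if question == "Abort?" and yes_count >= 2:
        return "no"   # -y -y: decline to abort, keep checking
    return "yes"      # -y: yes to everything, including "Abort?"


for count in (0, 1, 2):
    print(count, patched_answer("Abort?", count), patched_answer("Fix?", count))
```

Note that no value of yes_count yields "wait at Abort? but answer yes to everything else", which is the missing "soft -y" mentioned above.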

Part of the concern here is that -y had fairly well defined behavior; excluding -y from the Abort? question and waiting for an answer has the potential to break scripts.

> Did you link upstream back to this report?  At least then folks would have the
> opportunity to view this discussion.    

Yep, and there was a prior discussion about it upstream as well:

http://marc.info/?t=124022639800012&r=1&w=2

Comment 10 JamesIsIn 2010-04-03 22:24:53 UTC
Sounds like your patch is close except that with -y in your patch the Abort question exits e2fsck while under my proposal e2fsck would not exit (but rather await input).

It would seem to me that the Abort question already has the potential for breaking scripts (which are expecting certain behaviours from -y).  My proposal would simply provide the user (or script writer) with the greatest level of control.

Your solution would be preferred only for the sake of not breaking any script which might rely on a unique exit code from the -y answer to Abort (is there one?).

An alternative for your (perhaps awkward) -y -y switch might be to add a --no-abort switch?  Though I too like -Y.

The discussion to which you link is excellent.  They are discussing exactly what I was thinking about.  In fact the first comment from Eric says exactly what I said:

"But it seems like perhaps stopping at 'Abort?', allowing the user to say
'n' to that and then let the '-y' flag take over from there would be
reasonable."

And his -yy (or your -y -y or my -Y or using something like --no-abort) would help to automate in those cases where scripts were desired.

Comment 11 JamesIsIn 2010-04-03 22:28:33 UTC
In fact, if we just added a --no-abort then -Y would merely be -y --no-abort.  This would have no direct effect on -y or -n and would perhaps provide the least damage to any existing scripts.  Thoughts?
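
As an illustration of that idea, the option handling could look something like the sketch below. Everything here is hypothetical: --no-abort and -Y are only proposals in this thread, and the parser, program name and function name are invented for the example.

```python
import argparse
from typing import List, Tuple


def parse(argv: List[str]) -> Tuple[bool, bool]:
    """Return (assume_yes, no_abort) for a hypothetical command line."""
    parser = argparse.ArgumentParser(prog="fsck-sketch")
    parser.add_argument("-y", dest="assume_yes", action="store_true",
                        help="answer yes to all questions (existing meaning)")
    parser.add_argument("--no-abort", dest="no_abort", action="store_true",
                        help="proposed: answer no to 'Abort?'")
    parser.add_argument("-Y", dest="big_y", action="store_true",
                        help="proposed: shorthand for -y --no-abort")
    args = parser.parse_args(argv)
    if args.big_y:
        # -Y adds nothing new; it only switches on the two other flags.
        args.assume_yes = True
        args.no_abort = True
    return args.assume_yes, args.no_abort


print(parse(["-y"]))                 # (True, False): plain -y is untouched
print(parse(["-Y"]))                 # (True, True)
print(parse(["-y", "--no-abort"]))   # (True, True)
```

The point of the design is that plain -y parses exactly as before, so existing scripts would see no change unless they opt in to the new flag.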

Comment 12 Eric Sandeen 2010-04-03 22:34:38 UTC
(In reply to comment #10)

> The discussion to which you link is excellent.  They are discussing exactly
> what I was thinking about.  In fact the first comment from Eric says exactly
> what I said:
> 
> "But it seems like perhaps stopping at 'Abort?', allowing the user to say
> 'n' to that and then let the '-y' flag take over from there would be
> reasonable."
> 
> And his -yy (or your -y -y or my -Y or using something like --no-abort) would
> help to automate in those cases where scripts were desired.    

Hehe, I am that Eric ;)  And -y -y is treated the same as -yy, FWIW.
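
The equivalence of -y -y and -yy is ordinary getopt-style behaviour for clustered short options. Python's getopt module follows the same convention, so it can be used to show it; the device path below is a placeholder.

```python
import getopt

# The same short option given twice, once separately and once clustered.
opts_separate, rest_separate = getopt.getopt(["-y", "-y", "/dev/sdX1"], "y")
opts_clustered, rest_clustered = getopt.getopt(["-yy", "/dev/sdX1"], "y")

# Counting how often -y appears is all a "-y -y" style option needs.
yes_count = sum(1 for name, _ in opts_clustered if name == "-y")

print(opts_separate == opts_clustered)  # True
print(yes_count)                        # 2
```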

The worry is about -existing- scripts that don't expect to be asked any questions when specifying "-y"

Right now there are no --long-options, so if anything new is added, it's simplest to keep it as a short option.

If you have concerns about the patch I sent upstream, it might be best to chime in on the list so others can see it.

-Eric

Comment 13 JamesIsIn 2010-04-03 23:21:40 UTC
Well, that Eric sure is a brilliant guy...

I see what you are saying about the long options.  Unfortunate.  I like the clarity of --no-abort.  And the simplicity of -Y being merely a shorthand form of -y --no-abort is handy as well.  (Though it does seem that anything with more than one character is an awful lot like a long option.)

That's true about scripts not expecting -y to return a question, but I suspect they don't typically expect the command to exit either (that is, not to exit without having done what it was asked to do).

Clearly when running this interactively I could see the abort question exit and then change my arguments and rerun accordingly.  This would only be possible in a script if the Abort question exited with a unique exit code (true? I didn't see one in the man page).  If no such exit code exists, no script can manage an Abort = y because no script can know that was how it exited.

Changing the Abort question (to Continue? for instance) would certainly affect any script expecting Abort to exit.  But changing the outcome of the Abort question (from exit to await user input) does nothing beyond altering the condition from "command exited" to "command hung awaiting user input"; it's not going to perform any unwanted surgery, for instance.  Nor would the command do any less work (since in both cases it never gets any further than the Abort question).  Of course, the next line in the script will never be reached if it is waiting for an exit code from e2fsck.

As such I see how it would be best to keep -y as is.  Then add functionality like --no-abort (as you have done with -y -y).  The only other bit I might suggest then would be to give the Abort question a unique exit code so scripts could take better advantage of this --no-abort functionality.
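
For reference, this is roughly what a script has to work with today. The table below reproduces the exit status bits from the e2fsck man page as best I recall them, so check it against your installed version; the decode function is only an illustration.

```python
from typing import List

# Exit status bits as listed in the e2fsck man page (quoted from memory).
EXIT_BITS = {
    1: "File system errors corrected",
    2: "File system errors corrected, system should be rebooted",
    4: "File system errors left uncorrected",
    8: "Operational error",
    16: "Usage or syntax error",
    32: "E2fsck canceled by user request",
    128: "Shared library error",
}


def decode(status: int) -> List[str]:
    """Split an e2fsck exit status into the conditions it is the sum of."""
    if status == 0:
        return ["No errors"]
    return [text for bit, text in EXIT_BITS.items() if status & bit]


print(decode(0))
print(decode(12))  # 4 + 8: uncorrected errors plus an operational error
```

None of the listed conditions is documented as meaning specifically "Abort? was answered yes", which matches the reading above; a dedicated value for that case would be a new addition, and which number it would use is left open here.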

If we wanted to make -Y a combination of -y and --no-abort the only letter in no abort not already used (according to the man page) is o.  So perhaps make -o mean --no-abort and -Y mean -yo/-y -o?

Where would you like me to chime in upstream?  I'm happy to toss in my two cents (obviously).

Comment 14 Eric Sandeen 2010-04-03 23:24:20 UTC
(In reply to comment #13)

> Where would you like me to chime in upstream?  I'm happy to toss in my two
> cents (obviously).    

You can jump in on the thread I linked in comment #7 if you like.

Thanks,
-Eric

Comment 16 Bug Zapper 2010-11-04 11:31:34 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 17 Bug Zapper 2010-12-05 07:01:23 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

