Bug 1468778 - System Recovery bugs in LiveCD - Major Fail
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 25
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-07 23:24 UTC by Gerald Cox
Modified: 2017-12-12 10:25 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-12 10:25:23 UTC
Type: Bug
Embargoed:


Attachments
screen photo of rescue shell from Fedora-Server-dvd-x86_64-25-1.3.iso (134.11 KB, image/jpeg)
2017-07-09 18:19 UTC, John Reiser
clues for safer chroot (796 bytes, patch)
2017-07-14 18:48 UTC, John Reiser

Description Gerald Cox 2017-07-07 23:24:25 UTC
Description of problem:
When trying to use chroot, I receive:
chroot:  failed to run command '/bin/bash':  Input/output error


Version-Release number of selected component (if applicable):
LiveCD F25

How reproducible:
Try to use the Troubleshooting / Recovery functionality


Actual results:
chroot failed, couldn't run system commands to repair


Expected results:  
command functioned properly

Additional info:
<Sigh> To fix the issue, I had to re-install.  I realize that (hopefully) you don't need to use these utilities often... but when you do, one would expect them to work.

Comment 1 John Reiser 2017-07-09 18:17:57 UTC
It works for me.
I booted Fedora-Server-dvd-x86_64-25-1.3.iso 3 times with 3 successes, choosing "Troubleshooting -->", "Rescue a Fedora System", then following the prompts; see screen photo attached.
I also booted Fedora-Workstation-Live-x86_64-25-1.3.iso 3 times with 3 successes, choosing "Troubleshooting -->", "Start Fedora-Workstation-Live 25 in basic graphics mode", "Try Fedora".  After a manual chroot in Workstation I had to mount /proc manually, but after that things just worked; see console transcript at the end of this Comment.

Booting was from USB 2.0 flash memory via the machine's EFI on:
  Vendor ID:             GenuineIntel
  CPU family:            6
  Model:                 42
  Model name:            Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
  Stepping:              7
and the installed system to be rescued was Fedora 25 using x86_64.

The initial Description fails to mention relevant context.  The original Reporter of this bugzilla report made a related query to the Fedora Developer mailing list on Fri, 7 Jul 2017 with subject "Issues with Recovery and Documentation".  The query is archived at
  https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/IYCSUDWPZHIY57YDTMWGJMMQFFS3ATMJ/
The trailing snippet is:
=====
  Then it says:  If you would like to make your system the root environment, run the command:

  chroot /mnt/sysimage

  I then get the response:
  chroot:  failed to run command '/bin/bash':  Input/output error

  I checked and it is there.  What's going on?
=====
The complaint "Input/output error" from chroot is relevant because it suggests that the boot medium, the target system, or both may have suffered hardware errors.  The original poster did not report any investigation of this possibility, such as via "dmesg" or "journalctl -b".
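
As a rough illustration of that investigation, the sketch below filters a kernel log for lines that often accompany failing media.  The function name scan_io_errors is hypothetical, and the pattern list is an assumption rather than a complete catalogue of kernel messages.

```shell
# Hypothetical helper: read a kernel log on stdin and print only the lines
# that mention I/O or medium errors. The patterns are an assumption and may
# miss messages from some drivers.
scan_io_errors() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -Ei 'i/o error|medium error' || true
}

# Usage in the rescue shell (reading the kernel log may require root):
#   dmesg | scan_io_errors
#   journalctl -b | scan_io_errors
```

An empty result does not prove the hardware is healthy; it only means that none of the assumed patterns appeared in the log for the current boot.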


Console transcript in Terminal after booting Workstation-Live:
=====
[root@localhost-live liveuser]# mount /dev/sda5 /mnt  ## My root-to-be-rescued
[root@localhost-live liveuser]# chroot /mnt
Error, do this: mount -t proc proc /proc
/bin/basename: missing operand
Try '/bin/basename --help' for more information.
Error, do this: mount -t proc proc /proc   ## Yes, 3 lines were repeated.
/bin/basename: missing operand
Try '/bin/basename --help' for more information.
[root@localhost-live /]# mount -t proc proc /proc
[root@localhost-live /]# ps
  PID TTY          TIME CMD
 2452 ?        00:00:00 su
 2457 ?        00:00:00 bash
 2528 ?        00:00:00 bash
 2571 ?        00:00:00 ps
[root@localhost-live /]#
=====
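
The steps in the transcript above can be sketched as a small script.  This is a minimal sketch, not part of any Fedora tool: the names run, rescue_chroot, and DRY_RUN are illustrative, and /proc is mounted before entering the chroot (a variation on the transcript, which mounted it from inside).

```shell
# Print each command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: mount the root to be rescued and enter it.
# $1 = root device (for example /dev/sda5), $2 = mount point (for example /mnt).
rescue_chroot() {
    dev=$1
    mnt=$2
    run mount "$dev" "$mnt" || return 1
    # Mount /proc first, so that tools such as ps work right away.
    run mount -t proc proc "$mnt/proc" || return 1
    run chroot "$mnt"
}

# Preview the steps without touching the system:
#   DRY_RUN=1 rescue_chroot /dev/sda5 /mnt
```

Without DRY_RUN the function needs root privileges, and the device name must match the actual root partition of the system being rescued.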

Comment 2 John Reiser 2017-07-09 18:19:15 UTC
Created attachment 1295640
screen photo of rescue shell from Fedora-Server-dvd-x86_64-25-1.3.iso

Comment 3 Gerald Cox 2017-07-09 22:37:12 UTC
<Sigh> It's not a hardware error.  I basically wasted 6 hours trying to get it to work and finally just did a reinstall.  System is working fine.  

Since you read the thread on the developer list, you should have seen where I and another user did a trace and found where chroot was complaining about locale errors.  

It's a bit difficult to capture screenshots, etc. when your system is running in system restore mode.  In any event, saying "Works for me" isn't at all helpful.  Whoever is responsible for chroot should take the time and investigate why it was throwing errors based upon locale.  

The bottom line in this whole exercise is that the Fedora recovery tools aren't reliable - and one shouldn't be expected to spend hours of time searching google to try to figure out what is going on.  

As I mentioned above, I gave up and just did a reinstall (which is apparently what the vast majority of people end up doing).

Comment 4 Gerald Cox 2017-07-09 22:48:24 UTC
One other comment, I ran xfs_repair, fsck.vfat and fsck.ext4 on the drives as appropriate... there were no issues.

Comment 5 John Reiser 2017-07-12 13:39:04 UTC
(In reply to Gerald Cox from comment #3; paragraphs re-ordered, etc.)
> It's a bit difficult to capture screenshots, etc. when your system is
> running in system restore mode.

I used an old stand-alone digital camera, but the camera on most smartphones works well enough after a few practice pictures.  Also, Rescue mode is already running /usr/bin/tmux, which can capture and save contents of the screen.  (Note the tmux menu+status bar at the bottom of the screen.)

> In any event, saying "Works for me" isn't at all helpful.  

The screen photo that I attached as part of the evidence that Rescue mode works for me shows a direct clue to the exact problem and its solution, including an immediate workaround.  Hint: look at the last two lines.

Rescue mode could give a safer suggestion, but did not commit any actual errors.  Usually Rescue mode's suggestion is reasonable, and always it is possible for the knowledgeable user to accomplish the goal by using appropriate input.  It helps to read the short manual page for chroot.

> Since you read the thread on the developer list, you should have seen where
> I and another user did a trace and found where chroot was complaining about
> locale errors.  

The reasonable expectation is that the Reporter will bring all the data to the bugzilla report, and present the evidence (directly or via explicit URL) as part of the initial Description or a subsequent Comment.  Besides, in those strace outputs the actual complainer was not chroot.  [That's another hint.]

> [O]ne shouldn't be expected to spend hours of time
> searching google to try to figure out what is going on.  

In a few seconds "strace -e trace=open,chroot,execve chroot /mnt/sysimage" provides key information about the sequence of calls.

> As I mentioned above, I gave up and just did a reinstall (which is
> apparently what the vast majority of people end up doing).

The motivated and resourceful user can perform all Rescue mode operations "manually" through explicit manipulation of mount, pathname, and shell environment.

A system that does not need rescuing presents the opportunity to learn about Rescue mode in a less stressful environment.  [Beware the SELinux re-label if mounting in Read+Write mode.]  Compare and contrast
   strace -f -o strace-01.out -v -s 100 -e trace=file chroot /mnt/sysimage
with
   strace -f -o strace-02.out -v -s 100 -e trace=file chroot /mnt/sysimage \
      /bin/sh --norc -i
(Note that "trace=file" subsumes "trace=open,chroot,execve".)
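
One way to compare the two logs is to look only at the calls that failed.  The sketch below assumes the usual layout of a log written with "strace -o", where a failed call ends in "= -1 ERRNAME (description)"; the function name failed_calls is hypothetical.

```shell
# Hypothetical helper: list the system calls that failed in a strace log
# written with "-o". $1 = path to the log file.
failed_calls() {
    # "|| true" keeps the exit status at 0 when no call failed.
    grep -E ' = -1 E[A-Z0-9]+ ' "$1" || true
}

# Narrow the list to EIO, the errno behind "Input/output error":
#   failed_calls strace-01.out | grep EIO
# Then check whether the same failures appear with the plainer shell:
#   failed_calls strace-02.out | grep EIO
```

Because both runs use "-f", each line starts with the process ID, which shows whether a failing call came from chroot itself or from the program it started.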


> The bottom line in this whole exercise is that the Fedora recovery tools
> aren't reliable

The bottom line has three parts:

1. User error due to incomplete understanding of chroot and the Rescue environment, compounded by lack of appreciation for all the INVOCATION actions performed by /bin/bash.  See the manual pages and the output from "set".  It might be common to believe that "chroot /mnt/sysimage" is safe, but its safety depends on shell startup actions.

2. Rescue mode should facilitate the safer default invocation of a shell under chroot, by setting SHELL=/bin/sh (overriding /bin/bash).  Then the user will get the safer /bin/sh by default, and the riskier /bin/bash only by explicit request.  (Of course, "safer" might imply "less featureful".)   Rescue mode can instruct and reinforce safer practice at the same time by suggesting "chroot /mnt/sysimage /bin/sh --norc -i".

3. Reporter failure to include all available and appropriate evidence and data in the bugzilla report.  State the names, values, and operational steps using enough clarity, detail, and accuracy so that someone else can replicate your actions faithfully.  Related opportunity for improvement was shown in https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/77CMXYNKW6FJYCWM6UNUKY7CYX6EIMMF/ where the claim "In any event, I opened up a bugzilla." omitted the URL for the report.
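
The invocation suggested in point 2 could be wrapped as follows.  This is only a sketch: the function name safe_chroot and the CHROOT_CMD override are illustrative and do not exist in Rescue mode.

```shell
# Hypothetical wrapper around the invocation suggested in point 2.
# $1 = root of the target system; defaults to /mnt/sysimage.
# CHROOT_CMD is an illustrative override, so that the command can be
# previewed with "echo" instead of being executed.
safe_chroot() {
    ${CHROOT_CMD:-chroot} "${1:-/mnt/sysimage}" /bin/sh --norc -i
}

# Preview:  CHROOT_CMD=echo safe_chroot
# Run:      safe_chroot            (needs root)
```

Naming the shell explicitly means the result does not depend on the value of SHELL in the Rescue environment, and "--norc" skips the startup file of the target system that an interactive bash would otherwise read.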

Comment 6 Gerald Cox 2017-07-12 14:04:24 UTC
<Sigh> Your first response was you have a hardware error, your follow-up is user error - while maintaining the theme of "there can't be a problem, it works for me".  

That said, I do appreciate you taking the time to include additional debugging tips in the report to assist in the future - and the suggestion to practice this when the system isn't down.  That actually is good advice - however, tools like this should work without having to jump through hoops - and in this instance I couldn't use chroot to allow me to use dnf to fix the issue.  That is the issue.

Blaming the victim isn't the resolution.

Comment 7 John Reiser 2017-07-14 18:48:56 UTC
Created attachment 1298573
clues for safer chroot

This patch gives two hints to make chroot safer in Rescue mode.  The hints can be ignored, but they do protect against some possible corruption in locales (on both the Rescue media and the target system) and the startup actions of /bin/bash for user "root" on the target system.

Comment 8 Fedora End Of Life 2017-11-16 19:07:21 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 9 John Reiser 2017-11-20 04:14:07 UTC
"Fixed" in Fedora 27 by removing Rescue Mode entirely from the Live .iso.

Comment 10 John Reiser 2017-11-20 05:05:41 UTC
(In reply to John Reiser from comment #9)
> "Fixed" in Fedora 27 by removing Rescue Mode entirely from the Live .iso.

However the same opportunity for confusion persists in Fedora-Server-dvd-x86_64-27-1.6.iso.  The patch of Comment #7 was not implemented.

Comment 11 Fedora End Of Life 2017-12-12 10:25:23 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

