Bug 250843 - grub-install hangs on xfs
Product: Fedora
Classification: Fedora
Component: grub
Hardware: All Linux
Priority: low, Severity: low
Assigned To: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Reported: 2007-08-03 17:16 EDT by Eric Sandeen
Modified: 2009-05-04 14:56 EDT
Doc Type: Bug Fix
Last Closed: 2009-05-04 14:47:49 EDT

Description Eric Sandeen 2007-08-03 17:16:12 EDT
If you run grub-install against a fresh /boot on xfs, it will usually hang.

It appears to be an issue with the validation that happens in grub-install, when
it uses grub to dump out the various files to be sure they were written
correctly. (?)

The symptom is that grub is hung in the xfs driver code (in grub), due to having
stumbled upon some "bad" (or inconsistent) metadata on disk.

I believe this is because, in general, it is not safe to assume that you have a
consistent, valid filesystem present on your mounted block device after typing
"sync". If you think about it, there's a reason that dm-snapshot *freezes* the
filesystem before taking a snapshot of the block device; the same principle
applies here.

I modified grub-install to use xfs_freeze/thaw on /boot before & after grub dump
was invoked; this did indeed resolve the problem, but it's not a good general
solution - /boot may be part of your root fs, and freezing root leads to
trouble.  Plus, grub-install wants to write that image *somewhere* and can't if
the sole fs is frozen.
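
For illustration, the freeze/thaw workaround described above could be sketched
roughly like this. It is not the actual grub-install change (the report does
not include it); `frozen`, `same_filesystem_as_root`, and the injectable `run`
and `on_root` parameters are hypothetical names. The sketch assumes the
`xfs_freeze` utility with its `-f` (freeze) and `-u` (unfreeze) options, and it
refuses to freeze a mount point that shares a filesystem with `/`, for the
reason given above.

```python
import contextlib
import os
import subprocess


def same_filesystem_as_root(mount_point, stat=os.stat):
    """Return True if mount_point lives on the same device as '/'."""
    return stat(mount_point).st_dev == stat("/").st_dev


@contextlib.contextmanager
def frozen(mount_point, run=subprocess.run, on_root=same_filesystem_as_root):
    """Freeze mount_point for the duration of the with-block, then thaw it."""
    if on_root(mount_point):
        # Freezing the root filesystem would stall the caller itself.
        raise RuntimeError(
            f"refusing to freeze {mount_point}: it is part of the root fs"
        )
    run(["xfs_freeze", "-f", mount_point], check=True)
    try:
        yield
    finally:
        # Always thaw, even if the verification step raised.
        run(["xfs_freeze", "-u", mount_point], check=True)
```

A caller would wrap only the read-back step, e.g. `with frozen("/boot"): ...`,
and would need to write any temporary image to a different filesystem while
the freeze is in effect.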

So this may come down to what this test is trying to achieve, i.e. what failure
it is trying to catch; if it wishes to use grub code to actually traverse the
filesystem in the same manner as it will need to at boot time, then I don't have
a great suggestion; as mentioned before, without either unmounting, or freezing,
the filesystem, you don't have any assurance of a valid block device to test.

If it's simply trying to verify that the files in question are in the blocks it
thinks they are, then something like FIBMAP calls may be more appropriate to
check with the filesystem about file locations.
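
As a rough sketch of the FIBMAP idea, the following asks the filesystem where
each logical block of a file lives instead of reading the block device
directly. The function names are hypothetical; the ioctl numbers are the Linux
values of `FIBMAP` and `FIGETBSZ` from `<linux/fs.h>`. FIBMAP normally requires
root, and whether its answers are enough for grub-install's purposes is an
open question, not something this report establishes.

```python
import fcntl
import os
import struct

FIBMAP = 1    # _IO(0x00, 1) in <linux/fs.h>
FIGETBSZ = 2  # _IO(0x00, 2) in <linux/fs.h>


def logical_block_count(size_bytes, block_size):
    """Number of filesystem blocks needed to hold size_bytes."""
    return (size_bytes + block_size - 1) // block_size


def physical_blocks(path):
    """Return the physical block number for each logical block of path.

    A value of 0 means the logical block is not mapped (a hole).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # FIGETBSZ fills in the filesystem block size as a C int.
        raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", raw)[0]
        count = logical_block_count(os.fstat(fd).st_size, block_size)
        blocks = []
        for n in range(count):
            # FIBMAP takes a logical block index and overwrites it
            # with the corresponding physical block number.
            raw = fcntl.ioctl(fd, FIBMAP, struct.pack("i", n))
            blocks.append(struct.unpack("i", raw)[0])
        return blocks
    finally:
        os.close(fd)
```

Run as root, `physical_blocks("/boot/grub/stage2")` (an example path, not one
named in this report) would return a list of block numbers that could be
compared against the block list grub-install recorded.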
Comment 1 Eric Sandeen 2007-08-06 14:12:12 EDT
FWIW I confirmed w/ dchinner that "sync" gets xfs metadata only as far as the
log - where grub will never find it.  While this exact behavior may be specific
to xfs, I'll still maintain that it's not safe in general to read a rw-mounted
block device and expect to find a consistent filesystem....
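
A tool that wants to read a block device directly could at least check first
whether that device is mounted read-write. The sketch below parses text in the
`/proc/mounts` format (device, mount point, type, options, ...); `rw_mounts`
is a hypothetical helper. It assumes the device is listed under the same name
the caller passes in, which is not guaranteed on every system, and it cannot
tell whether a filesystem is currently frozen.

```python
def rw_mounts(device, mounts_text):
    """Return the mount points where device is mounted read-write."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        dev, mount_point, _fstype, options = fields[:4]
        if dev == device and "rw" in options.split(","):
            found.append(mount_point)
    return found
```

For example, `rw_mounts("/dev/sda1", open("/proc/mounts").read())` returning a
non-empty list would mean raw reads of `/dev/sda1` (an example device name)
may see an inconsistent filesystem.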
Comment 2 Bug Zapper 2008-04-04 09:32:59 EDT
Based on the date this bug was created, it appears to have been reported
during the development of Fedora 8. In order to refocus our efforts as
a project we are changing the version of this bug to '8'.

If this bug still exists in rawhide, please change the version back to
(If you're unable to change the bug's version, add a comment to the bug
and someone will change it for you.)

Thanks for your help and we apologize for the interruption.

The process we're following is outlined here:

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.
Comment 3 Eric Sandeen 2008-04-04 11:52:56 EDT
yeah, it still exists in rawhide.

however, Peter, if you want to close this with "we don't support boot on
anything but ext2 and ext3" I won't complain.
Comment 4 Bug Zapper 2008-05-13 23:06:39 EDT
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
Comment 5 Jeremy Katz 2009-05-04 14:47:49 EDT
I'll close it as such :-)
Comment 6 Eric Sandeen 2009-05-04 14:56:59 EDT
FWIW, recent proposed upstream changes may actually make xfs behave better in the face of grub's mistaken expectations .... :)  So worth trying this again after that makes it upstream...
