Bug 124845

Summary: fsck.ext3 segfaults on startup with LABEL= and bad /etc/blkid.tab
Product: [Fedora] Fedora
Component: e2fsprogs
Version: 3
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: D. Hugh Redelmeier <hugh>
Assignee: Thomas Woerner <twoerner>
CC: hugh, kzak, mattdm
Fixed In Version: fc4
Doc Type: Bug Fix
Last Closed: 2006-10-30 13:47:57 UTC

Description D. Hugh Redelmeier 2004-05-31 08:39:31 UTC
Description of problem:
I have labelled the root ext3 file system "fc1" and changed
/boot/grub/grub.conf and /etc/fstab to match.  During startup,
fsck fails and I am dropped into the root shell (repair) option.

I boiled the problem down to:

fails with segfault: /sbin/fsck.ext3 -a LABEL=fc1
works: /sbin/fsck.ext3 -a /dev/hda8


Version-Release number of selected component (if applicable):
e2fsck 1.34 (25-Jul-2003)
e2fsprogs-1.34-1

How reproducible:
All the time on my current setup, but I don't know why.
I have a similar setup on another machine that does not have this problem.

AHA: /etc/blkid.tab is critical (I found this out by using strace on
fsck.ext3).  I manually edited /etc/blkid.tab to have the right LABEL.
Now "fsck.ext3 LABEL=fc1" works!

/etc/blkid.tab is a cache.  Who is supposed to build and update it?
libblkid(3) says see also blkid.tab(7) but there is no such manpage.
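
One way to check whether the cache disagrees with the filesystem is to pull
the cached LABEL out of /etc/blkid.tab and compare it with what e2label
reports.  The sketch below assumes the cache holds one
<device ...>/dev/name</device> entry per line with a LABEL="..." attribute;
that is an assumption about the file layout, not a documented format, and
cached_label is a hypothetical helper name.

```shell
# Print the LABEL recorded for a device in a blkid.tab-style cache file.
# Usage: cached_label CACHE_FILE DEVICE
# Assumes one <device ...>/dev/name</device> entry per line (not a documented format).
cached_label() {
    awk -v dev="$2" '
        # Only consider the entry whose element text is exactly the device path.
        index($0, ">" dev "</device>") {
            if (match($0, / LABEL="[^"]*"/)) {
                # Skip the leading  LABEL="  (8 characters) and the closing quote.
                print substr($0, RSTART + 8, RLENGTH - 9)
            }
        }' "$1"
}
```

For example, "cached_label /etc/blkid.tab /dev/hda8" should print the same
string as "e2label /dev/hda8"; if the two differ, the cache entry is stale.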

Steps to Reproduce (guess!):
1. Use a LABEL= for root filesystem
2. Make /etc/blkid.tab wrong (have it record the wrong label for the root device)
3. Try to reboot
  
Actual results:
Startup will stop with a failing fsck of root filesystem

Expected results:
Normal startup.

Additional info:

Comment 1 Thomas Woerner 2004-10-04 13:47:47 UTC
Can you please test this with e2fsprogs-1.35-7 or a newer version?

Comment 2 D. Hugh Redelmeier 2004-12-25 09:28:14 UTC
[Sorry for this unpolished report.  I want to get to bed so that Santa
can visit.]
I've just had a related problem happen on Fedora Core 3 on x86.
This was on the third boot (or so), before any updates were applied.
This system has e2fsprogs-1.35-11.2.

I relabeled / as fc3_32.  The reason is that I wish to dual boot
fc3_64 and fc3_32.  So I don't want both to use / as a label!

Instead of segfaulting, fsck complains that it cannot find the label
(I don't have the exact message, but you get the idea).  Much better
than fc1.  Again, the /etc/blkid.tab cache was not updated by e2label
when it changed the label.

Possibly interesting clue: /etc/blkid.tab appears to have a
modification time in the future.  Perhaps modified before the timezone
offset was known.  My CMOS clock is kept in Eastern Standard Time to
placate WinXP.

Fiddling with the date and relabelling did not update the
/etc/blkid.tab.  So the date thing may not be relevant.
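
Whether the cache file's timestamp really lies in the future can be checked
directly.  A minimal sketch, assuming GNU stat and date (as shipped with
coreutils); mtime_in_future is a hypothetical helper name:

```shell
# Succeed (exit status 0) if the file's modification time is later than now.
# Assumes GNU coreutils: "stat -c %Y" and "date +%s" both print seconds
# since the epoch.
mtime_in_future() {
    file_mtime=$(stat -c %Y "$1") || return 2
    now=$(date +%s)
    [ "$file_mtime" -gt "$now" ]
}
```

For example: mtime_in_future /etc/blkid.tab && echo "timestamp is in the future"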

Another (related) problem.  Because fsck failed, I got into repair
mode.  / was mounted read-only.  e2label did not complain about not
being able to update the /etc/blkid.tab cache.  (I eventually
remounted / so that /etc/blkid.tab could be written.)

An strace of e2label showed that /etc/blkid.tab was only opened for reading.

BTW, I don't know why my clock was wrong.  Perhaps because I left fc3
sitting in the initial clock setup screen across midnight.

Comment 3 D. Hugh Redelmeier 2004-12-28 18:08:09 UTC
I've done a little more investigation.
The e2label program does not use libblkid so it certainly never
updates the blkid cache.  Who should be doing so?  Probably not the
initial fsck since (I think) / is read-only and /etc/blkid.tab is on /.

Why does it matter that the on-disk cache is wrong?  Should it not be
regenerated on the fly?  Not always, according to libblkid(3):
       In some cases (modular kernels), block devices are not even visible
       until after they are accessed the first time, so it is critical that
       there is some way to locate these devices without enumerating only
       visible devices, so the use of the cache file is required in this
       situation.


The manpage libblkid(3) SEE ALSO section lists a bunch of manpages
that don't exist.  Particularly intriguing is the blkid_put_cache(3).
THIS ITSELF IS A BUG.

Summary: I am experiencing a case where I cannot boot because the
blkid cache is not being updated.
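
Until the cache is corrected, the label can be resolved while ignoring the
on-disk cache.  A minimal sketch, assuming a blkid(8) that accepts
-c /dev/null (start from an empty cache), -t NAME=value, and -o device, as
current versions do; whether the 1.35 build supports all three has not been
verified here, and device_for_label is a hypothetical helper name:

```shell
# Print the device that currently carries the given filesystem label,
# probing the devices directly instead of trusting /etc/blkid.tab.
# Usage: device_for_label LABEL
device_for_label() {
    # -c /dev/null : read no existing cache file
    # -t LABEL=... : select devices whose LABEL tag matches
    # -o device    : print only the device name
    blkid -c /dev/null -t "LABEL=$1" -o device
}
```

For example: /sbin/fsck.ext3 -a "$(device_for_label fc3_32)".  This only finds
devices the kernel already exposes, which is exactly the limitation the
libblkid(3) passage above describes.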

Comment 4 Karel Zak 2005-04-06 12:43:45 UTC
I don't think it's a problem that e2label doesn't update /etc/blkid.tab. The
cache should always be updated when the library cannot find a label. For
example, you can connect another HDD with already-labeled partitions to the
system and call "fsck LABEL=foo", where "foo" is a label from the new HDD.

Comment 5 D. Hugh Redelmeier 2005-04-06 14:07:16 UTC
"should" and "is" may be different.  How else can you explain my experience?
As I mentioned in #3, libblkid says that sometimes the cache is the only way to
find the label (I don't actually know that this applies in my case).
I wonder (but have no time to experiment right now) whether one problem is that
/ is mounted read-only at this point and a regenerated cache cannot be written.
Karel: have you been able to duplicate the problem?

Comment 6 Karel Zak 2005-04-06 21:37:28 UTC
I don't have FC-3 on my test computer, so I tested with the devel branch (FC-4):
(note: /dev/hda2 is my root fs and original label was "/1")

# rpm -q e2fsprogs
e2fsprogs-1.36-1.4
# e2label /dev/hda2 ROOT
# grep ROOT /etc/blkid.tab
# sed -e "s:/1:ROOT:g" < /etc/fstab > /etc/fstab.tmp
# mv -f /etc/fstab.tmp /etc/fstab
# sed -e "s:/1:ROOT:g" < /boot/grub/grub.conf > /boot/grub/grub.conf.tmp
# mv -f /boot/grub/grub.conf.tmp /boot/grub/grub.conf
# grep ROOT /etc/blkid.tab
# shutdown -F -r now

... the reboot passed


> As I mentioned in #3, libblkid says that sometimes the cache is the only way
> to find the label (I don't actually know that this applies in my case).

Yes. A device driver could be loaded on demand (that is, on first access to the
device). I think using LABELs in that particular case is not the best idea. But
your problem is different: your device is already initialized when the system
calls fsck.


Comment 7 Matthew Miller 2006-07-10 21:30:06 UTC
Fedora Core 3 is now maintained by the Fedora Legacy project for security
updates only. If this problem is a security issue, please reopen and
reassign to the Fedora Legacy product. If it is not a security issue and
hasn't been resolved in the current FC5 updates or in the FC6 test
release, reopen and change the version to match.

Thank you!


Comment 8 John Thacker 2006-10-30 13:47:57 UTC
No response to request for information.  Sounds like from comment #6 that this
has been fixed since FC4.  FC3 is only supported by Fedora Legacy for critical
security bugs.