Bug 199697

Summary: Doesn't react well to missing paths
Product: Red Hat Enterprise Linux 4 Reporter: Bastien Nocera <bnocera>
Component: e2fsprogs Assignee: Eric Sandeen <esandeen>
Status: CLOSED NOTABUG QA Contact: Jay Turner <jturner>
Severity: medium Docs Contact:
Priority: high    
Version: 4.0 CC: kzak, sct, srevivo
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-02-12 22:11:35 UTC Type: ---
Bug Blocks: 198694, 205159, 234251, 236329    

Description Bastien Nocera 2006-07-21 13:40:33 UTC
e2fsprogs-1.35-12.3.EL4

1. Set up a machine with multipath (you don't really need to, but /etc/blkid.tab
needs to reflect such a setup)
blkid.tab contains:
<device DEVNO="0x0802" TIME="1151916502" LABEL="/test"
UUID="06e57d7d-f574-435c-a0aa-b2b3ed511eba" SEC_TYPE="ext3"
TYPE="ext2">/dev/sda2</device>
<device DEVNO="0x0842" TIME="1151916502" LABEL="/test"
UUID="06e57d7d-f574-435c-a0aa-b2b3ed511eba" SEC_TYPE="ext3"
TYPE="ext2">/dev/sde2</device>

Both paths lead to /test

2. Remove a path (/test is on /dev/sda2 and /dev/sde2, for example, but is still
present in blkid.tab and /etc/fstab)

3. Reboot the machine, and watch fsck stop dead in its tracks, as it cannot
find /test (again, test case just below)

Here's a simple testcase that replicates what fsck does right now:
--8<--
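#include <stdio.h>
#include <stdlib.h>
#include <blkid/blkid.h>

int main (void)
{
       blkid_cache cache = NULL;
       char *name;

       /* NULL selects the default cache file */
       if (blkid_get_cache(&cache, NULL) < 0)
               return 1;

       /* Returns NULL when the label cannot be resolved to a device */
       name = blkid_get_devname(cache, "LABEL=/test", NULL);
       printf ("name: %s\n", name ? name : "(null)");

       free(name);
       blkid_put_cache(cache);
       return 0;
}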
--8<--

As the path has been removed, and neither block device exists anymore (both
/dev/sda2 and /dev/sde2 are gone), fsck will try to access them, and stop, saying:
--8<--
[/sbin/fsck.ext3 (1) -- /test] fsck.ext3 -a LABEL=/test
fsck.ext3: Unable to resolve 'LABEL=/test'
[FAILED]

*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
*** Warning -- SELinux is active
*** Disabling security enforcement for system recovery.
*** Run 'setenforce 1' to reenable.
Give root password for maintenance
--8<--

It would be good if fsck could try harder to find a valid device path for that
label.
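
To illustrate what "trying harder" could mean, here is a minimal sketch. It
does not use libblkid at all: struct cache_entry and resolve_label() are
hypothetical names that only model the <device> records shown above, and the
existence check via access() is an assumption about what counts as a valid
path. It walks every cached entry for the label and returns the first one
whose device node is still present, instead of giving up when the first is gone.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for one <device> record in blkid.tab. */
struct cache_entry {
        const char *label;
        const char *uuid;
        const char *devname;
};

/*
 * Return the devname of the first entry carrying `label` whose device
 * node still exists, or NULL if every cached path for that label is gone.
 */
static const char *resolve_label(const struct cache_entry *tab, size_t n,
                                 const char *label)
{
        size_t i;

        for (i = 0; i < n; i++) {
                if (strcmp(tab[i].label, label) != 0)
                        continue;       /* different label, skip */
                if (access(tab[i].devname, F_OK) == 0)
                        return tab[i].devname;
        }
        return NULL;
}
```

With the two entries from the blkid.tab above, this would return /dev/sde2 if
only /dev/sda2 had disappeared, and NULL if both were gone.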

Comment 1 Thomas Woerner 2006-07-31 09:30:34 UTC
Labels should be unique on your system. If a label is used more than once, then
blkid has to use the first entry in the list to be consistent. I do not think it
is a good idea for blkid to take the first entry in the list that is physically
available, because this could result in a partition switch, which would be very
bad: for example, /tmp could be swapped with /home because /tmp is not there and
both partitions have the same label.

I could only say: DO not use labels with multipath or software-raid.
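
The partition-switch risk described above can be sketched as a check on the
cache contents. This is only an illustration under assumptions: struct
tab_entry and label_is_ambiguous() are hypothetical names, not libblkid API.
The idea is that two entries sharing a label and a UUID are the same
filesystem seen through different paths, while two entries sharing a label but
not a UUID are different filesystems, which is the /tmp versus /home case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one <device> record in blkid.tab. */
struct tab_entry {
        const char *label;
        const char *uuid;
        const char *devname;
};

/*
 * Return 1 if `label` is carried by entries with differing UUIDs
 * (distinct filesystems sharing one label), 0 otherwise.
 */
static int label_is_ambiguous(const struct tab_entry *tab, size_t n,
                              const char *label)
{
        const char *first_uuid = NULL;
        size_t i;

        for (i = 0; i < n; i++) {
                if (strcmp(tab[i].label, label) != 0)
                        continue;
                if (first_uuid == NULL)
                        first_uuid = tab[i].uuid;       /* remember first match */
                else if (strcmp(first_uuid, tab[i].uuid) != 0)
                        return 1;       /* same label, different filesystem */
        }
        return 0;
}
```

For the two entries in the description, which share one UUID, this returns 0.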

Comment 2 Keiichi Mori 2006-07-31 10:06:55 UTC
> I could only say: DO not use labels with multipath or software-raid.
Anaconda will use labels if a filesystem is on the real devices.
Also, the whole point of using labels is that the filesystem is still handled
correctly even if the physical device name changes, right?


Comment 5 RHEL Program Management 2007-06-08 22:24:32 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 11 Eric Sandeen 2008-02-12 22:11:35 UTC
For lack of a better resolution type, NOTABUGging this one based on the IT
closing; also, from my understanding, we have a by-label entry in fstab for
which there is no device containing that label, and fsck stops.  This looks to
me like simply a misconfiguration....