Bug 455998

Summary: busybox in rawhide is missing findfs
Product: [Fedora] Fedora
Reporter: Qian Cai <qcai>
Component: kexec-tools
Assignee: Neil Horman <nhorman>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: high
Version: 10
CC: dvlasenk, nhorman, thomas.mey, varekova
Target Milestone: ---
Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-12-18 06:15:32 UTC
Type: ---
Attachments:
Description                                    Flags
patch to resolve UUID of root filesystem       none
patch to resolve uuid's in forgotten cases     none
new mkdumprd patch                             none
log as required                                none

Description Qian Cai 2008-07-20 03:44:18 UTC
Description of problem:
I had a root device specified as something like UUID=1aa24a07. Line 392 of
mkdumprd reads:

echo $1 | sed -e's/\/dev\///' -e's/[0-9]\+//' >> $TMPDISKLIST

This line stripped the first run of digits and saved the wrong value,
UUID=aa24a07, to /etc/critical_disks. As a result, the capture kernel later
failed to find the root device.
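
The effect of the second substitution can be reproduced outside mkdumprd. The
sketch below wraps the same two sed expressions in a helper called
strip_devname (a hypothetical name, not part of mkdumprd) and assumes GNU sed,
since \+ is a GNU extension in basic regular expressions.

```shell
#!/bin/sh
# Same two substitutions as mkdumprd line 392, wrapped in a
# hypothetical helper so the result can be inspected.
strip_devname() {
    echo "$1" | sed -e 's/\/dev\///' -e 's/[0-9]\+//'
}

# Intended use: a partition path is reduced to its parent disk name.
plain=$(strip_devname /dev/sda1)

# A UUID specification: the first run of digits is removed as well.
uuid=$(strip_devname UUID=1aa24a07)

echo "$plain"
echo "$uuid"
```

For /dev/sda1 the result is sda, as intended. For UUID=1aa24a07 the leading 1
is removed, which gives the UUID=aa24a07 value seen in /etc/critical_disks.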

Version-Release number of selected component (if applicable):
kexec-tools-1.102pre-13.fc10.x86_64

How reproducible:
Always

Comment 1 Qian Cai 2008-07-20 04:50:28 UTC
In fact, I don't think kexec-tools can dump to a disk partition by UUID when
using the default kdump configuration file. Even after I manually modified
mkdumprd to write the correct UUID, it still failed. In this situation it lacks
a step that translates the UUID to the real device.

Comment 2 Qian Cai 2008-07-20 06:26:16 UTC
Created attachment 312217 [details]
patch to resolve UUID of root filesystem

After applying the patch and making some changes to the config file, I was able
to capture a vmcore with an ext3 UUID target, or without specifying a dump
target (entering init to capture it).

Comment 3 Qian Cai 2008-07-20 06:31:17 UTC
Another problem is that mkdumprd uses findfs in the kdump initrd to find the
real device for a LABEL or UUID. However, findfs is not part of busybox, so a
user needs to add a line to the config file:

extra_bins /sbin/findfs

Would it be possible to convert the LABEL or UUID to the real device first, so
there is no need to use findfs in the kdump initrd?
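
One way to do that conversion at image build time is sketched below. The names
resolve_spec and FINDFS are hypothetical and used only for illustration; only
findfs itself, assumed to live at /sbin/findfs as in the extra_bins line, comes
from the discussion above.

```shell
#!/bin/sh
# FINDFS is an assumed override hook (handy for testing);
# it defaults to the path used in the extra_bins line.
FINDFS=${FINDFS:-/sbin/findfs}

# Hypothetical helper: turn a LABEL=/UUID= specification into a
# device node on the build host; pass anything else through.
resolve_spec() {
    case "$1" in
        LABEL=*|UUID=*)
            # findfs prints the matching device node, e.g. /dev/sda1,
            # and exits non-zero if nothing matches.
            "$FINDFS" "$1"
            ;;
        *)
            # Already a device path (or something findfs cannot help with).
            echo "$1"
            ;;
    esac
}
```

A caller would use something like rootdev=$(resolve_spec "$rootdev") before
the value is written into the initrd, and would need to handle a non-zero
exit status when no matching filesystem exists.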

Comment 4 Neil Horman 2008-07-21 19:44:13 UTC
Nak, the problem seems to be that the mkdumprd code is shared between RHEL and
Fedora, and so the generated init script is also shared between the two.  In
RHEL findfs is built into busybox, and so it works, while on Fedora it's off.  I'm
looking at the configs, trying to figure out how/why that is (it looks like the
meaning of make defconfig changed radically between the two versions).

Regardless, I think the right thing to do is simply enable findfs in Fedora for
busybox.  

Ivana, can you comment on this?

Comment 5 Ivana Varekova 2008-07-23 12:00:19 UTC
There is no problem adding findfs to rawhide busybox.

Comment 6 Neil Horman 2008-07-23 13:02:18 UTC
as per our conversation and the comment above, re-assigning this to Ivana to add
this applet to busybox.  Thanks Ivana!

Comment 7 Qian Cai 2008-07-23 13:11:07 UTC
Just to mention that after adding findfs to busybox, the Kdump init script still
needs the patch from comment #2 to get Kdump to work when a UUID is specified
as the dump target.

Comment 8 Qian Cai 2008-07-23 13:12:56 UTC
Sorry, I meant that the patch is to mkdumprd.

Comment 9 Ivana Varekova 2008-07-24 10:22:46 UTC
The findfs applet has been added in busybox-1.10.3-2.fc10. I'm reassigning this
bug to nhorman to finish the kexec-tools part.

Comment 10 Neil Horman 2008-07-24 11:18:26 UTC
Ok, thank you Ivana. Now that it's in, there shouldn't (AFAICS) be anything for
kexec-tools left to do but work :).  Cai, can you update to
busybox-1.10.3-2.fc10, and retest this?  Thanks!

Comment 11 Qian Cai 2008-07-25 10:59:08 UTC
It still does not work with the following configuration:

$ cat /proc/cmdline 
ro root=UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131 crashkernel=64M

I used the default Kdump configuration file, so it attempted to enter INIT to
capture the vmcore.

In mkdumprd,

    729     rootdev=$(awk '/^[ \t]*[^#]/ { if ($2 == "/") { print $1; }}' $fstab)
    730     # check if it's nfsroot
    731     if [ "$rootfs" == "nfs" ]; then
    732         remote=$(echo $rootdev | cut -d : -f 1)
    733         # FIXME: this doesn't handle ips properly
    734         remoteip=$(getent hosts $remote | cut -d ' ' -f 1)
    735         netdev=`/sbin/ip route get to $remoteip |sed 's|.*dev \(.*\).*|\1|g' |awk {'print $1;'} |head -n 1`
    736         net_list="$net_list $netdev"
    737     # check if it's root by label
    738     elif echo $rootdev | cut -c1-6 | grep -q "LABEL=" ; then
    739         rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    740                      -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    741         majmin=$(get_numeric_dev dec /dev/root)
    742         if [ -n "$majmin" ]; then
    743             dev=$(findall /sys/block -name dev | while read device ; do \
    744                   echo "$majmin" | cmp -s $device && echo $device ; done \
    745                   | sed -e 's,.*/\([^/]\+\)/dev,\1,' )
    746             if [ -n "$dev" ]; then
    747                 vecho "Found root device $dev for $rootdev"
    748                 rootdev=$dev
    749             fi
    750         fi
    751     else
    752         rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    753                      -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    754     fi
    755     [ "$rootfs" != "nfs" ] && handlelvordev $rootdev

It reached line 755 and called the handlelvordev function like this:

handlelvordev UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131

In handlelvordev,

    373 handlelvordev() {
    374     local vg=`lvs --noheadings -o vg_name $1 2>/dev/null`
    375     if [ -z "$vg" ]; then
    376         vg=`lvs --noheadings -o vg_name $(echo $1 | sed -e 's#^/dev/mapper/\([^-]*\)-\(.*\)$#/dev/\1/\2#') 2>/dev/null`
    377     fi
    378     if [ -n "$vg" ]; then
    379         vg=`echo $vg` # strip whitespace
    380         case " $vg_list " in
    381         *" $vg "*)
    382             ;;
    383         *)
    384             vg_list="$vg_list $vg"
    385             for device in `vgdisplay -v $vg 2>/dev/null | sed -n 's/PV Name//p'`; do
    386                 echo $device | sed -e's/\/dev\///' -e's/[0-9]\+//' >> $TMPDISKLIST
    387                 findstoragedriver ${device##/dev/}
    388             done
    389             ;;
    390         esac
    391     else
    392         echo $1 | sed -e's/\/dev\///' -e's/[0-9]\+//' >> $TMPDISKLIST
    393         findstoragedriver ${1##/dev/}
    394     fi
    395 }

It reached line 392 and wrote UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131 to
/etc/critical_disks in the Kdump initrd.

When the init script in the Kdump initrd tried to find the root filesystem, it
would never find it, as there is no device named
/sys/block/UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131.

   1562 cat >> $MNTIMAGE/init << EOF
   1563 echo "Waiting for required block device discovery"
   1564 for i in \`cat /etc/critical_disks\`
   1565 do
   1566     echo -n Waiting for \$i...
   1567     while [ ! -d /sys/block/\$i ]
   1568     do
   1569         sleep 1
   1570     done
   1571     echo Found
   1572 done
   1573 EOF
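
The generated loop above has no exit condition, so an entry that never shows
up under /sys/block makes the capture kernel wait forever. Purely as an
illustration, a bounded variant might look like the sketch below. The function
name wait_for_disk, its arguments and the timeout behaviour are all
hypothetical; nothing in this bug says mkdumprd does or should do this.

```shell
#!/bin/sh
# Hypothetical bounded variant of the generated wait loop.
# $1 = entry from critical_disks
# $2 = maximum number of checks (default 30)
# $3 = directory to look in (default /sys/block; overridable for testing)
wait_for_disk() {
    name=$1
    tries=${2:-30}
    sysblock=${3:-/sys/block}
    while [ ! -d "$sysblock/$name" ]; do
        tries=$((tries - 1))
        if [ "$tries" -le 0 ]; then
            echo "Timed out waiting for $name" >&2
            return 1
        fi
        sleep 1
    done
    echo "Found $name"
}
```

With an entry such as UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131 the directory
test can never succeed, so this variant returns 1 once the checks run out
instead of looping indefinitely.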

It looks like we missed checking for UUID in two places (root and swap filesystems):

    729     rootdev=$(awk '/^[ \t]*[^#]/ { if ($2 == "/") { print $1; }}' $fstab)
    730     # check if it's nfsroot
    731     if [ "$rootfs" == "nfs" ]; then
    732         remote=$(echo $rootdev | cut -d : -f 1)
    733         # FIXME: this doesn't handle ips properly
    734         remoteip=$(getent hosts $remote | cut -d ' ' -f 1)
    735         netdev=`/sbin/ip route get to $remoteip |sed 's|.*dev \(.*\).*|\1|g' |awk {'print $1;'} |head -n 1`
    736         net_list="$net_list $netdev"
    737     # check if it's root by label
    738     elif echo $rootdev | cut -c1-6 | grep -q "LABEL=" ; then
    739         rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    740                      -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    741         majmin=$(get_numeric_dev dec /dev/root)
    742         if [ -n "$majmin" ]; then
    743             dev=$(findall /sys/block -name dev | while read device ; do \
    744                   echo "$majmin" | cmp -s $device && echo $device ; done \
    745                   | sed -e 's,.*/\([^/]\+\)/dev,\1,' )
    746             if [ -n "$dev" ]; then
    747                 vecho "Found root device $dev for $rootdev"
    748                 rootdev=$dev
    749             fi
    750         fi
    751     else
    752         rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    753                      -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    754     fi
    755     [ "$rootfs" != "nfs" ] && handlelvordev $rootdev
    756 
    757     # find the first swap dev which would get used for swsusp
    758     swsuspdev=$(awk '/^[ \t]*[^#]/ { if ($3 == "swap") { print $1; }}' $fstab \
    759                 | head -n 1)
    760     if ! echo $swsuspdev | cut -c1-6 | grep -q "LABEL=" ; then
    761         handlelvordev $swsuspdev
    762     fi

I have not yet tested on a system whose swap device is referred to by UUID, but
it looks like that won't work either. BTW, RHEL 5.2 does not have this problem.
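
Both places quoted above test only for the LABEL= prefix (cut -c1-6 followed by
grep -q "LABEL="), so a UUID= entry falls through to handlelvordev unchanged.
A minimal sketch of a check that covers both prefixes follows. The helper name
is_fs_spec is hypothetical and is not taken from any of the attached patches.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the fstab device field is a LABEL= or
# UUID= specification rather than a device path.
is_fs_spec() {
    case "$1" in
        LABEL=*|UUID=*) return 0 ;;
        *)              return 1 ;;
    esac
}

for spec in LABEL=/ UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131 /dev/sda1; do
    if is_fs_spec "$spec"; then
        echo "$spec: needs resolving before handlelvordev"
    else
        echo "$spec: can be passed to handlelvordev as is"
    fi
done
```

Using case avoids the fixed character count in cut -c1-6, which is what ties
the existing test to the six-character LABEL= prefix.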

Comment 12 Neil Horman 2008-07-25 11:25:41 UTC
5.2 didn't contain the code to wait for critical disks.  Ok, I'll look into this

Comment 13 Neil Horman 2008-07-25 18:35:33 UTC
Created attachment 312665 [details]
patch to resolve uuid's in forgotten cases

Here you go Cai, I've not had time to test this yet, but I think it should fix
your problem.  In addition to the driver find routine that you pointed out in
your previous patch, it converts labels and uuids to actual device names
before storing them in the critical_disks file.  I've also removed the swap
detection case, since we really shouldn't need to use swap on these
systems, even in the event that we boot to the root file system.  If you could
try this out, I would appreciate it.  Thanks!

Comment 14 Qian Cai 2008-07-26 14:48:08 UTC
Hmm, I got this after applying the patch:

[root@localhost sbin]# service kdump restart
Stopping kdump:                                            [  OK  ]
No kdump initial ramdisk found.                            [WARNING]
Rebuilding /boot/initrd-2.6.27-0.166.rc0.git8.fc10.x86_64kdump.img
sed: -e expression #1, char 15: unterminated address regex
Starting kdump:                                            [  OK  ]

Comment 15 Neil Horman 2008-07-28 16:32:27 UTC
Created attachment 312792 [details]
new mkdumprd patch

Oops, forgot to finish the rootdev regex. New patch.  Thanks!
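
The expression from the first patch is not quoted in this bug, so the sketch
below only reproduces the general class of error: sed rejects an address regex
that has an opening delimiter but no closing one. The pattern /^UUID= and the
helper name sed_expr_ok are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if sed accepts the given expression.
sed_expr_ok() {
    echo | sed -e "$1" >/dev/null 2>&1
}

# Address regex with both delimiters: accepted.
sed_expr_ok '/^UUID=/p' && echo "closed address: accepted"

# Same regex without the closing slash: sed exits with an error
# (GNU sed reports it as "unterminated address regex").
sed_expr_ok '/^UUID=' || echo "unclosed address: rejected"
```

Checking a new expression this way before wiring it into mkdumprd catches the
problem without having to rebuild the kdump initrd.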

Comment 16 Thomas Meyer 2008-08-08 11:12:35 UTC
Same is true for Fedora 9.

findfs is not found in the kdump ramdisk. Could you please roll out a fixed busybox to Fedora 9, too?

The patch for mkdumprd doesn't apply correctly to this version:
kexec-tools.i386                         1.102pre-10.fc9        installed 

Could you please check the mkdumprd script in Fedora 9, too, and correct it where needed?

Comment 17 Qian Cai 2008-08-16 03:03:00 UTC
Ivana, it looks like findfs in busybox does not work correctly:

busybox-1.10.3-2.fc10.x86_64

$ /sbin/findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131
/dev/sda1

$ /sbin/busybox findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131
<nothing>

Additional information,

# blkid /dev/sda1
/dev/sda1: LABEL="OK" UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131" SEC_TYPE="ext2" TYPE="ext3"

Comment 18 Denys Vlasenko 2008-08-19 14:39:34 UTC
Cai, you seem to be running findfs as non-root. Two questions:

1. Does it work under root?

2. Is your /sbin/findfs setuid root?

(it's surprising that non-root on your machine can even *see* /sbin/* directory contents, let alone run a program there...)

Comment 19 Qian Cai 2008-08-19 15:41:06 UTC
No, I was running it as root. Let me know if there is anything else I could provide.

Comment 20 Denys Vlasenko 2008-08-20 08:13:20 UTC
findfs 1.40.4 can determine the device by UUID from non-root by reading /etc/blkid/blkid.tab. Example of a line from this file:

<device DEVNO="0x0801" TIME="1219041198" LABEL="/boot" UUID="624b7b0d-2a0e-480a-a803-d6d15de302a0" SEC_TYPE="ext2" TYPE="ext3">/dev/sda1</device>


Strace of findfs:

29058 10:07:19.372756 open("/etc/blkid/blkid.tab", O_RDONLY) = 4
29058 10:07:19.372801 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
29058 10:07:19.372860 fcntl64(4, F_GETFL) = 0 (flags O_RDONLY)
29058 10:07:19.372899 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
29058 10:07:19.372957 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fc9000
29058 10:07:19.372997 _llseek(4, 0, [0], SEEK_CUR) = 0
29058 10:07:19.373035 read(4, "<device DEVNO=\"0xfd01\" TIME=\"121"..., 4096) = 964
29058 10:07:19.373118 read(4, "", 4096) = 0
29058 10:07:19.373154 close(4)          = 0
29058 10:07:19.373189 munmap(0xb7fc9000, 4096) = 0
29058 10:07:19.373230 open("/etc/blkid/blkid.tab", O_RDONLY) = 4
29058 10:07:19.373274 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
29058 10:07:19.373331 close(4)          = 0
29058 10:07:19.373368 time(NULL)        = 1219219639
29058 10:07:19.373404 open("/dev/sda1", O_RDONLY) = -1 EACCES (Permission denied)
29058 10:07:19.373453 fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 44), ...}) = 0
29058 10:07:19.373516 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fc9000
29058 10:07:19.373557 write(1, "/dev/sda1\n", 10) = 10
29058 10:07:19.373613 exit_group(0)     = ?

busybox findfs does not use /etc/blkid/blkid.tab, so it needs to be run as root, or findfs applet should be made setuid.
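
To make the cache lookup concrete, the sketch below extracts the device for a
given UUID from a file in the blkid.tab format shown above. It is a plain text
match with sed, not a reimplementation of findfs: the real tool also tries to
open the device, as the strace shows. The function name lookup_uuid_in_cache
is hypothetical.

```shell
#!/bin/sh
# Hypothetical lookup: print the device whose cache entry has the given UUID.
# $1 = UUID
# $2 = cache file (defaults to the path seen in the strace)
lookup_uuid_in_cache() {
    uuid=$1
    cache=${2:-/etc/blkid/blkid.tab}
    # Each entry is one line: <device ... UUID="..." ...>/dev/NAME</device>
    sed -n "s|^<device .*UUID=\"$uuid\".*>\(/dev/[^<]*\)</device>\$|\1|p" "$cache"
}
```

For the example line above, lookup_uuid_in_cache
624b7b0d-2a0e-480a-a803-d6d15de302a0 prints /dev/sda1, and it needs no access
to the block device itself, which is why the non-root run succeeds.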

Comment 21 Denys Vlasenko 2008-08-20 08:16:42 UTC
(In reply to comment #19)
> No, I was running it as root. Let me know if there is anything else I could
> provide.

Please do

strace -o bb_findfs.log busybox findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131

and attach the resulting bb_findfs.log to this bug.

Comment 22 Denys Vlasenko 2008-08-20 08:49:12 UTC
Fix for busybox findfs:


diff -d -urpN busybox.9/include/applets.h busybox.a/include/applets.h
--- busybox.9/include/applets.h 2008-08-20 10:19:40.000000000 +0200
+++ busybox.a/include/applets.h 2008-08-20 10:47:38.000000000 +0200
@@ -152,7 +152,7 @@ USE_FDISK(APPLET(fdisk, _BB_DIR_SBIN, _B
 USE_FETCHMAIL(APPLET_ODDNAME(fetchmail, sendgetmail, _BB_DIR_USR_BIN, _BB_SUID_NEVER, fetchmail))
 USE_FEATURE_GREP_FGREP_ALIAS(APPLET_ODDNAME(fgrep, grep, _BB_DIR_BIN, _BB_SUID_NEVER, fgrep))
 USE_FIND(APPLET_NOEXEC(find, find, _BB_DIR_USR_BIN, _BB_SUID_NEVER, find))
-USE_FINDFS(APPLET(findfs, _BB_DIR_SBIN, _BB_SUID_NEVER))
+USE_FINDFS(APPLET(findfs, _BB_DIR_SBIN, _BB_SUID_MAYBE))
 USE_FOLD(APPLET(fold, _BB_DIR_USR_BIN, _BB_SUID_NEVER))
 USE_FREE(APPLET(free, _BB_DIR_USR_BIN, _BB_SUID_NEVER))
 USE_FREERAMDISK(APPLET(freeramdisk, _BB_DIR_SBIN, _BB_SUID_NEVER))

Of course, it works only if the busybox binary is setuid root (it typically should be).
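
Whether that precondition holds on a given system can be checked with the
standard -u file test. This is a minimal sketch: the helper name is_setuid and
the BUSYBOX override variable are hypothetical, and the check looks only at
the mode bit, not at whether the owner is root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file exists and has the setuid bit set.
is_setuid() {
    [ -u "$1" ]
}

# BUSYBOX is an assumed override for testing; /sbin/busybox is the
# path used elsewhere in this bug.
bb=${BUSYBOX:-/sbin/busybox}
if is_setuid "$bb"; then
    echo "$bb is setuid"
else
    echo "$bb is not setuid (or does not exist)"
fi
```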

Comment 23 Qian Cai 2008-08-20 09:11:31 UTC
Created attachment 314615 [details]
log as required

Comment 24 Neil Horman 2008-08-20 11:20:40 UTC
Thank you Denys.  Reassigning to busybox owner for inclusion.

Comment 25 Neil Horman 2008-08-20 11:23:12 UTC
Ivana, it looks like busybox still has a problem parsing UUIDs.  Could you please review Denys's patch?

Thanks

Comment 26 Denys Vlasenko 2008-08-22 11:18:49 UTC
I downloaded and built new busybox in /tmp/v/busybox-1.12.0 on your
machine, and unlike busybox 1.10.3 in /sbin it works:

# pwd
/tmp/v/busybox-1.12.0

# findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
/dev/sda1

# ./busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
/dev/sda1

# /sbin/busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"

#

I downloaded 1.10.3 and built it too in /tmp/v/busybox-1.10.3 with the same .config
as 1.12.0, and it works too:

# ./busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
/dev/sda1



I can only conclude that the busybox binary you have in /sbin was built without
CONFIG_FEATURE_VOLUMEID_EXT=y
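
If the .config used for a build is available, that conclusion can be checked
directly. The sketch below assumes the usual kconfig layout, where an enabled
option appears as NAME=y and a disabled one as a "# NAME is not set" comment.
The helper name config_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a busybox .config enables the given option.
# $1 = option name
# $2 = path to the .config file (defaults to ./.config)
config_enabled() {
    grep -q "^$1=y\$" "${2:-.config}"
}

cfg=${1:-.config}
if [ -f "$cfg" ]; then
    if config_enabled CONFIG_FEATURE_VOLUMEID_EXT "$cfg"; then
        echo "ext volume id support is enabled"
    else
        echo "ext volume id support is NOT enabled"
    fi
fi
```

Run against the .config of the packaged build, this would show whether the
/sbin/busybox binary was built with the option that the two test builds had.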

Comment 27 Ivana Varekova 2008-08-26 12:33:29 UTC
Thanks Denys, the patch is OK, it is applied in busybox-1.10.3-3.fc10 now.

Comment 28 Qian Cai 2008-08-27 01:06:20 UTC
I have tried the latest busybox and the UUID kexec-tools patch, and confirmed that it works fine. I suppose the UUID patch has not been committed yet, so I am reopening this bug to let Neil finish that part. Thanks!

Comment 29 Neil Horman 2008-08-27 11:28:36 UTC
kexec updated in rawhide. (-16.f10)  Thanks guys

Comment 30 Bug Zapper 2008-11-26 02:35:00 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 31 Bug Zapper 2009-11-18 07:45:49 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Comment 32 Bug Zapper 2009-12-18 06:15:32 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.