Bug 455998
Summary: | busybox in rawhide is missing findfs | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Qian Cai <qcai> |
Component: | kexec-tools | Assignee: | Neil Horman <nhorman> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Priority: | high |
Version: | 10 | CC: | dvlasenk, nhorman, thomas.mey, varekova |
Keywords: | Reopened | Doc Type: | Bug Fix |
Hardware: | All | OS: | Linux |
Last Closed: | 2009-12-18 06:15:32 UTC | | |

Description
Qian Cai
2008-07-20 03:44:18 UTC
In fact, I don't think kexec-tools can dump to a disk partition specified by UUID when using the default kdump configuration file. Even after I manually modified mkdumprd to use the correct UUID, it still failed. In this situation it is missing a step that translates the UUID into the real device.

Created attachment 312217 [details]
patch to resolve UUID of root filesystem
After applying the patch and making some changes to the config file, I was able
to capture a vmcore with an ext3 UUID target, or without specifying a dump target
(entering init to capture it).
Another problem is that mkdumprd uses findfs in the kdump initrd to find the real device for a LABEL or UUID. However, findfs is not part of busybox, so a user needs to add a line to the config file,

    extra_bins /sbin/findfs

Would it be possible to convert the LABEL or UUID to the real device first, so there is no need to use findfs in the kdump initrd?

Nak, the problem seems to be that the mkdumprd code is shared between RHEL and Fedora, and so the generated init script is also shared between the two. In RHEL findfs is built into busybox, and so it works, while on Fedora it's off. I'm looking at the configs, trying to figure out how/why that is (it looks like the meaning of make defconfig changed radically between the two versions). Regardless, I think the right thing to do is simply enable findfs in Fedora for busybox. Ivana, can you comment on this?

There is no problem with adding findfs to rawhide busybox.

As per our conversation and the comment above, re-assigning this to Ivana to add this applet to busybox. Thanks Ivana!

Just to mention that after adding findfs to busybox, the Kdump init script still needs the patch from comment #2 to get Kdump to work when a UUID is specified as the dump target.

Sorry. I mean the patch is to mkdumprd.

The findfs applet was added in busybox-1.10.3-2.fc10. I'm reassigning this bug to nhorman to finish the kexec-tools part.

Ok, thank you Ivana. Now that it's in, there shouldn't (AFAICS) be anything for kexec-tools left to do but work :). Cai, can you update to busybox-1.10.3-2.fc10, and retest this? Thanks!

It still does not work for the following configuration,

    $ cat /proc/cmdline
    ro root=UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131 crashkernel=64M

I used the default Kdump configuration file, so it attempted to enter INIT to capture the vmcore.
In mkdumprd,

    729 rootdev=$(awk '/^[ \t]*[^#]/ { if ($2 == "/") { print $1; }}' $fstab)
    730 # check if it's nfsroot
    731 if [ "$rootfs" == "nfs" ]; then
    732     remote=$(echo $rootdev | cut -d : -f 1)
    733     # FIXME: this doesn't handle ips properly
    734     remoteip=$(getent hosts $remote | cut -d ' ' -f 1)
    735     netdev=`/sbin/ip route get to $remoteip |sed 's|.*dev \(.*\).*|\1|g' |awk {'print $1;'} |head -n 1`
    736     net_list="$net_list $netdev"
    737 # check if it's root by label
    738 elif echo $rootdev | cut -c1-6 | grep -q "LABEL=" ; then
    739     rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    740                -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    741     majmin=$(get_numeric_dev dec /dev/root)
    742     if [ -n "$majmin" ]; then
    743         dev=$(findall /sys/block -name dev | while read device ; do \
    744               echo "$majmin" | cmp -s $device && echo $device ; done \
    745               | sed -e 's,.*/\([^/]\+\)/dev,\1,' )
    746         if [ -n "$dev" ]; then
    747             vecho "Found root device $dev for $rootdev"
    748             rootdev=$dev
    749         fi
    750     fi
    751 else
    752     rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    753                -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    754 fi
    755 [ "$rootfs" != "nfs" ] && handlelvordev $rootdev

It went to line 755 and entered the handlelvordev function like this,

    handlelvordev UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131

In handlelvordev,

    373 handlelvordev() {
    374     local vg=`lvs --noheadings -o vg_name $1 2>/dev/null`
    375     if [ -z "$vg" ]; then
    376         vg=`lvs --noheadings -o vg_name $(echo $1 | sed -e 's#^/dev/mapper/\([^-]*\)-\(.*\)$#/dev/\1/\2#') 2>/dev/null`
    377     fi
    378     if [ -n "$vg" ]; then
    379         vg=`echo $vg` # strip whitespace
    380         case " $vg_list " in
    381         *" $vg "*)
    382             ;;
    383         *)
    384             vg_list="$vg_list $vg"
    385             for device in `vgdisplay -v $vg 2>/dev/null | sed -n 's/PV Name//p'`; do
    386                 echo $device | sed -e's/\/dev\///' -e's/[0-9]\+//' >> $TMPDISKLIST
    387                 findstoragedriver ${device##/dev/}
    388             done
    389             ;;
    390         esac
    391     else
    392         echo $1 | sed -e's/\/dev\///' -e's/[0-9]\+//' >> $TMPDISKLIST
    393         findstoragedriver ${1##/dev/}
    394     fi
    395 }

It went to line 392 and wrote UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131 to /etc/critical_disks in the Kdump initrd (the second sed expression on line 392 strips the first run of digits, here the leading "1"). When the init file in the Kdump initrd tried to find the root filesystem, it would never find it, as there is no device named /sys/block/UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131.

    1562 cat >> $MNTIMAGE/init << EOF
    1563 echo "Waiting for required block device discovery"
    1564 for i in \`cat /etc/critical_disks\`
    1565 do
    1566     echo -n Waiting for \$i...
    1567     while [ ! -d /sys/block/\$i ]
    1568     do
    1569         sleep 1
    1570     done
    1571     echo Found
    1572 done
    1573 EOF

Looks like we missed checking for UUID in two places (root and swap filesystems),

    729 rootdev=$(awk '/^[ \t]*[^#]/ { if ($2 == "/") { print $1; }}' $fstab)
    730 # check if it's nfsroot
    731 if [ "$rootfs" == "nfs" ]; then
    732     remote=$(echo $rootdev | cut -d : -f 1)
    733     # FIXME: this doesn't handle ips properly
    734     remoteip=$(getent hosts $remote | cut -d ' ' -f 1)
    735     netdev=`/sbin/ip route get to $remoteip |sed 's|.*dev \(.*\).*|\1|g' |awk {'print $1;'} |head -n 1`
    736     net_list="$net_list $netdev"
    737 # check if it's root by label
    738 elif echo $rootdev | cut -c1-6 | grep -q "LABEL=" ; then
    739     rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    740                -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    741     majmin=$(get_numeric_dev dec /dev/root)
    742     if [ -n "$majmin" ]; then
    743         dev=$(findall /sys/block -name dev | while read device ; do \
    744               echo "$majmin" | cmp -s $device && echo $device ; done \
    745               | sed -e 's,.*/\([^/]\+\)/dev,\1,' )
    746         if [ -n "$dev" ]; then
    747             vecho "Found root device $dev for $rootdev"
    748             rootdev=$dev
    749         fi
    750     fi
    751 else
    752     rootopts=$(echo $rootopts | sed -e 's/^r[ow],//' -e 's/,r[ow],$//' -e 's/,r[ow],/,/' \
    753                -e 's/^r[ow]$/defaults/' -e 's/$/,ro/')
    754 fi
    755 [ "$rootfs" != "nfs" ] && handlelvordev $rootdev
    756
    757 # find the first swap dev which would get used for swsusp
    758 swsuspdev=$(awk '/^[ \t]*[^#]/ { if ($3 == "swap") { print $1; }}' $fstab \
    759             | head -n 1)
    760 if ! echo $swsuspdev | cut -c1-6 | grep -q "LABEL=" ; then
    761     handlelvordev $swsuspdev
    762 fi

I have not tested on a system with a swap device referred to by UUID yet, but it looks like it won't work either. BTW, RHEL5.2 does not have such a problem.

5.2 didn't contain the code to wait for critical disks. Ok, I'll look into this.

Created attachment 312665 [details]
patch to resolve uuid's in forgotten cases
Here you go Cai, I've not had time to test this yet, but I think it should fix
your problem. In addition to the driver find routine that you pointed out in
your previous patch, it converts labels and UUIDs to actual device names
before storing them in the critical_disks file. I've also removed the swap
detection case, since we really shouldn't need to use swap on these
systems, even in the event that we boot to the root file system. If you could
try this out, I would appreciate it. Thanks!
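
The idea described above, resolving a LABEL= or UUID= spec to a device before its name is written to critical_disks, can be sketched roughly as follows. This is not the attached patch: `resolve_spec`, `disk_name` and `BY_ID_ROOT` are hypothetical names, and the lookup assumes udev-maintained symlinks under /dev/disk/by-uuid and /dev/disk/by-label rather than whatever mechanism the patch actually uses. `disk_name` mirrors the sed pipeline from line 392 of mkdumprd, written with the portable `[0-9][0-9]*` in place of `[0-9]\+`.

```shell
#!/bin/bash
# Sketch only: resolve_spec, disk_name and BY_ID_ROOT are illustrative names,
# not part of mkdumprd.

# Directory holding the by-uuid/ and by-label/ symlink trees (udev convention).
BY_ID_ROOT=${BY_ID_ROOT:-/dev/disk}

# Turn "UUID=..." or "LABEL=..." into a device path; pass anything else through.
resolve_spec() {
    local spec=$1 link
    case "$spec" in
        UUID=*)  link="$BY_ID_ROOT/by-uuid/${spec#UUID=}" ;;
        LABEL=*) link="$BY_ID_ROOT/by-label/${spec#LABEL=}" ;;
        *)       echo "$spec"; return 0 ;;
    esac
    # Fail if the symlink is missing instead of emitting a bogus name.
    [ -e "$link" ] || return 1
    readlink -f "$link"
}

# Same transformation as mkdumprd line 392: drop "/dev/" and the first run
# of digits, so a partition path becomes the name of its /sys/block entry.
disk_name() {
    echo "$1" | sed -e 's/\/dev\///' -e 's/[0-9][0-9]*//'
}

# disk_name /dev/sda1                                   prints: sda
# disk_name UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131   prints: UUID=aa24a07-5288-400a-8fa4-0bf9e58cd131
# disk_name "$(resolve_spec UUID=...)"                  resolves first, then strips
```

The second commented call reproduces the mangled entry reported earlier in this bug; resolving the spec first avoids it. Labels containing characters that udev encodes in symlink names (such as "/") are not handled by this sketch.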
Hmm, I got this after applying the patch,

    [root@localhost sbin]# service kdump restart
    Stopping kdump: [ OK ]
    No kdump initial ramdisk found. [WARNING]
    Rebuilding /boot/initrd-2.6.27-0.166.rc0.git8.fc10.x86_64kdump.img
    sed: -e expression #1, char 15: unterminated address regex
    Starting kdump: [ OK ]

Created attachment 312792 [details]
new mkdumprd patch
Oops, forgot to finish the rootdev regex. New patch. Thanks!
Same is true for Fedora 9. Findfs is not found in the kdump ramdisk. Could you please roll out a fixed busybox to Fedora 9, too? The patch for mkdumprd doesn't apply correctly to this version:

    kexec-tools.i386 1.102pre-10.fc9 installed

Could you please check the mkdumprd script in Fedora 9, too, and correct it where needed?

Ivana, looks like findfs in busybox does not work correctly,

    busybox-1.10.3-2.fc10.x86_64
    $ /sbin/findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131
    /dev/sda1
    $ /sbin/busybox findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131
    <nothing>

Additional information,

    # blkid /dev/sda1
    /dev/sda1: LABEL="OK" UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131" SEC_TYPE="ext2" TYPE="ext3"

Cai, you seem to be running findfs as non-root. Two questions:
1. Does it work under root?
2. Is your /sbin/findfs setuid root?
(it's surprising that non-root on your machine can even *see* /sbin/* directory contents, let alone run a program there...)

No, I was running it as root. Let me know if there is anything else I could provide.

findfs 1.40.4 can determine the device by UUID from non-root by reading /etc/blkid/blkid.tab. Example of a line from this file:

    <device DEVNO="0x0801" TIME="1219041198" LABEL="/boot" UUID="624b7b0d-2a0e-480a-a803-d6d15de302a0" SEC_TYPE="ext2" TYPE="ext3">/dev/sda1</device>

Strace of findfs:

    29058 10:07:19.372756 open("/etc/blkid/blkid.tab", O_RDONLY) = 4
    29058 10:07:19.372801 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
    29058 10:07:19.372860 fcntl64(4, F_GETFL) = 0 (flags O_RDONLY)
    29058 10:07:19.372899 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
    29058 10:07:19.372957 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fc9000
    29058 10:07:19.372997 _llseek(4, 0, [0], SEEK_CUR) = 0
    29058 10:07:19.373035 read(4, "<device DEVNO=\"0xfd01\" TIME=\"121"..., 4096) = 964
    29058 10:07:19.373118 read(4, "", 4096) = 0
    29058 10:07:19.373154 close(4) = 0
    29058 10:07:19.373189 munmap(0xb7fc9000, 4096) = 0
    29058 10:07:19.373230 open("/etc/blkid/blkid.tab", O_RDONLY) = 4
    29058 10:07:19.373274 fstat64(4, {st_mode=S_IFREG|0644, st_size=964, ...}) = 0
    29058 10:07:19.373331 close(4) = 0
    29058 10:07:19.373368 time(NULL) = 1219219639
    29058 10:07:19.373404 open("/dev/sda1", O_RDONLY) = -1 EACCES (Permission denied)
    29058 10:07:19.373453 fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 44), ...}) = 0
    29058 10:07:19.373516 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fc9000
    29058 10:07:19.373557 write(1, "/dev/sda1\n", 10) = 10
    29058 10:07:19.373613 exit_group(0) = ?

busybox findfs does not use /etc/blkid/blkid.tab, so it needs to be run as root, or the findfs applet should be made setuid.

(In reply to comment #19)
> No, I was running it as root. Let me know if there is anything else I could
> provide.
Please do

    strace -o bb_findfs.log busybox findfs UUID=1aa24a07-5288-400a-8fa4-0bf9e58cd131

and attach the resulting bb_findfs.log to this bug.

Fix for busybox findfs:

    diff -d -urpN busybox.9/include/applets.h busybox.a/include/applets.h
    --- busybox.9/include/applets.h 2008-08-20 10:19:40.000000000 +0200
    +++ busybox.a/include/applets.h 2008-08-20 10:47:38.000000000 +0200
    @@ -152,7 +152,7 @@ USE_FDISK(APPLET(fdisk, _BB_DIR_SBIN, _B
     USE_FETCHMAIL(APPLET_ODDNAME(fetchmail, sendgetmail, _BB_DIR_USR_BIN, _BB_SUID_NEVER, fetchmail))
     USE_FEATURE_GREP_FGREP_ALIAS(APPLET_ODDNAME(fgrep, grep, _BB_DIR_BIN, _BB_SUID_NEVER, fgrep))
     USE_FIND(APPLET_NOEXEC(find, find, _BB_DIR_USR_BIN, _BB_SUID_NEVER, find))
    -USE_FINDFS(APPLET(findfs, _BB_DIR_SBIN, _BB_SUID_NEVER))
    +USE_FINDFS(APPLET(findfs, _BB_DIR_SBIN, _BB_SUID_MAYBE))
     USE_FOLD(APPLET(fold, _BB_DIR_USR_BIN, _BB_SUID_NEVER))
     USE_FREE(APPLET(free, _BB_DIR_USR_BIN, _BB_SUID_NEVER))
     USE_FREERAMDISK(APPLET(freeramdisk, _BB_DIR_SBIN, _BB_SUID_NEVER))

Of course it works only if the busybox binary is setuid root (it typically should be).

Created attachment 314615 [details]
log as required
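
The cache lookup described earlier in this thread, where findfs reads /etc/blkid/blkid.tab instead of probing devices, can be approximated in shell. This is a rough sketch that assumes the one-entry-per-line format of the blkid.tab sample quoted above; `lookup_uuid_in_cache` is a hypothetical helper, not code from findfs or busybox.

```shell
#!/bin/bash
# Sketch only: lookup_uuid_in_cache is an illustrative helper, not findfs code.
# Usage: lookup_uuid_in_cache UUID [CACHE_FILE]
lookup_uuid_in_cache() {
    local uuid=$1 cache=${2:-/etc/blkid/blkid.tab} dev
    # Accept only hex digits and dashes, because the value is spliced into
    # a sed pattern below.
    case "$uuid" in
        ''|*[!0-9A-Fa-f-]*) return 2 ;;
    esac
    [ -r "$cache" ] || return 1
    # On the line whose UUID attribute matches, print the text between the
    # end of the opening tag and "</device>"; keep only the first hit.
    dev=$(sed -n "s|^<device[^>]* UUID=\"$uuid\"[^>]*>\(.*\)</device>\$|\1|p" "$cache" | head -n 1)
    [ -n "$dev" ] && echo "$dev"
}
```

Unlike the traced findfs, which goes on to open the device after reading the cache, this sketch trusts the cache entry as-is, so a stale blkid.tab would yield a stale answer.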
Thank you Denys. Reassigning to busybox owner for inclusion. Ivana, it looks like busybox still has a problem parsing UUIDs. Could you please review Denys' patch? Thanks

I downloaded and built a new busybox in /tmp/v/busybox-1.12.0 on your machine, and unlike busybox 1.10.3 in /sbin it works:

    # pwd
    /tmp/v/busybox-1.12.0
    # findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
    /dev/sda1
    # ./busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
    /dev/sda1
    # /sbin/busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
    #

I downloaded 1.10.3 and built it too in /tmp/v/busybox-1.10.3 with the same .config as 1.12.0, and it works too:

    # ./busybox findfs UUID="1aa24a07-5288-400a-8fa4-0bf9e58cd131"
    /dev/sda1

I can only conclude that the busybox binary you have in /sbin was built without CONFIG_FEATURE_VOLUMEID_EXT=y

Thanks Denys, the patch is OK, it is applied in busybox-1.10.3-3.fc10 now.

I have tried the latest busybox and the UUID kexec-tools patch, and confirmed it worked fine. I suppose the UUID patch has not been committed yet, so I am reopening this to let Neil finish that part. Thanks!

kexec updated in rawhide. (-16.f10)

Thanks guys

This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.