Description of problem:
mkdumprd has multiple problems on my F13 system and does not generate any initrd. Recently I installed VirtualBox and the boot hung. I traced it back to a hang in depsolve_modlist.

Version-Release number of selected component (if applicable):
Fedora-13
kexec-tools.x86_64 2.0.0-36.fc13
VirtualBox-OSE.x86_64 3.2.6-2.fc13
kmod-VirtualBox-OSE-2.6.33.8-149.fc13.x86_64.x86_64 3.2.6-1.fc13.4

How reproducible:
Always (under the stated conditions; normally a latent defect)

Steps to Reproduce:
1. yum install VirtualBox-OSE.x86_64
2. rm -f /boot/initrd*kdump.img
3. chkconfig kdump on
4. reboot

Actual results:
Hang while trying to recreate the initrd. In particular, it loops incessantly with TMPINMODS stuck on the following entries, which cannot be resolved:
/lib/modules/2.6.33.8-149.fc13.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.33.8-149.fc13.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko
Both of these modules exist on my system.

Expected results:
No hang.

Additional info:
If your initrd already exists, this remains a latent problem until the initrd needs to be rebuilt. On my system there are other issues (improper parsing of mdadm.conf, and calls to the now-gone "nash") that prevent building an initrd.
Obviously, it is more productive to test as follows (rather than actually failing at boot). That is what I did once I encountered the problem:
1. rm -f /boot/initrd*kdump.img
2. service kdump start
Created attachment 442374
mkdumprd from RHEL6

I've been pretty focused on RHEL6 work lately, and I think I've already fixed this there. Can you please try this mkdumprd script (just copy it over the F-13 mkdumprd)? I expect that will fix it.
I tried the version you posted, but unfortunately it exhibits the same failure mode. The script gets stuck in an infinite loop with TMPINMODS holding the following values:
/lib/modules/2.6.33.8-149.fc13.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.33.8-149.fc13.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko

It seems to me that the outer loop is a hang just waiting to happen. There should be some check for "no progress" that breaks the loop, perhaps comparing the new value of TMPINMODS with the previous value and breaking out of the loop if there is no change.
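The "no progress" check suggested above could look roughly like the following. This is only a minimal sketch, not the actual mkdumprd code: resolve_pass and solve_with_progress_check are hypothetical names, and resolve_pass is a stand-in for one pass of the real dependency solver. Only the TMPINMODS variable name comes from the script.

```shell
# Hypothetical stand-in for one solver pass: it "resolves" every module
# except those whose path contains "vboxnet", which it prints back out
# as still pending.
resolve_pass() {
    local mod
    for mod in "$@"; do
        case "$mod" in
            *vboxnet*) echo "$mod" ;;   # cannot be resolved, stays pending
        esac
    done
}

# Repeat passes until the pending list is empty or stops changing.
# Returns 0 when everything was resolved, 1 when a pass made no progress.
solve_with_progress_check() {
    local TMPINMODS="$*" prev=""
    while [ -n "$TMPINMODS" ]; do
        prev="$TMPINMODS"
        # Word splitting of $TMPINMODS is intentional: one argument per module.
        TMPINMODS=$(resolve_pass $TMPINMODS | tr '\n' ' ')
        TMPINMODS="${TMPINMODS% }"      # drop the trailing space left by tr
        if [ "$TMPINMODS" = "$prev" ]; then
            echo "no progress, unresolved: $TMPINMODS" >&2
            return 1
        fi
    done
    return 0
}
```

With a guard like this the script would fail with a message naming the unresolved modules instead of spinning forever, which also makes the underlying dependency problem easier to report.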
So, I just tried to recreate this on my F-13 system here, and was unable to. I installed VirtualBox 3.2.8 from here:
http://download.virtualbox.org/virtualbox/3.2.8/VirtualBox-3.2-3.2.8_64453_fedora13-1.i686.rpm
and installed the latest F-13 kexec-tools (kexec-tools-2.0.0-36). I modprobed the vbox modules and started kdump. The initramfs built without error or hang. I verified that the generated initramfs contained and loaded the vbox drivers as well.

So I'm not quite sure what's going on here. I did notice that during the install my vbox drivers were installed to /lib/modules/`uname -r`/misc/ instead of /lib/modules/`uname -r`/extra/VirtualBox-OSE/. I'm not sure why that would make a difference though, as long as modprobe knows how to find the modules.

Can you please edit /sbin/mkdumprd to add a set -x to the top of the depsolve_modlist function and a set +x to the bottom of the function? Then restart the service, let it run for a few minutes, ctrl-c out of it, and send in the output log please.

Thanks!
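For reference, the placement of the tracing calls requested above can be sketched as follows. The function body here is a hypothetical stand-in (depsolve_demo is not part of mkdumprd); only the position of set -x and set +x is the point.

```shell
# Hypothetical stand-in for depsolve_modlist. It just counts its arguments.
depsolve_demo() {
    set -x                      # start tracing: each command is echoed to stderr
    local count=0 mod
    for mod in "$@"; do
        count=$((count + 1))
    done
    set +x                      # stop tracing before the function returns
    echo "$count"
}
```

The trace is written to stderr, so when running the function by hand it can be captured with a redirect such as `depsolve_demo a.ko b.ko 2> trace.log`, leaving the normal output on stdout.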
Created attachment 442763
Output of mkdumprd with tracing on in key function
Neil: Several notes:

1) I just realized I am getting VirtualBox-OSE from rpmfusion-free-updates. I had not paid attention to that, just thinking, "cool, it is in Fedora". My VirtualBox-OSE version is 3.2.6-2.fc13.

2) My vboxdrv files are going into /lib/modules/<uname>/extra/... as shown:
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxdrv.ko
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxsf.ko
/lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxvideo.ko

3) I enabled tracing in depsolve_modlist, and also added "echo TMPINMODS=$TMPINMODS" at the top of the loop to help determine when we reached the non-productive point. I deleted the remainder of the log file after a few non-productive iterations had occurred. (See attached.)
Ok, I think I see what's happening here. It would appear that we're never satisfying any of the requirements to put any of the vbox* modules on the output module list, which means that we think we've never solved their dependencies. Can you please attach the output of this command:

for i in /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/*.ko
do
    echo "OUTPUT FOR $i"
    mname=`basename $i | sed -e 's/\.ko$//'`
    modprobe --show-depends $mname 2>/dev/null | awk '/insmod/ {print $2}'
    echo " "
done

That should let me figure out how I'm parsing the deptree for these modules differently.

Thanks!
Here is the output you requested:

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxdrv.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxdrv.ko

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxdrv.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxsf.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxguest.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxsf.ko

OUTPUT FOR /lib/modules/2.6.34.6-47.fc13.x86_64/extra/VirtualBox-OSE/vboxvideo.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/kernel/drivers/i2c/i2c-core.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/kernel/drivers/gpu/drm/drm.ko
/lib/modules/2.6.34.6-54.fc13.x86_64/extra/VirtualBox-OSE/vboxvideo.ko
Line wrapping often broke the "OUTPUT FOR XXX" into two lines above.
So I think I found at least part of the problem. When you start the kdump service, can you tell me if the vboxguest.ko module is loaded? Looking at the output dump you provided, vboxguest never shows up on the input list of modules to add, which, according to the above list, it should if you have any of the vbox modules loaded. What's happening is that we're waiting to solve a dependency which can never be solved, because the module we depend on isn't on the input list. So we need to figure out why that module isn't on the input list. Can you please:

1) check that vboxguest.ko is loaded when the kdump service starts
2) make sure that it's not changing its name during registration.

Thanks!
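One way to make the first check above, sketched under the assumption that the list of loaded modules is taken from /proc/modules (the helper name module_loaded is hypothetical and not part of kexec-tools):

```shell
# Succeeds if the named module appears in /proc/modules-style input on stdin.
# Loaded module names are reported with underscores, so dashes in the
# requested name are normalized before comparing.
module_loaded() {
    local name
    name=$(echo "$1" | tr '-' '_')
    # The first field of each /proc/modules line is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }'
}
```

Typical usage would be `module_loaded vboxguest < /proc/modules && echo loaded`, run at the point where the kdump service starts.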
Neil -- I am getting back to this issue, but in the interim I have upgraded from Fedora-13 to Fedora-14. This has introduced some more snags, as follows:

1) The F14 mkdumprd still calls "nash", so I used your attachment of 2010-09-02.

2) However, that script fails with:
"grep: character class syntax is [[:space:]], not [:space:]"
I fixed the typo at line 1352, and then got the following:
No module ARRAY found for kernel 2.6.35.9-64.fc14.x86_64, aborting.

This sounded very familiar, and I found https://bugzilla.redhat.com/show_bug.cgi?id=479211 again. I am indeed running with an MD raid; here is my /etc/mdadm.conf:

# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 UUID=7a4eb642:0041f6a6:25eac650:1e2f64f0
ARRAY /dev/md127 UUID=761c32bf:2f4e01eb:c7b3da57:a14787fa

Any suggestions as to how to proceed past this little speed bump? I have NOT yet re-installed VirtualBox -- one issue at a time :-)
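The grep error quoted above comes from using a POSIX character class outside a bracket expression. The actual line 1352 of mkdumprd is not shown in this report, so the following is only an illustrative sketch of the corrected syntax, applied to mdadm.conf-style input (count_array_lines is a hypothetical helper):

```shell
# Count ARRAY lines in mdadm.conf-style input read from stdin.
# [:space:] is only valid inside a bracket expression, hence [[:space:]].
count_array_lines() {
    grep -c '^[[:space:]]*ARRAY[[:space:]]'
}
```

For the mdadm.conf shown above, `count_array_lines < /etc/mdadm.conf` would print 2. Note that grep -c still prints 0 but exits non-zero when nothing matches.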
You should just be able to apply the patch here:
https://bugzilla.redhat.com/attachment.cgi?id=328595
to /sbin/mkdumprd and that problem will be fixed. I'll commit it shortly.
Actually, scratch that. Are you sure your upgrade replaced the contents of /sbin/mkdumprd properly? I ask because the changes that caused that problem above have been removed from F14, so that patch shouldn't be needed. Can you try removing kexec-tools, verifying that /sbin/mkdumprd* is gone, then re-installing?
Neil -- My "upgrade" was really a fresh install (including a reformat of the root partition). So my comment #11 was based on starting with the latest F14 copy of kexec-tools, namely kexec-tools-2.0.0-39.fc14.1.x86_64. I tried to apply the patch in comment #12 to the F14 mkdumprd, but they were not compatible. By hand I was able to apply the first two changes, but the third was not consistent with the existing F14 mkdumprd.
Created attachment 468920
modified version of patch from comment 12

Here you go. I took a look at the patch and noted some discrepancies that needed to be updated. I still don't think it should have caused a hang, but the resultant finds for modules might have taken a hideously long time with the lack of fmPath modifications. Anywho, let me know how this works for you.
Neil -- I applied the patch in Comment 15 to my Fedora-14 mkdumprd (kexec-tools 39.fc14.1). The patch applies fine, but "mkdumprd" dies with the previously discussed "nash: command not found". Here is what I did and the output:

patch < patch-file
rm /boot/initrd-2.6.35.9-64.fc14.x86_64kdump.img
service kdump start
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.35.9-64.fc14.x86_64kdump.img
IN HANDLERAID
/sbin/mkdumprd: line 952: nash: command not found
Starting kdump: [FAILED]

I'm mystified as to how this works for you.

Regards, Charlie
Created attachment 469123
updated patch

It works for me because I don't nominally use software raid. I'm not testing this at the moment because I don't have time, and I'm trying to help you as best I'm able with the cycles I have available. Here's an updated patch that fixes the nash problem as well.
Neil -- I applied the patch in Comment 17. That works better, although there was a dangling reference to "emitdms" at line 1819 of the patched mkdumprd. I commented that out and we seem to build a dump.img successfully. I'm about to reinstall VirtualBox, but wanted to report the success thus far.
Ok, I'll fix the dangling reference and update the netdump-server package with this bz. If you have subsequent problems, please open a new bug. Thanks
Neil -- I just installed VirtualBox-OSE.x86_64 (3.2.10-1.fc14 from rpmfusion-free-updates). The infinite loop that we had encountered in the past is still present. The 2 modules that cannot be found are:
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko

Interestingly, a 3rd VirtualBox module IS found:
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko

I could see no mention of vboxguest.ko in the module names added to "TMPINMODS". Here is the entire initial list of modules:
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/fs/fuse/fuse.ko
/sbin/modprobe
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/fs/nfs_common/nfs_acl.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/sunrpc/auth_gss/auth_rpcgss.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/fs/exportfs/exportfs.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/fs/lockd/lockd.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/cpufreq/cpufreq_ondemand.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/cpufreq/freq_table.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/arch/x86/kernel/cpu/cpufreq/mperf.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/netfilter/nf_conntrack_netbios_ns.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/ipv6/netfilter/ip6t_REJECT.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/ipv6/netfilter/nf_conntrack_ipv6.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/ipv6/netfilter/ip6table_filter.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/ipv6/netfilter/ip6_tables.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/net/ipv6/ipv6.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxnetadp.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxnetflt.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/extra/VirtualBox-OSE/vboxdrv.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/arch/x86/kvm/kvm-intel.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/arch/x86/kvm/kvm.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/input/misc/uinput.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/pci/hda/snd-hda-codec-analog.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/pci/hda/snd-hda-intel.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/pci/hda/snd-hda-codec.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/snd-hwdep.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/seq/snd-seq.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/seq/snd-seq-device.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/char/ppdev.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/watchdog/iTCO_wdt.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/snd-timer.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/i2c/busses/i2c-i801.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/net/tg3.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/parport/parport_pc.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/snd.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/watchdog/iTCO_vendor_support.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/edac/i7core_edac.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/parport/parport.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/platform/x86/dell-wmi.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/edac/edac_core.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/input/serio/serio_raw.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/firmware/dcdbas.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/soundcore.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/sound/core/snd-page-alloc.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/input/joydev.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/platform/x86/wmi.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/arch/x86/kernel/microcode.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/md/raid1.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/gpu/drm/nouveau/nouveau.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/gpu/drm/ttm/ttm.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/gpu/drm/drm_kms_helper.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/gpu/drm/drm.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/i2c/algos/i2c-algo-bit.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/acpi/video.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/video/output.ko
/lib/modules/2.6.35.9-64.fc14.x86_64/kernel/drivers/i2c/i2c-core.ko
Whoops, Comments 19 and 20 collided. I bailed out and resubmitted 20, then read 19. Should I open a new bz for the VirtualBox issue, or did you mean anything other than the VirtualBox issue?

Regards -- Charlie
netdump-server-0.7.16-23.el5 has been submitted as an update for Fedora EPEL 5. https://admin.fedoraproject.org/updates/netdump-server-0.7.16-23.el5
netdump-server-0.7.16-23.el5 has been pushed to the Fedora EPEL 5 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update netdump-server'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/netdump-server-0.7.16-23.el5
netdump-server-0.7.16-23.el5 has been pushed to the Fedora EPEL 5 stable repository. If problems still persist, please make note of it in this bug report.