Description Charlotte Richardson
2011-07-05 19:23:25 UTC
Description of problem:
This is related to Bug 626606. mkdumprd can still hang in depsolve_modlist.
Version-Release number of selected component (if applicable):
kexec-tools-2.0.0-188.el6.x86_64
How reproducible:
Seen many times here. Believe I understand what is happening now.
Steps to Reproduce:
1. Install the stock RH 6.1 kernel.
2. Install our expedient kernel which has a necessary md fix for Bug 707268.
3. Without rebooting, install our occluding versions of drivers which pick up bug fixes our customers need. One set of these is the fusion drivers, mptbase.ko, mptsas.ko, and mptscsih.ko. Our mptsas introduces a dependency on scsi_transport_sas which the original driver did not have.
4. Still without rebooting, add some required kdump pre and post scripts and restart the kdump service. This runs mkdumprd against the expedient kernel. This is done by our installation scripts.
Actual results:
mkdumprd hangs looping in depsolve_modlist, unable to deal with the dependencies of the three fusion drivers listed above. The input module list never empties out.
Expected results:
mkdumprd should complete and build the kdump initrd for the expedient kernel.
Additional info:
The problem appears to be due to the way mkdumprd uses modprobe --show-depends. In some places it passes --set-version $kernel so that modprobe runs against the kernel the initrd is being built for; in other places it omits that parameter, so modprobe uses the running kernel instead. In particular:
moduledep does not use --set-version
depsolve_modlist does not use --set-version (in two places)
findmodule does use --set-version
findstoragedriverinsys does use --set-version
findnetdriver does use --set-version
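For illustration, here is a minimal sketch of a dependency lookup that always targets the kernel being built for. The function name kernel_deps is hypothetical and is not part of mkdumprd; only the modprobe options --set-version and --show-depends come from the report above.

```shell
# Hypothetical helper (not from mkdumprd): list the .ko paths that modprobe
# would load for a module, resolved against an explicit kernel version
# rather than the running kernel.
#   $1 = target kernel version (the kernel the kdump initrd is built for)
#   $2 = module name
kernel_deps() {
    modprobe --set-version "$1" --show-depends "$2" 2>/dev/null |
        awk '/^insmod/ { print $2 }'
}
```

Calling it as kernel_deps "$kernel" mptsas would print one module path per line for that kernel. Routing every lookup through a single helper like this is one way to avoid mixing dependency data from two different kernels.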
Created attachment 511448
Proposed patch
Does adding '--set-version $kernel' in all of those places, as this patch does, fix the problem?
Comment 3 Charlotte Richardson
2011-07-08 12:47:58 UTC
I'm still trying to verify it; I'll get back to you.
Comment 4 Charlotte Richardson
2011-07-08 19:33:59 UTC
The patch is on the right track, but it does not completely fix the problem of using data for the modules of the wrong kernel when more than one kernel is installed. If you are doing mkdumprd for a kernel other than the one that is booted, the lsmod in /sbin/mkdumprd still gets data for the booted kernel, not the one you are building the kdump initrd for.
What I have installed right now on my system is a stock 6.1 kernel and our expedient kernel with the required fix in it, and the expedient kernel is the one that is booted. The expedient kernel also has the driver modules that we have to patch occluded (that is, there are .ko files of the same names in /lib/modules/2.6.32-131.../extra/...) because I installed them there, but the stock kernel does not. It happens that modprobe --show-depends mptsas shows four things for the stock kernel but five for the expedient kernel, since our fix added a reference to a symbol in scsi_hbas.ko. As a result, the loop in depsolve_modlist never empties the incoming list and loops forever.
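The following toy sketch shows one way a dependency mismatch can keep a worklist loop from emptying. It is not the code of depsolve_modlist: the dependency table, the function names, and the no-progress guard are all assumptions made for illustration.

```shell
# Toy dependency table standing in for `modprobe --show-depends` output.
# The entries are illustrative assumptions, not real modprobe data.
deps_of() {
    case "$1" in
        mptsas)   echo "mptbase mptscsih scsi_transport_sas" ;;
        mptscsih) echo "mptbase" ;;
        *)        echo "" ;;
    esac
}

# Succeeds if word $1 appears in the space-separated list $2.
in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Move modules from the pending list to the ordered list once all of their
# dependencies are already ordered. If a full pass moves nothing, report the
# stuck modules and return 1 instead of looping forever.
resolve_order() {
    pending=$1
    ordered=""
    while [ -n "$pending" ]; do
        next=""
        progressed=0
        for m in $pending; do
            ready=1
            for d in $(deps_of "$m"); do
                in_list "$d" "$ordered" || ready=0
            done
            if [ "$ready" -eq 1 ]; then
                ordered="${ordered:+$ordered }$m"
                progressed=1
            else
                next="${next:+$next }$m"
            fi
        done
        if [ "$progressed" -eq 0 ]; then
            echo "stuck: $next" >&2
            return 1
        fi
        pending=$next
    done
    echo "$ordered"
}
```

With the input list "mptbase mptscsih mptsas scsi_transport_sas" the loop drains normally. With "mptbase mptscsih mptsas", where the extra dependency is reported but never present in the input, mptsas can never become ready; without the progress guard the while loop would spin indefinitely, which resembles the hang described here.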
Comment 5 Charlotte Richardson
2011-07-11 19:27:46 UTC
Never mind about the lsmod; it's OK as is. I finally reproduced the exact scenario, and your patch (adding --set-version $kernel in the three places where it was originally missing) does fix it.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2011-1532.html