This bug has been copied from bug #549763 and has been proposed to be backported to 5.4 z-stream (EUS).
Created attachment 394164 [details] x86 fcoe storage results
Created attachment 394165 [details] x64 fcoe storage results
The driver is not preserved across kernel updates. I tested two ways.

First method:

A) Install RHEL 5.4 GA on cisco-ca-blade1.rhts.eng.bos.redhat.com.

B) Install the DUP driver with the following steps:

[root@cisco-ca-blade1 ~]# wget http://people.redhat.com/jolsa/dup/be2net-lpfc/dd-lpfc-rhel5u4-8.2.0.63-1.0el5.x86_64.iso.gz
[root@cisco-ca-blade1 ~]# gzip -d dd-lpfc-rhel5u4-8.2.0.63-1.0el5.x86_64.iso.gz
[root@cisco-ca-blade1 ~]# mkdir mnt
[root@cisco-ca-blade1 ~]# mount -o loop ./dd-lpfc-rhel5u4-8.2.0.63-1.0el5.x86_64.iso ./mnt/
[root@cisco-ca-blade1 ~]# find ./mnt/ -name *.rpm
./mnt/rpms/2.6.18-164.el5/x86_64/kmod-lpfc-xen-rhel5u4-8.2.0.63-1.0el5.x86_64.rpm
./mnt/rpms/2.6.18-164.el5/x86_64/kmod-lpfc-rhel5u4-8.2.0.63-1.0el5.x86_64.rpm
[root@cisco-ca-blade1 ~]# rpm -ivh mnt/rpms/2.6.18-164.el5/x86_64/kmod-lpfc-rhel5u4-8.2.0.63-1.0el5.x86_64.rpm

C) Verify that the DUP driver is installed, then install the rhel5.4.z kernel 2.6.18-164.11.1.el5.

D) Reboot to the rhel5.4.z kernel. The lpfc driver is still the old driver (i.e. 0:8.2.0.48.2p), not the DUP driver.

But if I change the search line in /etc/depmod.d/depmod.conf.dist from "search updates extra built-in weak-updates" to "search updates extra weak-updates built-in", then remove the rpm and install it again, the DUP driver is installed.

Second method: install RHEL 5.4 GA, then install rhel5.4.z, then install the DUP driver. After rebooting to the rhel5.4.z kernel, the DUP driver is not installed.

Is this a problem?
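For anyone reproducing the workaround, the search-line change can be sketched roughly as below. This is only an illustration: the function name prefer_weak_updates is hypothetical, and the demonstration runs against a scratch file rather than the real /etc/depmod.d/depmod.conf.dist.

```shell
#!/bin/sh
# Sketch only: prefer_weak_updates is a hypothetical helper, not part of
# module-init-tools. It rewrites the "search" line of the depmod config
# file given as $1 and keeps a copy of the original as $1.bak.
prefer_weak_updates() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    sed 's/^[[:space:]]*search[[:space:]].*/search updates extra weak-updates built-in/' \
        "$conf.bak" > "$conf"
}

# Demonstration on a scratch copy rather than the real config file.
demo=$(mktemp)
printf '# comment line\nsearch updates extra built-in weak-updates\n' > "$demo"
prefer_weak_updates "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

In the test above the kmod RPM also had to be removed and reinstalled after the edit. Whether regenerating the module dependencies alone would be enough was not tested here.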
Your description matches my understanding of how the KMOD driver binary RPMs work.

A driver binary RPM is built for a specific kernel rev. The DUDs you are using contain the LPFC 8.2.0.63 driver binary RPMs built for the RHEL5.4 GA kernel, 2.6.18-164.el5. When you install one of these KMOD RPMs, the RPM install process puts the driver in “/lib/modules/<kernel_version>/extra/”, where <kernel_version> is the specific kernel version it was built for (2.6.18-164.el5 in this case).

The module-init-tools also creates soft links from “/lib/modules/<other_version>/weak-updates/lpfc.ko” to the real lpfc.ko (under <version>/extra). These soft links are created for any kernel that has a compatible kABI and can therefore load the lpfc.ko file (rhel5.4.z kernel 2.6.18-164.11.1.el5 in this case).

The loading precedence of the various drivers in a given RHEL5 kernel's /lib/modules directory is controlled by “/etc/depmod.d/depmod.conf.dist”, which is part of the module-init-tools RPM. As you mentioned, the depmod.conf.dist file contains a few comments and one real line:

search updates extra built-in weak-updates

So, according to this default load precedence, when you load the rhel5.4.z kernel 2.6.18-164.11.1.el5, because the updated 8.2.0.63 lpfc.ko driver was installed in "weak-updates", the "built-in" or in-box LPFC driver takes precedence (from /lib/modules/2.6.18-164.11.1.el5/kernel/drivers/scsi/lpfc/lpfc.ko).

The way to work around this in the updated rhel5.4.z kernel 2.6.18-164.11.1.el5, as you correctly stated, is to modify the search line in the “/etc/depmod.d/depmod.conf.dist” file.

Is this a problem? I don't know; this appears to be the way the KMOD built drivers work. The problem is, as you found out, that if you update to a newer kernel (i.e. an errata kernel) and you want a previously installed driver binary RPM to load by default on reboot, you'll need to change the search/load precedence. Either that, or rebuild the driver binary RPM for the new/updated kernel rev.
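The precedence rule described above can be modelled with a small sketch. This is a simplified illustration, not depmod's actual implementation: pick_module_dir is a hypothetical helper that walks a search order and returns the first location that holds a copy of the module, with "built-in" standing for the in-box copy under kernel/.

```shell
#!/bin/sh
# Sketch only: a simplified model of the search precedence, not depmod itself.
# $1 = search order, e.g. "updates extra built-in weak-updates"
# $2 = locations that hold a copy of the module, e.g. "built-in weak-updates"
pick_module_dir() {
    search_order=$1
    available=$2
    # Unquoted on purpose: both arguments are space-separated lists.
    for dir in $search_order; do
        for have in $available; do
            if [ "$dir" = "$have" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Errata kernel: in-box lpfc plus the weak-updates link to the 8.2.0.63 driver.
pick_module_dir "updates extra built-in weak-updates" "built-in weak-updates"
pick_module_dir "updates extra weak-updates built-in" "built-in weak-updates"
```

With the default order the first call prints built-in; with the reordered line the second call prints weak-updates, which matches the behaviour reported in this bug.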
This does appear to be a weakness of the KMOD process, though. The general advantage of building a KMOD driver binary RPM for a RHEL5 kernel is that the driver binary RPM is automatically supported and loaded on all subsequent kernel updates. In practice this is not transparent on kernel updates, and manual intervention is needed.
I thought there was an "override" option in the depmod file that would help address this...
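For reference, depmod.conf(5) documents an "override" directive of the form "override <module> <kernel version> <subdirectory>". A rough sketch of what such a line might look like for this case follows. It has not been tested here: whether the module-init-tools shipped with RHEL 5.4 honours it for weak-updates is an assumption, and override_line is a hypothetical helper that only prints the line.

```shell
#!/bin/sh
# Sketch only: prints a depmod.conf(5) "override" line.
# Syntax per the man page: override <module> <kernel version> <subdirectory>
override_line() {
    printf 'override %s %s %s\n' "$1" "$2" "$3"
}

# Assumption: prefer the weak-updates copy of lpfc for the errata kernel.
override_line lpfc 2.6.18-164.11.1.el5 weak-updates
```

Such a line would go into a file under /etc/depmod.d/ and would leave the global search order untouched.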
Hi Gregg,

When new lpfc is loaded, console keeps printing following messages:

Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0310 Mailbox command x5 timeout Data: x0 x700 xffff81036a159200
Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0345 Resetting board due to mailbox timeout
Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:(0):2530 Mailbox command x23 cannot issue Data: xd00 x2

Is this a problem?

thanks.
(In reply to comment #12)
> Hi Gregg,
>
> When new lpfc is loaded, console keeps printing following messages:
>
> Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0310 Mailbox
> command x5 timeout Data: x0 x700 xffff81036a159200
> Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0345 Resetting
> board due to mailbox timeout
> Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:(0):2530 Mailbox
> command x23 cannot issue Data: xd00 x2
>
> Is this a problem?
>
> thanks.

Vaios,

Can you comment on this?

Thanks,
Rob
Vaios, Gregg, can either of you comment on this?

Alex He
What this trace snippet tells us is that the init_link mailbox command timed out, which triggered the mailbox command timeout handler to reset the HBA, and then the following unreg_did mailbox command got rejected. Without the entire trace (/var/log/messages file) it's hard to tell what happened and what caused this.

Could you please attach the /var/log/messages file?

Also, when you say "When new lpfc is loaded...", what is the version of the new LPFC driver? How are you loading this "new lpfc driver"? Did you modify the search line in the “/etc/depmod.d/depmod.conf.dist” file?

By the way, what is the HBA used in this configuration?

-Vaios-
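When going through a larger /var/log/messages file, a small filter along these lines may help pull out the lpfc message numbers (0310, 0345, 2530 in the snippet above). It is only a sketch: lpfc_msg_ids is a hypothetical helper, and it assumes the syslog layout shown in this bug, with the message token two fields after "lpfc".

```shell
#!/bin/sh
# Sketch only: lpfc_msg_ids is a hypothetical helper. It prints the lpfc
# message number from each log line in the file given as $1, assuming the
# token after the PCI address looks like "1:0310" or "1:(0):2530".
lpfc_msg_ids() {
    awk '{
        for (i = 1; i < NF - 1; i++) {
            if ($i == "lpfc") {
                # The message number is the last colon-separated part.
                n = split($(i + 2), parts, ":")
                print parts[n]
                break
            }
        }
    }' "$1"
}

# Demonstration using the three lines reported in this bug.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0310 Mailbox command x5 timeout Data: x0 x700 xffff81036a159200
Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:0345 Resetting board due to mailbox timeout
Feb 26 00:56:35 cisco-ca-blade1 kernel: lpfc 0000:04:00.1: 1:(0):2530 Mailbox command x23 cannot issue Data: xd00 x2
EOF
lpfc_msg_ids "$sample"
rm -f "$sample"
```

Piping the output through sort and uniq -c would give a count per message number, which makes repeated resets easier to spot.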
(In reply to comment #15)
> What this trace snippet tells us is that init_link mailbox command timed out,
> which triggered mailbox command timeout handler reset the HBA, and then the
> following unreg_did mailbox command got rejected.
> Without the entire trace (/var/log/messages file) it's hard to tell what
> happened and what caused this.
>
> Could you please attach the /var/log/messages file?
>
> Also, when you say "When new lpfc is loaded...", what is the version of the new
> LPFC driver?

8.2.0.63

> How are you loading this "new lpfc driver", did you modify the
> search line in “/etc/depmod.d/depmod.conf.dist” file ?

I modified the file so that the new driver is loaded, and I loaded it with "modprobe lpfc".

> By the way, what is the HBA used in this configuration?

Sorry, I did not check the HBA, and the information is not available from the machine cisco-ca-blade1.rhts.eng.bos.redhat.com right now, since its network and console are not working. I have asked the lab admin to have a look at it and will update you as soon as the machine is OK.
Created attachment 400540 [details] Final DUPs Logs Output for x86 and x86_64 systems with DUP installed and be2net/lpfc/rpm/etc. cmds executed.
Created attachment 400567 [details] final dup x86 lpfc logs
Created attachment 400568 [details] final dup x64 lpfc logs
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2010-0156.html