There is still a bug here. The priority callouts need to be static, or else they will still lock up reading the library files.
I made all the non-static priority callouts simply symlinks to the static ones. This fixes the problem transparently to the user.
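For anyone who wants to see what that change amounts to, here is a rough sketch in shell. It is not the actual packaging change (that is not shown in this report). The function name link_callouts_to_static and its directory argument are made up for illustration, and it should be tried on a scratch directory rather than a live /sbin.

```shell
# Sketch only: point each non-static callout name at its .static counterpart.
# "dir" is a hypothetical parameter; use a scratch directory when experimenting.
link_callouts_to_static() {
    dir=$1
    for static in "$dir"/mpath_prio_*.static; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$static" ] || continue
        name=$(basename "$static" .static)
        # Replace whatever is at the non-static name with a relative symlink.
        ln -sf "$name.static" "$dir/$name"
    done
}
```

Because the link replaces the old file in place, an existing configuration that names the non-static callout keeps working without any edit, which is the "transparent to the user" part.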
*** Bug 431994 has been marked as a duplicate of this bug. ***
So what is our recommended setting in /etc/multipath.conf? We've been using /sbin/mpath_prio_[netapp|ontap]. The .static binary has never been used in the config files anywhere, IIRC. Why not just have the normal binaries static?
Ben, are you planning a new errata release that would make all the non-static callouts symlinks to the static ones? Or would it simply ship only static callouts (without the .static extension)?
In response to comment #10: Yes, there is a new errata ready for QA that makes all the non-static callouts links to the static ones. The callouts with the .static extension need to be there for mkinitrd to work correctly. Eventually, we can remove them and just compile the regular ones statically.
In response to comment #9: /sbin/mpath_prio_ontap is fine if you'd like, but any one of them will work just fine.
    /sbin/mpath_prio_ontap
    /sbin/mpath_prio_netapp
    /sbin/mpath_prio_netapp.static
are all just symlinks to /sbin/mpath_prio_ontap.static.
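To check this on an installed system, something like the following should do. It is only a sketch: show_callout_targets and its directory argument are hypothetical names, and it assumes a readlink that supports -f (GNU coreutils does).

```shell
# Sketch only: print each callout name and the file it finally resolves to.
# "dir" is a hypothetical parameter, e.g. /sbin on an installed system.
show_callout_targets() {
    dir=$1
    for path in "$dir"/mpath_prio_*; do
        # Skip the unexpanded glob and any dangling links.
        [ -e "$path" ] || continue
        # readlink -f follows the whole symlink chain to the real file.
        target=$(readlink -f "$path")
        printf '%s -> %s\n' "$(basename "$path")" "$(basename "$target")"
    done
}
```

Running show_callout_targets /sbin lists every mpath_prio_* name next to the file it ends up at, so a callout that is still a separate dynamic binary shows up as resolving to itself.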
Added to the RHEL 5.2 release notes under "Resolved Issues": <quote> The priority callouts of dm-multipath are now statically compiled. This fixes a problem that occurs when running dm-multipath on devices containing the root filesystem, which caused such devices to freeze during fibre-channel path faults. </quote> Please advise if any further revisions are required. Thanks!
Currently, scanning for new devices is done by running the following commands:
    # echo "1" > /sys/class/fc_host/host<AdapterNo>/issue_lip
    # echo "- - -" > /sys/class/scsi_host/host<AdapterNo>/scan
This is repeated for all host HBA ports. Would these commands remain the same for a root device multipath scenario? On my RHEL 5.1 root device multipathed host with QLogic adapters, the host seems to freeze when the above commands are run. The console also throws up messages like the one listed below during the freeze:
    BUG: soft lockup detected on CPU#0!
    Call Trace:
     <IRQ> [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
     [<ffffffff800930e2>] update_process_times+0x42/0x68
     [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
     [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
     [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
     <EOI> [<ffffffff880941fc>] :scsi_mod:scsi_device_dev_release+0x0/0x16
     [<ffffffff80140dc7>] kobject_release+0x0/0x9
     [<ffffffff8812ef42>] :scsi_transport_fc:fc_user_scan+0x23/0x8b
     [<ffffffff8812ef7a>] :scsi_transport_fc:fc_user_scan+0x5b/0x8b
     [<ffffffff8809497c>] :scsi_mod:store_scan+0x9b/0xc5
     [<ffffffff800ff5d6>] sysfs_write_file+0xb9/0xe8
     [<ffffffff800161c7>] vfs_write+0xce/0x174
     [<ffffffff80016a94>] sys_write+0x45/0x6e
     [<ffffffff8005b28d>] tracesys+0xd5/0xe0
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0128.html
Please ignore comment #15. The freeze was hit in my case because I had issued a LIP reset before the SCSI scan, which was actually not required. Once that was removed, the dynamic rescan worked fine on the root device multipathed host.
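Put as a script, the rescan without the LIP could look roughly like this. It is a sketch, not a supported tool: rescan_scsi_hosts and its sysfs_root argument are hypothetical, it must be run as root on a real system, and it deliberately never touches issue_lip.

```shell
# Sketch only: request a SCSI rescan on every host, with no LIP beforehand.
# "sysfs_root" is a hypothetical parameter that defaults to /sys.
rescan_scsi_hosts() {
    sysfs_root=${1:-/sys}
    for scan in "$sysfs_root"/class/scsi_host/host*/scan; do
        # Skip the unexpanded glob when no SCSI hosts are present.
        [ -e "$scan" ] || continue
        # "- - -" is the wildcard for channel, target and LUN.
        echo "- - -" > "$scan"
    done
}
```

Looping over host*/scan covers all HBA ports in one call, so the per-adapter repetition from comment #15 is not needed.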
Hi, the RHEL 5.2 release notes will be dropped to translation on April 15, 2008, at which point no further additions or revisions will be entertained. A mockup of the RHEL 5.2 release notes can be viewed at the following link: http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html Please use the aforementioned link to verify whether your bug is already in the release notes (if it needs to be). Each item in the release notes contains a link to its original bug; as such, you can search through the release notes by bug number. Cheers, Don