Red Hat Bugzilla – Bug 352001
I/O errors are thrown on FC storage lun not assigned to the host server.
Last modified: 2007-12-19 17:57:53 EST
Description of problem:
With a RHEL5.1 install on any PowerEdge server and an Emulex card (which uses
the lpfc driver), I/O errors are seen on a LUN that is not assigned.
The Naviagent service was installed on the server, the switch was zoned, storage
groups were formed on the storage side, and the host server was connected through
the Navisphere console to the storage group formed, but the LUNs were not
presented to the OS.
On rebooting the system, I/O errors were observed in /var/log/messages.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install RHEL5.1 with an Emulex card.
2. Connect the lp card to the CX storage box with a zoned switch connection.
3. Install Naviagent on the installed system.
4. Form storage groups on the storage and connect them to the host server
through the Naviagent service console, but do not assign LUNs to the host.
5. Reboot the system with the fibre channel connectivity.
Actual results:
1) I/O errors are seen on the storage lun which has not yet been assigned.
Expected results:
1) No I/O errors should be seen.
Additional info:
1) This issue has also been seen with RHEL-5 gold release i.e. kernel-2.6.18-
Created attachment 237251 [details]
/dev/sdb is the FC storage lun which has not been assigned to the host server
but still throws I/O errors.
I'm not entirely sure what Naviagent is, but from the sounds of the problem and
the messages in the log file, it certainly looks like a race condition in the
Naviagent software. The Emulex driver is actually working properly from what I
can see. It is getting an async notification of a new device on the fabric
(when the fabric came up, the device was there) and it adds the device to the
SCSI layer, the SCSI layer successfully gets an INQUIRY through to the device,
then it starts getting failures when it attempts to send the remaining commands
it normally sends during device scan (READ_CAPACITY and so on). Based on what
I've seen, and a rather limited knowledge of Naviagent, I would guess that once
the Naviagent software is brought up, it is possibly adding the devices, then
realizing they aren't exported to this machine and removing them, or something
like that. In the meantime, sometimes the Emulex driver notices the device
between the add/remove, and sometimes it doesn't, resulting in what you see.
In order to be any more help than this, I would need to know more about the
Naviagent software and its role in device discovery (or alternatively, someone
inside Red Hat that knows more about it would have to take over for me...Cc:ing
Tom Coughlan since he might know if someone else is knowledgeable in Naviagent).
Thanks for looking at this, Doug. I'll ask Wayne at EMC to take a look.
The errors are on LUNZ. This is a fake LUN that provides a path for in-band
communication with the Clariion controller:
Oct 17 19:44:31 aknode5 kernel: lpfc 0000:04:00.0: 0:1303 Link Up Event x1
received Data: x1 xf7 x10 x0
Oct 17 19:44:31 aknode5 kernel: Vendor: DGC Model: LUNZ
Oct 17 19:44:31 aknode5 kernel: Type: Direct-Access
ANSI SCSI revision: 04
Oct 17 19:44:31 aknode5 kernel: sdb : READ CAPACITY failed.
Oct 17 19:44:31 aknode5 kernel: sdb : status=1, message=00, host=0, driver=08
Oct 17 19:44:31 aknode5 kernel: sd: Current: sense key: Illegal Request
Oct 17 19:44:31 aknode5 kernel: Add. Sense: Logical unit not supported
This happens when the WWID of the HBA port is not properly registered with the
Clariion. This may also happen when there are no LUNs assigned.
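To confirm that a device throwing these errors is a LUNZ placeholder rather than a real LUN, the vendor and model strings reported at INQUIRY time can be checked. The following is a minimal sketch, not part of the original report: it assumes the kernel exposes those strings under /sys/block/<dev>/device/vendor and /sys/block/<dev>/device/model, and the function name find_lunz_devices is hypothetical.

```python
from pathlib import Path


def find_lunz_devices(sysfs_block="/sys/block"):
    """Return names of sd* block devices reporting vendor DGC and model LUNZ.

    Assumption: sysfs exposes the SCSI INQUIRY strings as
    <sysfs_block>/<dev>/device/vendor and <sysfs_block>/<dev>/device/model.
    """
    found = []
    for dev in sorted(Path(sysfs_block).glob("sd*")):
        try:
            # The strings are space-padded, so strip before comparing.
            vendor = (dev / "device" / "vendor").read_text().strip()
            model = (dev / "device" / "model").read_text().strip()
        except OSError:
            # Device vanished or attribute missing; skip it.
            continue
        if vendor == "DGC" and model == "LUNZ":
            found.append(dev.name)
    return found


if __name__ == "__main__":
    for name in find_lunz_devices():
        print(f"/dev/{name} is a LUNZ placeholder, not an assigned LUN")
```

If /dev/sdb shows up in this list, the READ CAPACITY failures logged against it match the LUNZ explanation above.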
Hey Wayne, any updates to this? Thanks!
This is expected behavior. As Tom pointed out, this is a fake LUN used for in-band
communications via sg() between the host (Naviagent) and the array. Once
LUNs are assigned to the storage group on the array, the LUNZ will no longer be