Description of problem:
SCSI-3 PR commands fail with an invalid host_byte value (0x17).

Version-Release number of selected component (if applicable):
6.0

How reproducible:
Consistently reproducible

Steps to Reproduce:
1. Do SCSI-3 PR registrations on the paths of a LUN.
2. Issue a SCSI-3 PR command to clear all registrations done on the LUN.
3. The command fails with host_byte set to 0x17.

Actual results:
The SCSI-3 PR clear command fails with host_byte set to 0x17.

Expected results:
The SCSI-3 PR OUT command should clear the registrations as expected. In case of failure, it should return with a valid host_byte value set. 0x17 seems to be an invalid value.

Additional info:
Detailed steps and logs of the issue follow, reproduced with the Veritas Multi-pathing product.

1) Take LUN hitachi_vsp0_090f, which has 4 paths:

[root@punb200m2labs01vm9 ~]# vxdmpadm getsubpaths dmpnodename=hitachi_vsp0_090f
NAME  STATE[A]    PATH-TYPE[M]  CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdee  ENABLED(A)  -             c4         Hitachi_VSP  hitachi_vsp0  -
sdeo  ENABLED(A)  -             c3         Hitachi_VSP  hitachi_vsp0  -
sdmb  DISABLED    -             c3         Hitachi_VSP  hitachi_vsp0  -
sdmg  ENABLED(A)  -             c4         Hitachi_VSP  hitachi_vsp0  -

2) The LUN's current PR status:

[root@punb200m2labs01vm9 ~]# vxdmppr read /dev/vx/rdmp/hitachi_vsp0_090f
KEY-TYPE  RES-TYPE  ASCII-KEY  HEX-VALUE           PRgeneration
-------------------------------------------------------------------------------
REG       -         CPGR0012   0x4350475230303132  0x119

3) Issue the clear command to clear the registrations:

[root@punb200m2labs01vm9 ~]# vxdmppr clear -r CPGR0012 /dev/vx/rdmp/hitachi_vsp0_090f; date
Fri Oct 14 03:40:31 IST 2011

4) All the paths went into the failed state:
[root@punb200m2labs01vm9 ~]# vxdmpadm getsubpaths dmpnodename=hitachi_vsp0_090f
NAME  STATE[A]  PATH-TYPE[M]  CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdee  DISABLED  -             c4         Hitachi_VSP  hitachi_vsp0  -
sdeo  DISABLED  -             c3         Hitachi_VSP  hitachi_vsp0  -
sdmb  DISABLED  -             c3         Hitachi_VSP  hitachi_vsp0  -
sdmg  DISABLED  -             c4         Hitachi_VSP  hitachi_vsp0  -

The reason is that the SCSI-3 PR OUT clear command failed with the host_byte field set to 0x17; msg_byte is set to 0. Valid host_byte values range from 0 to 0x11, so a host_byte of 0x17 is not expected.

VxVM syslog messages captured for additional info:

Oct 14 03:40:36 punb200m2labs01vm9 kernel: sd 3:0:0:69: reservation conflict
Oct 14 03:40:36 punb200m2labs01vm9 kernel: VxVM vxdmp V-5-3-0 dmp_recv_scsipkt: SCSI request failure host_byte = 0x17 msg_byte = 0x0
Oct 14 03:40:36 punb200m2labs01vm9 kernel:
Oct 14 03:40:36 punb200m2labs01vm9 kernel: VxVM vxdmp V-5-3-0 dmp_check_scsipkt: SCSI request failure host_byte = 0x17 msg_byte = 0x0 rq_status = 0x7
Oct 14 03:40:36 punb200m2labs01vm9 kernel:
Oct 14 03:40:36 punb200m2labs01vm9 kernel: VxVM vxdmp V-5-0-0 SCSI error opcode=0x5f returned rq_status=0x7 cdb_status=0x0 key=0x0 asc=0x0 ascq=0x0 on path 129/0x0
Oct 14 03:40:36 punb200m2labs01vm9 kernel:
Oct 14 03:40:36 punb200m2labs01vm9 kernel: VxVM vxdmp V-5-3-0 dmp_pr_send_cmd failed with transport error: uscsi_rqstatus = 7ret = -1 status = 0 on dev 129/0x0
Oct 14 03:40:36 punb200m2labs01vm9 kernel:
Oct 14 03:40:36 punb200m2labs01vm9 kernel: VxVM vxdmp V-5-0-112 disabled path 129/0x0 belonging to the dmpnode 201/0x120 due to path failure

The PR OUT clear command issued with the sg_persist utility fails for a similar reason:

[root@punb200m2labs01vm9 include]# sg_persist --out --clear --param-sark=4350475230303132 --verbose /dev/sdee
    inquiry cdb: 12 00 00 00 24 00
  HITACHI  OPEN-V  7002
  Peripheral device type: disk
    Persistent Reservation Out cmd: 5f 03 00 00 00 00 00 00 18 00
persistent reserve out: transport: Host_status=0x17 is invalid
Driver_status=0x00 [DRIVER_OK, SUGGEST_OK]
PR out: command failed

Here, too, the PR OUT command failed with host_status 0x17.
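
For readers following the sg_persist trace, the sketch below decodes the CDB bytes and the reservation key quoted in this report. It is only an illustration: the function names are hypothetical, and the service action table and the position of the parameter list length field (bytes 5 to 8, big-endian) reflect my reading of SPC-3 and should be checked against the standard.

```python
# Service action codes for PERSISTENT RESERVE OUT (assumed from SPC-3).
PROUT_SERVICE_ACTIONS = {
    0x00: "REGISTER",
    0x01: "RESERVE",
    0x02: "RELEASE",
    0x03: "CLEAR",
    0x04: "PREEMPT",
    0x05: "PREEMPT AND ABORT",
    0x06: "REGISTER AND IGNORE EXISTING KEY",
    0x07: "REGISTER AND MOVE",
}


def decode_prout_cdb(hex_string):
    """Decode a 10-byte PERSISTENT RESERVE OUT CDB given as hex text."""
    cdb = bytes.fromhex(hex_string)
    if len(cdb) != 10 or cdb[0] != 0x5F:
        raise ValueError("not a 10-byte PERSISTENT RESERVE OUT CDB")
    service_action = cdb[1] & 0x1F  # low 5 bits of byte 1
    return {
        "opcode": cdb[0],
        "service_action": PROUT_SERVICE_ACTIONS.get(service_action, "unknown"),
        # Bytes 5..8 hold the parameter list length (assumed SPC-3 layout).
        "parameter_list_length": int.from_bytes(cdb[5:9], "big"),
    }


def key_to_ascii(hex_key):
    """Render an 8-byte reservation key, given as hex text, as ASCII."""
    return bytes.fromhex(hex_key).decode("ascii")


if __name__ == "__main__":
    print(decode_prout_cdb("5f 03 00 00 00 00 00 00 18 00"))
    print(key_to_ascii("4350475230303132"))
```

Under these assumptions the CDB is a CLEAR service action with a 24-byte parameter list, and the key passed to sg_persist is the same CPGR0012 registration shown by vxdmppr, so the request itself looks well formed and the failure is in the returned status.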
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Hi,

Could you provide the full dmesg output of this system? In particular, I would like to know which driver has registered "sd 3:0:0:69".

The host_byte is set by the HBA driver, not by the general SCSI stack, so this is likely to be a driver error, if I understand it correctly.

Cheers,
Jes
Hi,

In addition, could you please provide the output of the following command from the troublesome system:

udevadm info -a -n /dev/vx/rdmp/hitachi_vsp0_090f | grep DRIVER

Thanks,
Jes
(In reply to comment #5)
> Hi,
>
> Could you provide the full dmesg output of this system? In particular
> I would like to know which driver has registered "sd 3:0:0:69"
>
> The host_byte is set by the hba driver, not by the general SCSI stack, so
> this is likely to be driver error if I understand it correctly.

That is not entirely correct. The host_byte can and will be manipulated by the SCSI mid layer, but any such manipulation should be reset (e.g. to DID_OK) before returning to the process that invoked the ioctl.

comment#0 refers to RHEL6.0. Is this issue reproducible on RHEL > 6.0?
We are preparing the testbed again to provide the necessary information. The issue was first seen with RHEL 6.0. We still need to check with releases later than 6.0.

-- mukesh bafna, Symantec
(In reply to comment #10)
> We are preparing testbed again to provide necessary information. Issue was
> first seen with RHEL 6.0. Need to check with > 6.0..

Time is running out for 6.4.
Sorry for the delay. The corresponding machine resources were released and it's taking time to reacquire them. We are working on it and hope to reply in a couple of days.
(In reply to comment #0)
> SCSI-3 PR clear command fails with host_byte value set to 0x17
...
> Reason being SCSI-3 PR OUT clear command failed with host_byte field set to
> 0x17. msg_byte is set to 0. Valid values the host_byte can contain are 0 to
> 0x11. 0x17 host_byte value is not expected.

(In reply to comment #5)
> The host_byte is set by the hba driver, not by the general SCSI stack, so
> this is likely to be driver error if I understand it correctly.

(In reply to comment #14)
> [root@punb200m2labs01vm7 ~]# udevadm info -a -n /dev/sdp | grep DRIVER
> DRIVER==""
> DRIVERS=="sd"
> DRIVERS==""
> DRIVERS==""
> DRIVERS==""
> DRIVERS=="fnic"
> DRIVERS=="pcieport"
...
> 3. We have hit this issue with RHEL6.0GA, RHEL-6.1. We need to setup
> resource and check for RHEL-6.2 and RHEL-6.3.

Chris,

Please take a look to see if the fnic driver can return host_byte set to 0x17 (and confirm that this is indeed not correct). Check for fixes in this area. (I think the last fnic update was in 6.1, but maybe the change is in common code?)

Tom
I'm suspecting that this may be a duplicate of bug#787282, in which case it would be fixed in kernel-2.6.32-231.el6. It's certainly a case of the bits from two different error codes being set, most likely either DID_TARGET_FAILURE or DID_NEXUS_FAILURE being set in the scsi_eh thread and a generic DID_ERROR being set in fnic.
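
To make the "bits from two different error codes" reasoning concrete, here is a small sketch of the arithmetic. The numeric values are my recollection of the host byte codes in the kernel's include/scsi/scsi.h for kernels of this era, so treat them as an assumption and verify them against the RHEL 6 source; the helper names are hypothetical.

```python
# Host byte codes (assumed values from include/scsi/scsi.h; verify locally).
DID_OK = 0x00
DID_ERROR = 0x07
DID_TARGET_FAILURE = 0x10
DID_NEXUS_FAILURE = 0x11
HIGHEST_DEFINED = DID_NEXUS_FAILURE  # the report gives 0..0x11 as the valid range


def is_defined_host_byte(value):
    """True if value falls inside the range of defined host byte codes."""
    return DID_OK <= value <= HIGHEST_DEFINED


def merged(first, second):
    """Model two host byte codes being ORed into the same field."""
    return first | second


if __name__ == "__main__":
    observed = 0x17
    candidates = (
        ("DID_TARGET_FAILURE", DID_TARGET_FAILURE),
        ("DID_NEXUS_FAILURE", DID_NEXUS_FAILURE),
    )
    for name, code in candidates:
        value = merged(code, DID_ERROR)
        print(f"{name} | DID_ERROR = {value:#04x}, matches observed: {value == observed}")
    print("0x17 is a defined code:", is_defined_host_byte(observed))
```

With these values, ORing DID_ERROR into either candidate yields 0x17, which lies outside the defined range. The arithmetic alone therefore cannot tell the two candidates apart; the "reservation conflict" message in the syslog is the only other hint in this report.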
I'm closing this, based on comment 17 "this may be a duplicate of bug#787282". Re-open if the problem is seen again with kernel >= kernel-2.6.32-231.el6. *** This bug has been marked as a duplicate of bug 787282 ***