Description of problem:
srp breaks and fails to connect to any targets under opensm 3.3.3-* on rhel5.? due to an upstream bug (since fixed upstream).

Version-Release number of selected component (if applicable):
opensm-3.3.3-1.el5

How reproducible:
Configure an srp target on ib and initiate with an earlier or later version of opensm. Switch to 3.3.3-*. scsi requests now fail to return.

Steps to Reproduce:
1. Use opensm 3.3.3
2. srp_daemon scans ib for available targets
3. Timeouts

Actual results:
2010-10-18T15:42:36.001831-04:00 xio67 kernel: scsi1 : SRP.T10:50001FF500050208
2010-10-18T15:42:41.502644-04:00 xio67 kernel: host1: SRP abort called
2010-10-18T15:42:46.501661-04:00 xio67 kernel: host1: SRP reset_device called
2010-10-18T15:42:51.500682-04:00 xio67 kernel: host1: ib_srp: SRP reset_host called state 0 qp_err 0
2010-10-18T15:43:11.501750-04:00 xio67 kernel: host1: SRP abort called
2010-10-18T15:43:16.500763-04:00 xio67 kernel: host1: SRP reset_device called
2010-10-18T15:43:21.500780-04:00 xio67 kernel: host1: ib_srp: SRP reset_host called state 0 qp_err 0
2010-10-18T15:43:31.501804-04:00 xio67 kernel: scsi 1:0:0:0: scsi: Device offlined - not ready after error recovery
2010-10-18T15:43:31.501813-04:00 xio67 kernel: scsi 1:0:0:0: timing out command, waited 22s

Expected results:
2010-10-19T16:36:11.833770-04:00 xio67 kernel: scsi1 : SRP.T10:50001FF500050208
2010-10-19T16:36:11.835364-04:00 xio67 kernel: Vendor: IBM Model: DCS9900 Rev: 6.05
2010-10-19T16:36:11.835373-04:00 xio67 kernel: Type: Direct-Access ANSI SCSI revision: 05
2010-10-19T16:36:11.835722-04:00 xio67 kernel: sdb : very big device. try to use READ CAPACITY(16).
2010-10-19T16:36:11.835844-04:00 xio67 kernel: SCSI device sdb: 15627665408 512-byte hdwr sectors (8001365 MB)
2010-10-19T16:36:11.835996-04:00 xio67 kernel: sdb: Write Protect is off
2010-10-19T16:36:11.836202-04:00 xio67 kernel: SCSI device sdb: drive cache: write back w/ FUA
2010-10-19T16:36:11.836429-04:00 xio67 kernel: sdb : very big device. try to use READ CAPACITY(16).
2010-10-19T16:36:11.836534-04:00 xio67 kernel: SCSI device sdb: 15627665408 512-byte hdwr sectors (8001365 MB)
2010-10-19T16:36:11.836639-04:00 xio67 kernel: sdb: Write Protect is off
2010-10-19T16:36:11.836851-04:00 xio67 kernel: SCSI device sdb: drive cache: write back w/ FUA
2010-10-19T16:36:11.854145-04:00 xio67 kernel: sdb: unknown partition table

Additional info:
This was broken in upstream patch 3d20f82edd3246879063b77721d0bcef927bdc48 for opensm 3.3.3. It has since been patched in post-12/16/09 versions. This forces us to run later hand-built versions of opensm. It only appears to break srp calls and some other minor ib traffic. The fix is to include patch 5201f84* from opensm 3.3.5 and later. (ok patch is really 520af849615e7ee603b96498da9f3bc554470c06 but, you know. ^->)
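
For anyone triaging a host that may be hitting this, the failure signature in the logs above can be checked mechanically. The following is only a rough sketch, not part of opensm or srptools: `find_srp_failures` and the sample lists are hypothetical names, and it assumes syslog lines carry the kernel's unwrapped message text (for example "Device offlined"). It reports the SCSI addresses that were offlined after an SRP abort/reset escalation.

```python
import re

# Hypothetical helper for illustration; not part of opensm or srptools.
# Messages logged by ib_srp while error recovery escalates.
ESCALATION = (
    "SRP abort called",
    "SRP reset_device called",
    "SRP reset_host called",
)
# SCSI mid-layer message emitted once recovery has given up on the device.
OFFLINED = re.compile(r"scsi (\d+:\d+:\d+:\d+): scsi: Device offlined")


def find_srp_failures(lines):
    """Return SCSI addresses offlined after SRP error-recovery messages."""
    saw_escalation = False
    offlined = []
    for line in lines:
        if any(marker in line for marker in ESCALATION):
            saw_escalation = True
        match = OFFLINED.search(line)
        if match and saw_escalation:
            offlined.append(match.group(1))
    return offlined


# Sample input taken from the "Actual results" log in this report.
failing = [
    "2010-10-18T15:42:41.502644-04:00 xio67 kernel: host1: SRP abort called",
    "2010-10-18T15:42:46.501661-04:00 xio67 kernel: host1: SRP reset_device called",
    "2010-10-18T15:42:51.500682-04:00 xio67 kernel: host1: ib_srp: SRP reset_host called state 0 qp_err 0",
    "2010-10-18T15:43:31.501804-04:00 xio67 kernel: scsi 1:0:0:0: scsi: Device offlined - not ready after error recovery",
]
# Sample input taken from the "Expected results" log in this report.
working = [
    "2010-10-19T16:36:11.833770-04:00 xio67 kernel: scsi1 : SRP.T10:50001FF500050208",
    "2010-10-19T16:36:11.835996-04:00 xio67 kernel: sdb: Write Protect is off",
]

print(find_srp_failures(failing))
print(find_srp_failures(working))
```

A non-empty result only shows that the log matches the pattern seen here; it does not by itself prove that opensm 3.3.3 is the cause.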
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0969.html