Description of problem:
We have at least 3 customers now having devices taken offline due to sdxx->medium_access_timed_out being incremented in sd_eh_action() when the zfcp driver goes through error handling. The result is that after recovery we won't have the multipath paths re-enabled, and we are exposed to losing complete device access if the surviving path fails.

Version-Release number of selected component (if applicable):
Seen on RHEL 6.6+ kernels; likely also an issue in earlier kernels.

How reproducible:
During fabric events and recovery, the sequence of recovery events leads to medium_access_timed_out++, and when it exceeds 2 (the default for max_medium_access_timeouts) we take the devices offline. We had a similar issue with the fnic driver, fixed by Cisco in BZ 1341298, and I was concerned that the zfcp driver may be going through the same sequence, so I reached out to IBM.

Steps to Reproduce:
1. System is running
2. Fabric events happen
3. We lose the device

Actual results:
Devices are taken offline when they are actually still accessible.

Expected results:
We do not take devices offline.

Additional info:
This is a tough problem to solve with zfcp changes, as detailed below in the response from Benjamin Block at IBM. For now we are going to suggest setting max_medium_access_timeouts to a high value to avoid false disconnects.
From the customer's log:

root@xxxxxx:PROD:~> zcat /var/log/messages-20160815.gz | grep sdy
Aug 15 00:32:42 xxxxxx kernel: sd 0:0:1:4: [sdy] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
Aug 15 00:32:42 xxxxxx kernel: sd 0:0:1:4: [sdy] CDB: Read(10): 28 00 03 78 ea c0 00 00 20 00
Aug 15 00:32:42 xxxxxx kernel: end_request: I/O error, dev sdy, sector 58256064
Aug 15 00:52:17 xxxxxx multipathd: mpathm: sdy - tur checker reports path is up
root@nzvmds728:PROD:~>

The key here is that we receive DID_TIME_OUT, so we offline the disk:

1360 static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp)
1361 {
1362         struct scsi_disk *sdkp = scsi_disk(scmd->request->rq_disk);
1363
1364         if (!scsi_device_online(scmd->device) ||
1365             !scsi_medium_access_command(scmd) ||
1366             host_byte(scmd->result) != DID_TIME_OUT ||
1367             eh_disp != SUCCESS)
1368                 return eh_disp;
1369
1370         /*
1371          * The device has timed out executing a medium access command.
1372          * However, the TEST UNIT READY command sent during error
1373          * handling completed successfully. Either the device is in the
1374          * process of recovering or has it suffered an internal failure
1375          * that prevents access to the storage medium.
1376          */
1377         sdkp->medium_access_timed_out++;
1378
1379         /*
1380          * If the device keeps failing read/write commands but TEST UNIT
1381          * READY always completes successfully we assume that medium
1382          * access is no longer possible and take the device offline.
1383          */
1384         if (sdkp->medium_access_timed_out >= sdkp->max_medium_access_timeouts) {
1385                 scmd_printk(KERN_ERR, scmd,
1386                             "Medium access timeout failure. Offlining disk!\n");   <<----------
1387                 scsi_device_set_state(scmd->device, SDEV_OFFLINE);
1388
1389                 return FAILED;
1390         }
1391
1392         return eh_disp;
1393 }

Response from Benjamin Block @ IBM:

Hello Laurence, here is a small update on the problems your customers see. I removed some parts of the history so it doesn't get too long.
On 00:21 Wed 24 Aug, Laurence Oberman wrote:
> ----- Original Message -----
> > From: "Steffen Maier" <maier.ibm.com>
> > To: "Laurence Oberman" <loberman>
> > Cc: "Benjamin Block" <bblock.ibm.com>
> > Sent: Wednesday, June 22, 2016 10:20:07 AM
> > Subject: Re: Issue seen with zfcp seems to match the known issue with the fnic driver which was not returning
> > DID_ABORT
> >
> > Hi Laurence,
> >
> > On 06/20/2016 09:13 PM, Laurence Oberman wrote:
> > > I have a customer using the zfcp driver in RHEL7 and they are seeing
[:snip:]
> >
> > We are aware of a few zfcp bugs regarding recovery and we're almost done
> > fixing them, so stay tuned:
> >
> > 1)
> > Race in blocking fc_rport on fabric RSCN unnecessarily causing and
> > potentially escalating scsi_eh.
> > I suspect this to also erroneously trigger sd's medium access control
> > because a TUR might succeed but a subsequent I/O command might again
> > fail (with DID_TIME_OUT) due to the race (fooling fc_timed_out()).
> > This might be related to (where we did not yet know that it's in zfcp)
> > LTC bug 129581 / RH bug 1258680
> > "RHEL6.7 - I/O lockup on FS or dm layer after a few target port cable
> > pull iterations"

We have since then fixed the bug mentioned above by Steffen. Martin K. Petersen accepted those fixes for 4.9. This should ease the situation with the medium access timeout, but with recent discoveries in our code, we are not sure whether it fixes all the problems we currently see. The medium access timeouts themselves are still buggy in quite a few other ways; see below for that.

> >
> > 2)
> > Use-after-free for lun and target reset TMF causing kernel panic in
> > response handler path.

We have a working fix for this too, but haven't yet had time to fully review it.

> >
> > > We had a similar issue with the fnic driver and recently Cisco has
> > > addressed this with a patch.
> > > See below.
> > >
> > > I am wondering if we need to also address this in the zfcp driver.
[:snip:]
> > > }
> > > }
> >
> > So this looks like DID_TRANSPORT_DISRUPTED is also a case which does not
> > lead to an error (handling) result, so maybe zfcp is good (enough) here.
> >
> > What do you think?

Like Steffen said in his last mail, we still think that returning commands with DID_TRANSPORT_DISRUPTED doesn't cause the issues you/your customers see here.

> We have another 2 customers seeing this now where fabric aborts on zfcp lead to the
> "Medium access timeout failure. Offlining disk!" issue.
>
> This makes me wonder if we should look into changing the response in the zfcp driver.
>
> Aug 15 00:32:42 xxxxxxx kernel: sd 0:0:1:4: [sdy] Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
> Aug 15 00:32:42 xxxxxxx kernel: sd 0:0:1:4: [sdy] CDB: Read(10): 28 00 03 78 ea c0 00 00 20 00
> Aug 15 00:32:42 xxxxxxx kernel: end_request: I/O error, dev sdy, sector 58256064
> Aug 15 00:52:17 xxxxxxx multipathd: mpathm: sdy - tur checker reports path is up
> root@nzvmds728ROD:~>
>
> The key here is that we receive DID_TIME_OUT so we offline the disk
> [:snip:]

So receiving DID_TIME_OUT for all commands that in fact did time out is really not the problem here, but rather how the SD code counts it. While it is not yet clear whether we sometimes cause those timeouts ourselves - by causing race conditions like the one Steffen described in bug (1) - the handling in SD is still bad. Let's have a short look (the source code shown is from a RHEL 7.2 kernel).

Let's assume SCSI EH starts with 2 or more commands for a single sdev that timed out - for whatever reason. In the case of zfcp - where we implement the different EH hooks for aborts, resets and the like manually and don't use the general eh_strategy_handler() hook - EH will call scsi_unjam_host() for the host in question:

2162 int scsi_error_handler(void *data)
2163 {
....
2205         if (shost->transportt->eh_strategy_handler)
2206                 shost->transportt->eh_strategy_handler(shost);
2207         else
2208                 scsi_unjam_host(shost);
....
2229 }

In scsi_unjam_host() it will put all pending commands into its own eh_work_q. Then it will gather sense data for each of those commands that contains a Check Condition - but commands that timed out don't, so getting sense will be a no-op here:

2131 static void scsi_unjam_host(struct Scsi_Host *shost)
2132 {
....
2143         if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
2144                 if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
2145                         scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
....
2152 }

EH will next try to abort all these commands in scsi_eh_abort_cmds(). This leads to zfcp's abort hook zfcp_scsi_eh_abort_handler() being called for each command, in the effort to abort them. For this example, let's assume that all those aborts succeed - it might be that the commands that timed out got lost or something, but the storage is fine otherwise, so new SCSI commands succeed as they should. All those commands are then put into the local check_list, which is given to scsi_eh_test_devices() to be tested:

1315 static int scsi_eh_abort_cmds(struct list_head *work_q,
1316                               struct list_head *done_q)
1317 {
1318         struct scsi_cmnd *scmd, *next;
1319         LIST_HEAD(check_list);
....
1323         list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
....
1338                 rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
....
1347                 scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD;
1348                 if (rtn == FAST_IO_FAIL)
1349                         scsi_eh_finish_cmd(scmd, done_q);
1350                 else
1351                         list_move_tail(&scmd->eh_entry, &check_list);
1352         }
1353
1354         return scsi_eh_test_devices(&check_list, work_q, done_q, 0);
1355 }

Because all commands in work_q got successfully aborted, work_q will be empty, and check_list contains all the commands that were previously in that list. The crux of this medium access timeout behaviour is now in scsi_eh_test_devices().
1260 static int scsi_eh_test_devices(struct list_head *cmd_list,
1261                                 struct list_head *work_q,
1262                                 struct list_head *done_q, int try_stu)
1263 {
1264         struct scsi_cmnd *scmd, *next;
1265         struct scsi_device *sdev;
1266         int finish_cmds;
1267
1268         while (!list_empty(cmd_list)) {
1269                 scmd = list_entry(cmd_list->next, struct scsi_cmnd, eh_entry);
1270                 sdev = scmd->device;
1271
1272                 if (!try_stu) {
1273                         if (scsi_host_eh_past_deadline(sdev->host)) {
1274                                 /* Push items back onto work_q */
1275                                 list_splice_init(cmd_list, work_q);
1276                                 SCSI_LOG_ERROR_RECOVERY(3,
1277                                         sdev_printk(KERN_INFO, sdev,
1278                                                     "%s: skip test device, past eh deadline",
1279                                                     current->comm));
1280                                 break;
1281                         }
1282                 }
1283
1284                 finish_cmds = !scsi_device_online(scmd->device) ||
1285                               (try_stu && !scsi_eh_try_stu(scmd) &&
1286                                !scsi_eh_tur(scmd)) ||
1287                               !scsi_eh_tur(scmd);
1288
1289                 list_for_each_entry_safe(scmd, next, cmd_list, eh_entry)
1290                         if (scmd->device == sdev) {
1291                                 if (finish_cmds &&
1292                                     (try_stu ||
1293                                      scsi_eh_action(scmd, SUCCESS) == SUCCESS))
1294                                         scsi_eh_finish_cmd(scmd, done_q);
1295                                 else
1296                                         list_move_tail(&scmd->eh_entry, work_q);
1297                         }
1298         }
1299         return list_empty(work_q);
1300 }

So, we iterate over all SCSI commands for that host that previously got successfully aborted and, before that, timed out. In line 1284, finish_cmds will become 1 because scsi_eh_tur() will succeed - like I said before, let's assume the storage healed itself via recovery or something else happened, and it is now working just fine. The loop in line 1289 goes over all remaining commands in cmd_list, and for each command belonging to the same sdev as the one whose TUR just succeeded (a typical sign that the sdev is working fine), it calls scsi_eh_action(). This is the point where we reach into the scsi-disk driver.
For SD, scsi_eh_action() will call the function sd_eh_action():

1574 static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp)
1575 {
1576         struct scsi_disk *sdkp = scsi_disk(scmd->request->rq_disk);
1577
1578         if (!scsi_device_online(scmd->device) ||
1579             !scsi_medium_access_command(scmd) ||
1580             host_byte(scmd->result) != DID_TIME_OUT ||
1581             eh_disp != SUCCESS)
1582                 return eh_disp;
1583
1584         /*
1585          * The device has timed out executing a medium access command.
1586          * However, the TEST UNIT READY command sent during error
1587          * handling completed successfully. Either the device is in the
1588          * process of recovering or has it suffered an internal failure
1589          * that prevents access to the storage medium.
1590          */
1591         sdkp->medium_access_timed_out++;
1592
1593         /*
1594          * If the device keeps failing read/write commands but TEST UNIT
1595          * READY always completes successfully we assume that medium
1596          * access is no longer possible and take the device offline.
1597          */
1598         if (sdkp->medium_access_timed_out >= sdkp->max_medium_access_timeouts) {
1599                 scmd_printk(KERN_ERR, scmd,
1600                             "Medium access timeout failure. Offlining disk!\n");
1601                 scsi_device_set_state(scmd->device, SDEV_OFFLINE);
1602
1603                 return FAILED;
1604         }
1605
1606         return eh_disp;
1607 }

This is the one function you already looked at. The argument eh_disp is SUCCESS, and let's assume that medium_access_timed_out is zero at the start of this overall SCSI EH run. In line 1578 we have some tests: the disk is online (the TUR worked before), the command in question is an I/O command, it did run into a timeout, and eh_disp is SUCCESS. So none of the conditions holds and we do not take the early-out path. The function will therefore increase the medium access timeout counter for each (!!!) of the SCSI commands that are currently in EH - which of course leads to the error you see in the kernel message buffer.
And more: because it will return FAILED for at least one command, this command will be put back into the work_q in the calling function scsi_eh_test_devices(), and thus cause SCSI EH to escalate to more severe steps - like device reset. And this although EH just healed the state of its devices and everything is working fine (and given how strangely some storage servers react to device and/or bus reset TMFs, this can cause the situation to escalate into even more commands running into timeouts and bad responses - I have seen this live already).

So this semantic here is plain wrong, if you ask me. We are in a single EH run because something in the path towards the storage had a hiccup and (at least) 2 I/O commands for a single sdev ran into a timeout. This is really nothing special; it's annoying for the workload, and it should not happen in an ideal world, but we cannot guarantee that. Yet with the code flow I described above, this will immediately lead SD to disable that disk without any chance of automatic recovery, ever (the operator has to intervene).

So yeah, we are still not sure whether we in zfcp can do more to prevent commands from even running into this timeout situation - there might, even with the patch for problem (1) above, still be situations where we make the midlayer run into a timeout although we know that the cable is pulled or just got re-plugged - but even then this behaviour here would still be bad. Like I said before, with the patch for problem (1), this should get better.

> I am wondering if we should increase the medium_access_timed_out count
> to a high number as a workaround here until we hear back from IBM.
>
> The default is 2
>
> 12:0:0:1]# cat max_medium_access_timeouts
> 2
>
> #!/bin/bash
> cd /sys/block
> for i in sd*/device/scsi_disk/*
> do

You can just iterate over /sys/class/scsi_disk/* for all SCSI disks.
> cat $i/max_medium_access_timeouts
> echo 5 > $i/max_medium_access_timeouts
> cat $i/max_medium_access_timeouts
> done
>

If the code stays as it is right now, increasing the timeout can help, but it doesn't fix the overall problem. After finding this behaviour, we also already recommended increasing the timeout to a customer that ran into a similar problem. But 5 is still low for what we are talking about: it only means that 5 commands instead of 2, as in my example above, have to time out before the disk is offlined. Also, if you want to make such changes persistent, you are better off using a udev rule to adapt the value of max_medium_access_timeouts; maybe something like this:

ACTION=="add", SUBSYSTEM=="scsi_disk", ATTR{max_medium_access_timeouts}:="4294967295"

But please make sure that the customers understand that this is only a workaround. It will not prevent SCSI EH from happening in these scenarios and thus degrading the running workload. But it should prevent SD from permanently disabling disks, and SCSI EH from escalating, when this is really not necessary.

I hope this helps you a bit.

Beste Grüße / Best regards,
 - Benjamin Block

> > Perhaps IBM can attempt to reproduce in-house by causing fabric
> > events, as I don't have access to zfcp and S390 here for this sort of
> > reproducer.

--
Linux on z Systems Development / IBM Systems & Technology Group
IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Martina Koederitz / Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
This is very similar to BZ #1182838, where a switch firmware rolling update caused path flapping (see also linked customer case 01321891). In that BZ/case, the scsi TUR path checker was also in use, and during the firmware update TURs issued by multipathd were succeeding but subsequent I/O was failing - i.e. we had failover/failback flapping every checker_interval seconds. The bug is that TUR should NOT have been succeeding (because the device is not ready, even though it is technically accessible). The workaround/solution was to change to the directio path checker, and the case was resolved.

In the case attached to this BZ, if TURs had not been succeeding during the firmware update, the paths would have remained failed until they returned after the upgrade; hence we would not have tripped max_medium_access_timeouts, and hence the scsi error handler would not have eventually offlined the devices ... which is when everything went downhill, with manual intervention required to recover. So it seems an alternative solution/workaround here would be to change to path_checker = directio in multipath.conf to avoid the TUR issue.
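If that route is taken, the change is a one-line path_checker setting. A minimal multipath.conf fragment might look like the following (a sketch only - whether to set it globally in defaults or per-device, and what else belongs in the file, depends on the site's existing configuration):

```
defaults {
        path_checker    directio
}
```

After editing, multipathd needs to re-read its configuration for the new checker to take effect.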
Test kernel with patch to SCSI error handling available for testing at:

http://people.redhat.com/emilne/RPMS/.bz1370212/

Contains the following change:

commit 8ab5d0046f69034fb7f74abc25f209262d2098c1
Author: Ewan D. Milne <emilne>
Date:   Fri Sep 23 09:50:12 2016 -0400

    scsi_error: count medium access timeout only once per EH run

    The current medium access timeout counter will be increased for each
    command, so if there are enough failed commands we'll hit the medium
    access timeout for even a single failure. Fix this by making the
    timeout per EH run, ie the counter will only be increased once per
    device and EH run.

    Signed-off-by: Hannes Reinecke <hare>

    (Modified for RHEL6 -- KABI changes, also changed to add argument to
    scsi_eh_action, scsi_driver.eh_action, and sd_eh_action instead of
    overloading the existing eh_disp argument with a reset flag.)

    Signed-off-by: Ewan D. Milne <emilne>

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 047cc20..d3e4550 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -50,6 +50,7 @@
 #define HOST_RESET_SETTLE_TIME  (10)

 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn, bool reset);

 /* called with shost->host_lock held */
 void scsi_eh_wakeup(struct Scsi_Host *shost)
@@ -130,6 +131,7 @@ int scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)

 	ret = 1;
 	scmd->eh_eflags |= eh_flag;
+	scsi_eh_action(scmd, 0, 1);
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
 	scsi_eh_wakeup(shost);
@@ -975,12 +977,12 @@ static int scsi_request_sense(struct scsi_cmnd *scmd)
 	return scsi_send_eh_cmnd(scmd, NULL, 0, scmd->device->eh_timeout, ~0);
 }

-static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
+static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn, bool reset)
 {
 	if (scmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
 		if (sdrv->eh_action)
-			rtn = sdrv->eh_action(scmd, rtn);
+			rtn = sdrv->eh_action(scmd, rtn, reset);
 	}
 	return rtn;
 }
@@ -1155,7 +1157,7 @@ static int scsi_eh_test_devices(struct list_head *cmd_list,
 			if (scmd->device == sdev) {
 				if (finish_cmds &&
 				    (try_stu ||
-				     scsi_eh_action(scmd, SUCCESS) == SUCCESS))
+				     scsi_eh_action(scmd, SUCCESS, 0) == SUCCESS))
 					scsi_eh_finish_cmd(scmd, done_q);
 				else
 					list_move_tail(&scmd->eh_entry, work_q);
@@ -1289,7 +1291,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 			list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
 				if (scmd->device == sdev &&
-				    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
+				    scsi_eh_action(scmd, SUCCESS, 0) == SUCCESS)
 					scsi_eh_finish_cmd(scmd, done_q);
 			}
 		}
@@ -1353,7 +1355,7 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 		list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
 			if (scmd->device == sdev &&
-			    scsi_eh_action(scmd, rtn) != FAILED)
+			    scsi_eh_action(scmd, rtn, 0) != FAILED)
 				scsi_eh_finish_cmd(scmd, done_q);
 		}

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f812367..d6ed528 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -105,7 +105,7 @@
 static int sd_suspend(struct device *, pm_message_t state);
 static int sd_resume(struct device *);
 static void sd_rescan(struct device *);
 static int sd_done(struct scsi_cmnd *);
-static int sd_eh_action(struct scsi_cmnd *, int);
+static int sd_eh_action(struct scsi_cmnd *, int, bool);
 static void sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer);
 static void scsi_disk_release(struct device *cdev);
 static void sd_print_sense_hdr(struct scsi_disk *, struct scsi_sense_hdr *);
@@ -1349,6 +1349,7 @@ static const struct block_device_operations sd_fops = {
  * sd_eh_action - error handling callback
  * @scmd:	sd-issued command that has failed
  * @eh_disp:	The recovery disposition suggested by the midlayer
+ * @reset:	Reset the medium access timed out increment flag
  *
  * This function is called by the SCSI midlayer upon completion of an
  * error test command (currently TEST UNIT READY). The result of sending
@@ -1357,10 +1358,14 @@ static const struct block_device_operations sd_fops = {
  * test unit ready (so wrongly see the device as having a successful
  * recovery)
  **/
-static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp)
+static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp, bool reset)
 {
 	struct scsi_disk *sdkp = scsi_disk(scmd->request->rq_disk);

+	if (reset) {
+		sdkp->medium_access_reset = 0;
+		return eh_disp;
+	}
 	if (!scsi_device_online(scmd->device) ||
 	    !scsi_medium_access_command(scmd) ||
 	    host_byte(scmd->result) != DID_TIME_OUT ||
@@ -1374,7 +1379,10 @@ static int sd_eh_action(struct scsi_cmnd *scmd, int eh_disp)
 	 * process of recovering or has it suffered an internal failure
 	 * that prevents access to the storage medium.
 	 */
-	sdkp->medium_access_timed_out++;
+	if (!sdkp->medium_access_reset) {
+		sdkp->medium_access_timed_out++;
+		sdkp->medium_access_reset = 1;
+	}

 	/*
 	 * If the device keeps failing read/write commands but TEST UNIT

diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index ebf68e3..c7c7434 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -90,6 +90,7 @@ struct scsi_disk {
 	unsigned	lbpvpd : 1;
 #ifndef __GENKSYMS__
 	unsigned	cache_override : 1; /* temp override of WCE,RCD */
+	unsigned	medium_access_reset : 1;
 #endif
 };
 #define to_scsi_disk(obj) container_of(obj,struct scsi_disk,dev)

diff --git a/include/scsi/scsi_driver.h b/include/scsi/scsi_driver.h
index 20fdfc2..e1dd47a 100644
--- a/include/scsi/scsi_driver.h
+++ b/include/scsi/scsi_driver.h
@@ -16,7 +16,7 @@ struct scsi_driver {
 	void (*rescan)(struct device *);
 	int (*done)(struct scsi_cmnd *);
-	int (*eh_action)(struct scsi_cmnd *, int);
+	int (*eh_action)(struct scsi_cmnd *, int, bool);
 };
 #define to_scsi_driver(drv) \
 	container_of((drv), struct scsi_driver, gendrv)
Many thanks, Ewan. I have offered this to the customer to test for us.
Thanks. Let me know how it works out. We have had a bit of discussion, and I think there are other issues that this does not fix, but I would rather fix 95% of the problem for the customer now and worry about other aspects later. This is a -660 kernel, and as far as I can tell it does not have any recent zfcp fixes; we are going to need a separate BZ to track those if we don't have one already.
Understood. I will open a new BZ and track the coming zfcp changes when they show up upstream. Steffen did not say when they would be coming out; he just said to keep an eye out. When they show up, I will open the BZ to get them back-ported, as it would appear they are important for the general stability of the zfcp driver. This will actually have to be 2 BZs to track the zfcp changes: one for 6.9 and one for 7.2+. Thanks!!
Patch was missing a line.

diff -Nurp linux-2.6.32-573.18.1.el6.orig/drivers/scsi/libfc/fc_exch.c linux-2.6.32-573.18.1.el6/drivers/scsi/libfc/fc_exch.c
--- linux-2.6.32-573.18.1.el6.orig/drivers/scsi/libfc/fc_exch.c	2016-01-06 10:15:32.000000000 -0500
+++ linux-2.6.32-573.18.1.el6/drivers/scsi/libfc/fc_exch.c	2016-10-12 20:33:54.558469871 -0400
@@ -815,14 +815,19 @@ err:
  * EM is selected when a NULL match function pointer is encountered
  * or when a call to a match function returns true.
  */
-static inline struct fc_exch *fc_exch_alloc(struct fc_lport *lport,
-					    struct fc_frame *fp)
+static struct fc_exch *fc_exch_alloc(struct fc_lport *lport,
+				     struct fc_frame *fp)
 {
 	struct fc_exch_mgr_anchor *ema;
+	struct fc_exch *ep;

-	list_for_each_entry(ema, &lport->ema_list, ema_list)
-		if (!ema->match || ema->match(fp))
-			return fc_exch_em_alloc(lport, ema->mp);
+	list_for_each_entry(ema, &lport->ema_list, ema_list) {
+		if (!ema->match || ema->match(fp)) {
+			ep = fc_exch_em_alloc(lport, ema->mp);
+			if (ep)
+				return ep;
+		}
+	}
 	return NULL;
 }

diff -Nurp linux-2.6.32-573.18.1.el6.orig/include/linux/netdevice.h linux-2.6.32-573.18.1.el6/include/linux/netdevice.h
--- linux-2.6.32-573.18.1.el6.orig/include/linux/netdevice.h	2016-01-06 10:15:59.000000000 -0500
+++ linux-2.6.32-573.18.1.el6/include/linux/netdevice.h	2016-10-12 20:08:52.828043196 -0400
@@ -1103,6 +1103,10 @@ struct net_device

 #define NETIF_F_ALL_TSO	(NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)

+#define NETIF_F_ALL_FCOE	(NETIF_F_FCOE_CRC | NETIF_F_FCOE_MTU | \
+				 NETIF_F_FSO)
+
+
 /*
  * If one device supports one of these features, then enable them
  * for all in netdev_increment_features.

diff -Nurp linux-2.6.32-573.18.1.el6.orig/include/scsi/fc_frame.h linux-2.6.32-573.18.1.el6/include/scsi/fc_frame.h
--- linux-2.6.32-573.18.1.el6.orig/include/scsi/fc_frame.h	2016-01-06 10:15:10.000000000 -0500
+++ linux-2.6.32-573.18.1.el6/include/scsi/fc_frame.h	2016-10-12 20:08:52.829043197 -0400
@@ -137,6 +137,8 @@ static inline struct fc_frame *fc_frame_
 		fp = fc_frame_alloc_fill(dev, len);
 	else
 		fp = _fc_frame_alloc(len);
+	if(!fp)
+		printk("RHDEBUG: In fcp_frame_alloc, we returned fp = %p\n",fp);
 	return fp;
 }

diff -Nurp linux-2.6.32-573.18.1.el6.orig/net/8021q/vlan_dev.c linux-2.6.32-573.18.1.el6/net/8021q/vlan_dev.c
--- linux-2.6.32-573.18.1.el6.orig/net/8021q/vlan_dev.c	2016-01-06 10:15:58.000000000 -0500
+++ linux-2.6.32-573.18.1.el6/net/8021q/vlan_dev.c	2016-10-12 20:08:52.829043197 -0400
@@ -522,7 +522,9 @@ static int vlan_dev_init(struct net_devi
 	netdev_extended(dev)->hw_features =
 			NETIF_F_ALL_CSUM | NETIF_F_SG |
 			NETIF_F_FRAGLIST | NETIF_F_ALL_TSO |
-			NETIF_F_HIGHDMA | NETIF_F_SCTP_CSUM;
+			NETIF_F_HIGHDMA | NETIF_F_SCTP_CSUM |
+			NETIF_F_ALL_FCOE;
+
 	dev->features |= real_dev->vlan_features | NETIF_F_LLTX;
 	dev->gso_max_size = real_dev->gso_max_size;

[loberman@dhcp-33-21 SOURCES]$
Yes, mistake. 26 is for another BZ. Thanks for catching that.

Yes, build from source.

Ignore comment 26.
(In reply to loberman from comment #29)
> Yes mistake. 26 is for another BZ.
> Thanks for catching that.
>
> Yes build from source.
>
> Ignore comment 26

Thanks Laurence! Let me build the test kernel for all arches with the patch in comment 17.
I have put an s390x version of the test kernel with patch to SCSI error handling in: http://people.redhat.com/emilne/RPMS/.bz1370212/
> Hi Ewan, I have got an update from the customer that they could test a kernel for
> an s390 system. The link in comment#17 shows a test kernel for the x86_64 arch only.
> Could you please let me know if there is a test kernel available for the s390 arch
> that I can share with the customer.
>
> Thanks,
> Milan.
>
> (In reply to Ewan D. Milne from comment #33)
> > I have put an s390x version of the test kernel with patch to SCSI error
> > handling in:
> >
> > http://people.redhat.com/emilne/RPMS/.bz1370212/

Is there any update on whether the customer was able to test this? We are past the 6.9 deadline at this point.
Upstream has a fix forthcoming:

https://marc.info/?l=linux-scsi&m=148827743226480&w=2

Regards,
Laurence
Per discussion w/support, closing as WONTFIX. There is a workaround available, which is to increase max_medium_access_timeouts to a large value (see KB article https://access.redhat.com/site/solutions/2575901). It appears the problem may no longer be occurring, possibly due to fixes on the array side, as we are no longer receiving reports of it.