Description of problem:
Sending multiple SCSI inquiry commands to a disk attached via a QLogic Fibre Channel card eventually causes the SCSI inquiry command to hang in wait_for_completion. This has only been seen on IA64 and x86_64 systems; I have been unable to reproduce it on IA32 systems.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
Running the following results in a number of hung scsiinfo processes after a few seconds (where sdc is a Fibre Channel disk connected via a QLogic card):

    while :; do scsiinfo -i /dev/sdc & done

Additional info:
Stack trace of a hung scsiinfo process (IA64):

Stack traceback for pid 11048
0xe000004074b18000 11048 8704 0 1 D 0xe000004074b18490 scsiinfo
0xa00000010008cfb0 schedule+0xf70
        args (0xe000004070a167f8, 0xe0000040443e3800, 0xe000000039df0000, 0xe000004070a1677c, 0xe0000040443e3830)
        kernel 0xa00000010008c040 0xa00000010008d8a0
0xa00000010008eff0 wait_for_completion+0x1b0
        args (0xe000004074b1fc10, 0x2, 0xe000004074b1fbc8, 0xe000004074b1fbd0, 0xa00000010038a140)
        kernel 0xa00000010008ee40 0xa00000010008f100
0xa00000010038a140 blk_execute_rq+0x1a0
        args (0xe0000000023a2480, 0xe000000039c76d80, 0xe0000000023baac0, 0xe0000000023bab78, 0xa000000100392030)
        kernel 0xa000000100389fa0 0xa00000010038a1a0
0xa000000100392030 scsi_cmd_ioctl+0x10d0
        args (0x400, 0xe0000000023bab40, 0xfffffffbfff, 0xe0000000023babb0, 0xe0000000023a2480)
        kernel 0xa000000100390f60 0xa000000100392260
0xa00000020003f640 [sd_mod]sd_ioctl+0x100
        args (0x5382, 0xe000004044012780, 0x1, 0x600000000000bb20, 0xe000000039c76d80)
        sd_mod 0xa00000020003f540 0xa00000020003fa60
0xa00000010038ccb0 blkdev_ioctl+0x110
        args (0xe000004043288e80, 0x600000000000bb20, 0xa00000020003f540, 0xe000000039c76d80, 0xe000004043288f38)
        kernel 0xa00000010038cba0 0xa00000010038d640
0xa00000010015a280 block_ioctl+0x40
        args (0xe000000004a5aaf8, 0xe000004044012780, 0x1, 0x600000000000bb20, 0xa000000100177ba0)
It is happening on i386 also.

Reproduction steps:

1) Write the following two scripts:

swan:~ # cat test.sh
while [ 0 ]
do
    scsiinfo -i /dev/sdc >/dev/null 2>&1
done

swan:~ # cat runparllal.sh
limit=10
count=1
while [ $count -le $limit ]
do
    ./test.sh &
    let count=count+1
done

Now run "runparllal.sh". After some time, the "scsiinfo" commands hang.
Is there any update from Red Hat?
qla_iocb.c -- don't use block layer hw segment counts??? Please look to the maintainer of this code for the fix.
On my two test systems, both pass the "do commands get lost" test and no commands ever fail on the 2.6.9-5.0.1.12 smp kernel. With the QLogic driver in particular there do appear to be fairness issues: with 10 bash scripts all trying to send commands to the drive, 1 of the 10 will be sending 100+ commands per second while the other 9 are momentarily stalled. The other 9 always get their chance eventually, so they aren't stalled completely. With the aic79xx driver this isn't an issue; all the scripts run at about the same rate.

However, when I let the test scripts run overnight on the 2.6.9-5.0.1.12 smp kernel on both ia32 and x86_64, both machines crashed. The x86_64 machine triggered the OOM killer, which killed basically everything on the machine without making any headway towards freeing up the memory it needed. The ia32 machine died completely and wouldn't respond to anything: keyboard input, network pings, etc.

So, the basic summary right now is that I'm no longer seeing the issues that Veritas was seeing, but there are new issues of a different nature that have to be addressed.
My testing confirms that this bk changeset:

[dledford@compaq-rhel4 linus]$ bk export -tpatch -r1.1938.423.2
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2004/12/03 12:40:55-08:00 greg
#   [PATCH] sysfs: fix sysfs_dir_close memory leak
#
#   sysfs_dir_close did not free the "cursor" sysfs_dirent used for keeping
#   track of position in the list of sysfs_dirent nodes. Consequently,
#   doing a "find /sys" would leak a sysfs_dirent for each of the 1140
#   directories in my /sys tree, or about 36kB each time.
#
#
#   From: "Adam J. Richter" <adam>
#   Signed-off-by: Greg Kroah-Hartman <greg>
#   Signed-off-by: Linus Torvalds <torvalds>
#
# fs/sysfs/dir.c
#   2004/12/03 10:42:51-08:00 greg +2 -0
#   sysfs: fix sysfs_dir_close memory leak
#
diff -Nru a/fs/sysfs/dir.c b/fs/sysfs/dir.c
--- a/fs/sysfs/dir.c 2005-02-09 21:21:24 -05:00
+++ b/fs/sysfs/dir.c 2005-02-09 21:21:24 -05:00
@@ -351,6 +351,8 @@
 	list_del_init(&cursor->s_sibling);
 	up(&dentry->d_inode->i_sem);
 
+	release_sysfs_dirent(cursor);
+
 	return 0;
 }

solves the leaked size-32 kmallocs and should therefore solve the OOM problem. It's possible that the lockup on the ia32 box was actually a memory deadlock with the same cause. However, during further testing an oops was encountered, and bugzilla 147638 was created to track the oops.
Some more updates on this bug:

The bug is in "qla2x00_start_scsi()" (drivers/scsi/qla2xxx/qla_iocb.c). It relies on the "cmd->request->nr_hw_segments" value to compute the required number of request entries. For inquiry commands this field holds a junk value, because "request->nr_hw_segments" is not initialised in the "get_request()" function. Because the junk value is large, the computed number of request entries is also large, and with that many entries qla2x00_start_scsi() fails to issue the request. The request is therefore repeatedly put back on the pending queue and repeatedly times out.

Solution:
----------
"qla2x00_start_scsi()" should not depend on "cmd->request->nr_hw_segments" to compute the required number of request entries. Instead, it can use the return value of "pci_map_sg()".
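To illustrate why a junk segment count stalls the command, here is a minimal user-space sketch of the request-entry calculation. It is not the driver's code: the helper names, the layout constants (3 data segment descriptors in the command entry, 7 per continuation entry) and the 128-slot ring are assumptions made for the example. The "dsds" argument stands in for the segment count, which per the suggestion above would come from the return value of pci_map_sg() rather than from nr_hw_segments.

```c
/*
 * Assumed entry layout, for illustration only (not taken from this report):
 * the command entry holds up to 3 data segment descriptors (DSDs) and each
 * continuation entry holds up to 7 more.
 */
#define DSDS_IN_COMMAND_ENTRY      3u
#define DSDS_IN_CONTINUATION_ENTRY 7u

/* Hypothetical helper: request entries needed to describe `dsds` segments. */
static unsigned int calc_req_entries(unsigned int dsds)
{
    unsigned int entries = 1; /* the command entry itself */

    if (dsds > DSDS_IN_COMMAND_ENTRY) {
        unsigned int rest = dsds - DSDS_IN_COMMAND_ENTRY;

        /* Round up: a partly filled continuation entry still costs a slot. */
        entries += (rest + DSDS_IN_CONTINUATION_ENTRY - 1)
                   / DSDS_IN_CONTINUATION_ENTRY;
    }
    return entries;
}

/* Hypothetical check: can the request be queued with `free_slots` available? */
static int fits_in_ring(unsigned int req_entries, unsigned int free_slots)
{
    return req_entries <= free_slots;
}
```

Under these assumptions, an inquiry with one mapped segment needs a single entry and always fits, while an uninitialised count such as 65535 would need 9363 entries and could never fit in a 128-slot ring, so the command would be requeued indefinitely.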
Created attachment 111409
qla2x00 driver bug fix patch

Buddhi, thank you for the pointer. After code inspection, you are correct regarding the qlogic driver. It can be considered a bug that the qlogic driver was ever looking at request->nr_hw_segments in the first place, as that is specifically a block layer request struct, whereas low-level SCSI device drivers really should never look at anything outside the scsi_command struct for their information. I've written a patch to correct this problem and some related PCI DMA mapping issues in the qlogic driver that were found while investigating it. That test patch is attached here and I am currently testing it. Upon successful test completion, I'll submit it for review and possible inclusion in our next kernel update.
The qlogic patch has been submitted to our internal list for review (it has already received several ACKs and should make U1). It has also been submitted upstream for review and accepted by QLogic for inclusion in their future driver updates.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-420.html