Bug 245823 - [LSI-E 5.2 bug] I/O terminates on linux system due to Out Of Memory
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Mike Christie
QA Contact: Martin Jenner
Keywords: OtherQA
Depends On: 238795
Blocks: 217105 425461
Reported: 2007-06-26 16:52 EDT by Mike Christie
Modified: 2010-10-22 11:55 EDT
Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Last Closed: 2008-05-21 10:44:59 EDT
Attachments
set proper bounce buffer limit for iscsi_tcp (436 bytes, patch)
2007-06-26 17:10 EDT, Mike Christie
Patch to iscsi-initiator file to fix this issue (1.77 KB, patch)
2007-11-27 23:47 EST, Raghavendra Biligiri
Patch to iscsi-xmit-pdu file to fix this issue (865 bytes, patch)
2007-11-27 23:48 EST, Raghavendra Biligiri
Comment 1 Mike Christie 2007-06-26 16:56:27 EDT
This is clone of 238795 for RHEL5. We also need to fix iscsi_tcp and ib_iser in
RHEL 5.
Comment 2 Mike Christie 2007-06-26 17:10:30 EDT
Created attachment 157959
set proper bounce buffer limit for iscsi_tcp
Comment 6 RHEL Product and Program Management 2007-06-28 13:33:15 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Larry Troan 2007-07-26 09:32:25 EDT
Dell is requesting this be fixed in 5.1 as it affects iSCSI.
Note that the bug is fixed in 4.6 via bug 238795 and it's a one line change.
Comment 9 Charles Rose 2007-07-27 04:30:11 EDT
Latest Update from Marco Peereboom@Dell: "The fix isn't correct; it seems as if
it isn't done in the right spot.  I have a fix that makes the dma mask -1(ull)
which basically fixes the issue for real."
Comment 10 John Feeney 2007-07-27 13:48:10 EDT
Can the proposed fix from comment #9 be included in this bugzilla?
Comment 11 Andrius Benokraitis 2007-08-14 10:16:32 EDT
Marco @ Dell - if we don't hear from you soon this issue will have to be
deferred or closed. We are soon approaching the closing of 5.1 beta feedback.
Thanks!
Comment 13 Andrius Benokraitis 2007-09-17 15:56:19 EDT
Due to lack of information from Dell, this has been deferred to 5.2.
Comment 15 RHEL Product and Program Management 2007-10-28 18:15:21 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 16 Raghavendra Biligiri 2007-11-27 23:47:05 EST
Created attachment 270691
Patch to iscsi-initiator file to fix this issue
Comment 17 Raghavendra Biligiri 2007-11-27 23:48:13 EST
Created attachment 270701
Patch to iscsi-xmit-pdu file to fix this issue
Comment 19 Raghavendra Biligiri 2007-11-27 23:54:07 EST
Uploaded the patch received from Yanqing in comment #16 and #17.
Comment 20 Mike Christie 2007-11-28 14:00:26 EST
(In reply to comment #16)
> Created an attachment (id=270691)
> Patch to iscsi-initiator file to fix this issue
> 

This patch is bogus. It is reversing a patch that LSI did for RHEL 4 which was
not wrong but not nice. We instead did something like comment #2 for RHEL4 and
upstream. If Dell could explain why their patch is right and ours is wrong it
would be great.
Comment 21 Mike Christie 2007-11-28 14:02:24 EST
(In reply to comment #17)
> Created an attachment (id=270701)
> Patch to iscsi-xmit-pdu file to fix this issue
> 

This is a patch for RHEL4, and is not applicable to RHEL5's initiator. It was
put into RHEL 4.6.
Comment 22 Mike Christie 2007-11-28 14:09:30 EST
(In reply to comment #20)
> (In reply to comment #16)
> > Created an attachment (id=270691)
> > Patch to iscsi-initiator file to fix this issue
> > 
> 
> This patch is bogus. It is reversing a patch that LSI did for RHEL 4 which was
> not wrong but not nice. We instead did something like comment #2 for RHEL4 and
> upstream. If Dell could explain why their patch is right and ours is wrong it
> would be great.

Just one other note. If you look at scsi_scan.c, it calls scsi_alloc_queue(),
which calls scsi_calculate_bounce_limit(), and initially the bounce limit is set
to 0xffffffff. So for scanning and device setup we have a small limit, but that
should not be an issue for the few commands that are sent, like REPORT LUNS,
INQUIRY, etc. When the device is then added, slave_configure is called and the
driver sets the bounce limit.

If you do think that we need to set the bounce limit for the setup process, you
could do it in the slave_alloc callout, but you do not need to do the fake
platform_device just so scsi_calculate_bounce_limit() picks it up. The model
upstream is using is for drivers to call the blk_queue callouts in the slave
callouts when they can.
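
To illustrate the sequence described in this comment, here is a minimal
user-space sketch in C. It is not kernel code and not the attached patch: the
model_* names, the struct, and the two constants are hypothetical stand-ins,
and the only value taken from the comment is the initial limit of 0xffffffff.
The sketch assumes that a page needs a bounce buffer when its physical address
lies above the queue's bounce limit, and that the driver's slave_configure-style
callout lifts the limit entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are NOT kernel definitions. */
#define MODEL_BOUNCE_32BIT 0xffffffffULL /* initial limit mentioned in the comment */
#define MODEL_BOUNCE_ANY   (~0ULL)       /* assumed "no limit" value: never bounce */

struct model_queue {
    /* Highest physical address usable without a bounce buffer. */
    uint64_t bounce_limit;
};

/* Models queue allocation during scanning: the small default limit applies. */
void model_alloc_queue(struct model_queue *q)
{
    assert(q != NULL);
    q->bounce_limit = MODEL_BOUNCE_32BIT;
}

/* Models a slave_configure-style callout in which the driver sets the limit. */
void model_slave_configure(struct model_queue *q)
{
    assert(q != NULL);
    q->bounce_limit = MODEL_BOUNCE_ANY;
}

/* Assumption: a page is bounced when its address lies above the limit. */
bool model_needs_bounce(const struct model_queue *q, uint64_t phys_addr)
{
    assert(q != NULL);
    return phys_addr > q->bounce_limit;
}
```

In this model, a buffer at 0x100000000 would be bounced after
model_alloc_queue() but not after model_slave_configure(), which matches the
idea that only the few setup commands run under the small limit.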
Comment 23 Don Zickus 2008-01-10 15:41:57 EST
in 2.6.18-66.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 25 Mike McGrath 2008-02-20 10:13:30 EST
I downloaded this kernel on 2 servers.  Fixed reboots on one but I'm still
having issues with the other.

Is there a "right" solution for this yet?
Comment 26 Mike Christie 2008-02-20 12:21:35 EST
(In reply to comment #25)
> I downloaded this kernel on 2 servers.  Fixed reboots on one but I'm still
> having issues with the other.
> 
> Is there a "right" solution for this yet?

Is this comment in the right bugzilla? This bugzilla fixed a problem where heavy
IO on an iSCSI disk was causing problems.
Comment 27 Mike McGrath 2008-02-20 12:28:55 EST
by "causing problems" I take it you don't mean rebooting or kernel panics?
Comment 28 Mike Christie 2008-02-20 12:52:52 EST
(In reply to comment #27)
> by "causing problems" I take it you don't mean rebooting or kernel panics?

Yeah, IO will just not be executed because we basically run out of memory.

What problem are you hitting? From comment #26 it sounds like you hit a problem
during reboot? Was there an oops or hang?
Comment 29 Mike McGrath 2008-02-20 13:12:04 EST
Not totally sure what's going on, as we've never seen it; I'm logging the console
now. I'd initially opened #429469 but thought this bug might be the issue.
Comment 30 John Poelstra 2008-03-20 23:53:00 EDT
Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot1--available now on partners.redhat.com.  

Please test and confirm that your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you
Comment 31 John Poelstra 2008-04-02 17:35:28 EDT
Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot3--available now on partners.redhat.com.  

Please test and confirm that your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you
Comment 32 John Poelstra 2008-04-09 18:42:21 EDT
Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot4--available now on partners.redhat.com.  

Please test and confirm that your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you
Comment 33 John Poelstra 2008-04-23 13:40:09 EDT
Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot6--available now on partners.redhat.com.  

We are nearing GA for 5.2 so please test and confirm that your issue is fixed ASAP.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you
Comment 34 John Poelstra 2008-05-01 12:50:22 EDT
Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in
RHEL5.2-Snapshot7--available now on partners.redhat.com.  

We are nearing GA for 5.2--this is the last opportunity to test and confirm that
your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed,
please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent
symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message
to Issue Tracker about your results and I will update bugzilla for you.  If you
need assistance accessing ftp://partners.redhat.com, please contact your Partner
Manager.

Thank you
Comment 36 errata-xmlrpc 2008-05-21 10:44:59 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
