Bug 158343

Summary: Segmentation fault when an iscsi-ls -l is performed on 212 devices.
Product: Red Hat Enterprise Linux 3
Reporter: Wayne Berthiaume <berthiaume_wayne>
Component: iscsi-initiator-utils
Assignee: AJ Lewis <157070.alewis>
Status: CLOSED ERRATA
QA Contact:
Severity: high
Docs Contact:
Priority: medium
Version: 3.0
CC: conway_heather, coughlan, kaufman_susan
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2005-548 Doc Type: Bug Fix
Last Closed: 2005-09-28 19:35:42 UTC Type: ---
Bug Blocks: 156177, 156320    
Attachments:
iscsi-ls -l trace before PowerPath is started (flags: none)
iscsi-ls -l trace after PowerPath is started (flags: none)
iscsi-ls -l trace after PowerPath is stopped but after the seg fault (flags: none)

Description Wayne Berthiaume 2005-05-20 18:17:27 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.4) Gecko/20050318 Red Hat/1.4.4-1.3.5

Description of problem:
When iscsi-ls -l is issued against a configuration consisting of EMCPower-4.3.2 and 212 devices spread across two arrays, I get a segmentation fault. After the fault the stack is corrupted, so if service PowerPath stop is performed the segmentation fault will continue to occur. If PowerPath is not started and iscsi-ls -l is performed against the same 212 devices, it works fine. Below is an analysis from one of our developers:

May 18 2005  2:20PM Zhimin Jiang:
Two bugs:
1) array overflow - this is the bug that causes the segmentation fault.
Line 206 of scsi-info.c (in function do_scsi_83_inquiry) should be changed
from
*tmpid = malloc(sizeof(char) * (id_length * 2));
to
*tmpid = malloc(sizeof(char) * (id_length * 2) + 1);

and line 244 of scsi-info.c (in function do_scsi_80_inquiry) should be changed from
info->page80 = malloc(sizeof(char) * (id_length * 2));
to
info->page80 = malloc(sizeof(char) * (id_length * 2) + 1);

The above change provides an extra byte of space to accommodate the string
terminator ('\0'). Otherwise the write of the terminator overflows the buffer.
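
The sizing rule behind this fix can be illustrated with a minimal sketch. It assumes the inquiry data is turned into a hex string (two characters per byte), which is what the id_length * 2 in the original lines suggests. The helper name hex_encode is hypothetical and is not the code in scsi-info.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from scsi-info.c): hex-encode `len` raw bytes
 * into a newly allocated, NUL-terminated string. The caller frees it. */
static char *hex_encode(const unsigned char *raw, size_t len)
{
    char *out;
    size_t i;

    assert(raw != NULL || len == 0);

    /* Two hex digits per input byte, plus one byte for the '\0'.
     * Allocating only len * 2 would leave no room for the terminator. */
    out = malloc(len * 2 + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        /* snprintf writes two digits and a '\0' (3 bytes) at offset i * 2;
         * on the last byte that '\0' lands in the extra byte. */
        snprintf(out + i * 2, 3, "%02x", raw[i]);
    }
    out[len * 2] = '\0';   /* also terminates the string when len == 0 */
    return out;
}
```

With a 4-byte identifier the result is an 8-character string stored in a 9-byte buffer; the ninth byte is exactly the one the original malloc calls were missing.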

The segmentation fault appears once iscsi-ls's memory usage reaches a certain amount. With PP (PowerPath), iscsi-ls internally allocates more memory to process PP devices such as /dev/emcpower**, and this triggers the segmentation fault. I am pretty sure that if you add more devices to the host, the command will seg fault even without PP. In fact, when I malloced 10KB at the beginning of the command's main function, it seg faulted right away.

2) memory leak in function _get_devs_from_proc (defined in iscsi-ls.c).
This function should release the memory allocated to the local variable 'procbuf' before it returns. So its last statement (line 193 in iscsi-ls.c) should be changed
from
return 1;
to
if(procbuf) free(procbuf);
return 1;
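
The same idea can be sketched with a single cleanup path, so that the buffer is released on every way out of the function and not only before the final return. This is an illustration under assumptions, not the code in iscsi-ls.c: count_lines and PROCBUF_SIZE are hypothetical names, and only the variable name procbuf is taken from the report. Note that free(NULL) is defined to do nothing, so the if(procbuf) guard is harmless but not required.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define PROCBUF_SIZE 4096   /* hypothetical buffer size */

/* Hypothetical stand-in for a /proc-scanning routine (not from iscsi-ls.c).
 * Counts the chunks fgets returns, which is one per line as long as lines
 * are shorter than the buffer. Returns 1 on success and 0 on failure. */
static int count_lines(const char *path, size_t *lines_out)
{
    int ok = 0;
    char *procbuf = NULL;
    FILE *fp = NULL;

    assert(path != NULL && lines_out != NULL);

    procbuf = malloc(PROCBUF_SIZE);
    if (procbuf == NULL)
        goto out;

    fp = fopen(path, "r");
    if (fp == NULL)
        goto out;   /* early exit still reaches the free below */

    *lines_out = 0;
    while (fgets(procbuf, PROCBUF_SIZE, fp) != NULL)
        (*lines_out)++;
    ok = 1;

out:
    if (fp != NULL)
        fclose(fp);
    free(procbuf);  /* free(NULL) is a no-op, so no guard is needed */
    return ok;
}
```

Funnelling every exit through one label keeps the release next to the return, which makes a missing free easier to spot in review.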

One interesting thing: the segmentation fault also happened when I added the line of code to free 'procbuf' before fixing bug #1, so freeing 'procbuf' also triggers bug #1. I wonder why the free(procbuf) was missing in the first place.

May 18 2005  2:38PM Zhimin Jiang:
BTW, the iscsi-ls I built on 4/29/2005 is an older version (3.6.2-4). The version that comes with RHEL 3.0 U5 is 3.6.2-7. Version 3.6.2-7 processes /dev/sd?X devices (where X is a letter), which the older version does not process. This means the new version uses more memory, so the segmentation fault happens, while the old version uses less memory, so the segmentation fault is not triggered.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Configure 256 LUNs - system disks
2. Install EMCpower.LINUX-4.3.2-011
3. Perform iscsi-ls -l
  

Actual Results:  SEGMENTATION FAULT

Expected Results:  You should get a target list and the associated LUNs to each target.

Additional info:

Comment 1 Wayne Berthiaume 2005-05-20 18:23:57 UTC
Tom, please attach the straces I sent you last week to this BZ. 
Thank you,
Wayne.

Comment 2 AJ Lewis 2005-05-20 18:25:47 UTC
Great - I'll get those fixes into the tree.  I'm a bit confused about the
comment about the stack being corrupted though.  Once iscsi-ls has ended, its
stack space has been closed.  Are you saying the iscsi driver's stack space is
getting corrupted by iscsi-ls?

Comment 5 Wayne Berthiaume 2005-05-24 13:12:53 UTC
In testing I found that if I removed PowerPath after the segmentation fault
occurred, subsequent iscsi-ls -l calls would still yield a segmentation fault;
however, if I started with a fresh server without PowerPath running, iscsi-ls -l
works fine in the same configuration without any faults.

Comment 6 Wayne Berthiaume 2005-05-24 13:17:18 UTC
Created attachment 114768
iscsi-ls -l trace before PowerPath is started

Comment 7 Wayne Berthiaume 2005-05-24 13:18:49 UTC
Created attachment 114769
iscsi-ls -l trace after PowerPath is started

This is an strace of the failure with PowerPath running.

Comment 8 Wayne Berthiaume 2005-05-24 13:21:19 UTC
Created attachment 114770
iscsi-ls -l trace after PowerPath is stopped but after the seg fault 

This is a trace of the seg fault after PowerPath has been stopped but the
server has not been rebooted. iscsi-ls -l will continue to seg fault until the
server has been rebooted.

Comment 9 AJ Lewis 2005-06-01 15:57:09 UTC
Memory allocation fixes committed to the 3.6 upstream tree.

Comment 14 Wayne Berthiaume 2005-09-09 20:28:59 UTC
Patch tested successfully.

RHEL 3.0 U6 beta (lk 2.4.21-35) in test to confirm fix.

Comment 15 Red Hat Bugzilla 2005-09-28 19:35:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-548.html


Comment 16 Bob Johnson 2006-04-11 15:46:07 UTC
This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 3.8 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 3.8 release.