Bug 177072

Summary: iscsid double close on discovery socket leads to syslog message sent over discovery socket
Product: Red Hat Enterprise Linux 4 Reporter: Dave Wysochanski <davidw>
Component: iscsi-initiator-utilsAssignee: Mike Christie <mchristi>
Status: CLOSED ERRATA QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0CC: coughlan, xdl-redhat-bugzilla
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2006-0495 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-08-10 21:54:29 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 181409    
Attachments:
Description Flags
Patch against RHEL4 U3 beta that fixes this problem none

Description Dave Wysochanski 2006-01-05 23:09:35 UTC
Description of problem:
I've noted this very strange, but very reproducible problem, and I think I have
a simple fix for it.

The problem is that when a RHEL server is rebooted and one of the targets is
initially unreachable because the network isn't quite up, strange behavior
occurs.  When the target is eventually reachable, the first thing sent over
the socket is a syslog message (seriously, I have traces).

Upon inspecting the code, and looking at the messages occuring during the
failure to connect, I found a case where close() will be called multiple times
on the same socket.  The attached patch fixes the problem.

Sequence of code looks like this:
- iscsid starts, reads config file, forks child discovery process

child process:

1. sendtargets_discovery_process(): calls iscsi_connect()

2. iscsi_connect(): the connect() call fails printing "cannot make connection ..."
to /var/log/messages file; NOTE: at this point, close(sock) is called, but the
value of 'sock' (the fd for the socket) is returned in session->socket_fd

3. sendtargets_discovery_process(): iscsi_connect() returned '0', so 'then' 
clause is executed; "Connection to Discovery Address A.B.C.D failed" is
written to /var/log/messages; goto set_address; code executes and calls 
iscsi_disconnect()

4. iscsi_disconnect(): since session->socket_fd is a real value, close() is 
again called on the same value as was done in #2; session->socket_fd = -1

5. Sequence #1-#4 is repeated another 3 times

6. Sequence #1 is repeated, but this time iscsi_connect() is successful;
session->socket_fd is set to 'sock' value, and 1 is returned

7. sendtargets_discovery_process(): iscsi_connect() returns '1', so 'then' 
clause is skipped and logmsg() is called with "Connected to Discovery Address
A.B.C.D"; this message somehow doesn't go to syslog socket though, but 
instead goes to session->socket_fd, which goes to the target.  The target sees
this garbage packet, and disconnects the session

8. sendtargets_discovery_process(): iscsi_login() is called

9. iscsi_login(): read() is called, and it exists with '0' value; this calls
logmsg() which calls syslog() with "Connection to Discovery Address A.B.C.D
closed".  This again goes over the socket to the target.  Also, iscsi_host_err()
is called, with "Login I/O error, failed to receive a PDU\n", which also goes
over the socket; a value of LOGIN_IO_ERROR is returned from iscsi_login()

10. sendtargets_discovery_process(): iscsi_login() returns with LOGIN_IO_ERROR,
and logmsg() is called with "retrying discovery login to A.B.C.D"; 
it's interesting that it at least appears that this message goes both to
/var/log/messages _and_ across the socket (???)


Version-Release number of selected component (if applicable):
RHEL4 U3 beta iscsi-initiator-utils-4.0.3.0-3

How reproducible:
- Happens every time on my IBM Penguin ALTUS 1000E (dual x86_64)

Steps to Reproduce:
1. Put one or more DiscoveryAddress entries in /etc/iscsi.conf
2. Boot machine
3. Observe syslog message goes over iscsi socket, followed by iSCSI packets;
target rejects garbage and disconnects the session; process continues with the
initiator unable to connect to that target
  
Actual results:
Initiator unable to connect to target, continues to send syslog message before
valid iSCSI PDUs

Expected results:
Valid iSCSI login sequence from initiator for discovery session

Additional info:

I've only repro'd this on x86_64 but the patch seems to fix an arch independent
behavior.

Comment 1 Dave Wysochanski 2006-01-05 23:09:35 UTC
Created attachment 122853 [details]
Patch against RHEL4 U3 beta that fixes this problem

Comment 8 Red Hat Bugzilla 2006-08-10 21:54:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0495.html