177072 – iscsid double close on discovery socket leads to syslog message sent over discovery socket

Bug 177072 - iscsid double close on discovery socket leads to syslog message sent over discovery socket

Summary: iscsid double close on discovery socket leads to syslog message sent over dis...

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	iscsi-initiator-utils
Sub Component:
Version:	4.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Mike Christie
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	181409
TreeView+	depends on / blocked

Reported:	2006-01-05 23:09 UTC by Dave Wysochanski
Modified:	2007-11-30 22:07 UTC (History)
CC List:	2 users (show)
Fixed In Version:	RHBA-2006-0495
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-08-10 21:54:29 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
Patch against RHEL4 U3 beta that fixes this problem (633 bytes, text/plain) 2006-01-05 23:09 UTC, Dave Wysochanski	no flags	Details
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2006:0495	0	normal	SHIPPED_LIVE	iscsi-initiator-utils bug fix update	2006-08-10 04:00:00 UTC

Description Dave Wysochanski 2006-01-05 23:09:35 UTC

Description of problem:
I've noted this very strange, but very reproducible problem, and I think I have
a simple fix for it.

The problem is that when a RHEL server is rebooted and one of the targets is
initially unreachable because the network isn't quite up, strange behavior
occurs.  When the target is eventually reachable, the first thing sent over
the socket is a syslog message (seriously, I have traces).

Upon inspecting the code, and looking at the messages occuring during the
failure to connect, I found a case where close() will be called multiple times
on the same socket.  The attached patch fixes the problem.

Sequence of code looks like this:
- iscsid starts, reads config file, forks child discovery process

child process:

1. sendtargets_discovery_process(): calls iscsi_connect()

2. iscsi_connect(): the connect() call fails printing "cannot make connection ..."
to /var/log/messages file; NOTE: at this point, close(sock) is called, but the
value of 'sock' (the fd for the socket) is returned in session->socket_fd

3. sendtargets_discovery_process(): iscsi_connect() returned '0', so 'then' 
clause is executed; "Connection to Discovery Address A.B.C.D failed" is
written to /var/log/messages; goto set_address; code executes and calls 
iscsi_disconnect()

4. iscsi_disconnect(): since session->socket_fd is a real value, close() is 
again called on the same value as was done in #2; session->socket_fd = -1

5. Sequence #1-#4 is repeated another 3 times

6. Sequence #1 is repeated, but this time iscsi_connect() is successful;
session->socket_fd is set to 'sock' value, and 1 is returned

7. sendtargets_discovery_process(): iscsi_connect() returns '1', so 'then' 
clause is skipped and logmsg() is called with "Connected to Discovery Address
A.B.C.D"; this message somehow doesn't go to syslog socket though, but 
instead goes to session->socket_fd, which goes to the target.  The target sees
this garbage packet, and disconnects the session

8. sendtargets_discovery_process(): iscsi_login() is called

9. iscsi_login(): read() is called, and it exists with '0' value; this calls
logmsg() which calls syslog() with "Connection to Discovery Address A.B.C.D
closed".  This again goes over the socket to the target.  Also, iscsi_host_err()
is called, with "Login I/O error, failed to receive a PDU\n", which also goes
over the socket; a value of LOGIN_IO_ERROR is returned from iscsi_login()

10. sendtargets_discovery_process(): iscsi_login() returns with LOGIN_IO_ERROR,
and logmsg() is called with "retrying discovery login to A.B.C.D"; 
it's interesting that it at least appears that this message goes both to
/var/log/messages _and_ across the socket (???)


Version-Release number of selected component (if applicable):
RHEL4 U3 beta iscsi-initiator-utils-4.0.3.0-3

How reproducible:
- Happens every time on my IBM Penguin ALTUS 1000E (dual x86_64)

Steps to Reproduce:
1. Put one or more DiscoveryAddress entries in /etc/iscsi.conf
2. Boot machine
3. Observe syslog message goes over iscsi socket, followed by iSCSI packets;
target rejects garbage and disconnects the session; process continues with the
initiator unable to connect to that target
  
Actual results:
Initiator unable to connect to target, continues to send syslog message before
valid iSCSI PDUs

Expected results:
Valid iSCSI login sequence from initiator for discovery session

Additional info:

I've only repro'd this on x86_64 but the patch seems to fix an arch independent
behavior.

Comment 1 Dave Wysochanski 2006-01-05 23:09:35 UTC

Created attachment 122853 [details]
Patch against RHEL4 U3 beta that fixes this problem

Comment 8 Red Hat Bugzilla 2006-08-10 21:54:31 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0495.html

Note You need to log in before you can comment on or make changes to this bug.