Bug 595682 - 'service iscsi restart' gets stuck if it fails to logout from sessions
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: iscsi-initiator-utils
Version: 5.5.z
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Chris Leech
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-05-25 11:29 UTC by Yaniv Kaul
Modified: 2015-09-07 11:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-02 13:20:19 UTC
Target Upstream Version:
Embargoed:



Description Yaniv Kaul 2010-05-25 11:29:36 UTC
Description of problem:
I have multiple open sessions. On the server side, I've deleted those targets.
When I run 'service iscsi restart', it gets stuck (see Additional info).


Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.871-0.16.el5

How reproducible:
Always

Steps to Reproduce:
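1. Open multiple iSCSI sessions.
2. Delete those targets on the server side.
3. Run 'service iscsi restart'.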
  
Actual results:
stuck.

Expected results:
Should forcefully(?) log out.

Additional info:
[root@brown-vdsb ~]# service iscsi restart
Logging out of session [sid: 10, target: iqn.2006-01.com.openfiler:tsn.3b3a671a38ea, portal: 10.35.64.20,3260]
Logging out of session [sid: 11, target: iqn.2006-01.com.openfiler:tsn.3b0fc66dfe7e, portal: 10.35.64.20,3260]
Logging out of session [sid: 12, target: iqn.2006-01.com.openfiler:tsn.a0ef6f67f7b4, portal: 10.35.64.20,3260]
Logging out of session [sid: 13, target: iqn.2006-01.com.openfiler:tsn.8cbfd99ae55f, portal: 10.35.64.20,3260]
Logging out of session [sid: 14, target: iqn.2006-01.com.openfiler:tsn.7e3f50a9c970, portal: 10.35.64.20,3260]
Logging out of session [sid: 15, target: iqn.2006-01.com.openfiler:tsn.04d2ddfe2925, portal: 10.35.64.20,3260]
Logging out of session [sid: 16, target: iqn.2006-01.com.openfiler:tsn.0d8fc616281d, portal: 10.35.64.20,3260]
Logging out of session [sid: 17, target: iqn.2006-01.com.openfiler:tsn.1f64fdc51faa, portal: 10.35.64.20,3260]
Logging out of session [sid: 18, target: iqn.2006-01.com.openfiler:tsn.278f93e17af2, portal: 10.35.64.20,3260]
Logging out of session [sid: 19, target: iqn.2006-01.com.openfiler:tsn.39833377a491, portal: 10.35.64.20,3260]
Logging out of session [sid: 20, target: iqn.2006-01.com.openfiler:tsn.3183a517ea3e, portal: 10.35.64.20,3260]
Logging out of session [sid: 21, target: iqn.2006-01.com.openfiler:tsn.f0dd607d57dc, portal: 10.35.64.20,3260]
Logging out of session [sid: 22, target: iqn.2006-01.com.openfiler:tsn.5d1849d02702, portal: 10.35.64.20,3260]
Logging out of session [sid: 23, target: iqn.2006-01.com.openfiler:tsn.255a3830426f, portal: 10.35.64.20,3260]
Logging out of session [sid: 24, target: iqn.2006-01.com.openfiler:tsn.b0cca3b30405, portal: 10.35.64.20,3260]
Logging out of session [sid: 25, target: iqn.2006-01.com.openfiler:tsn.8a9ec3595fab, portal: 10.35.64.20,3260]
Logging out of session [sid: 26, target: iqn.2006-01.com.openfiler:tsn.92181e8a8669, portal: 10.35.64.20,3260]
Logging out of session [sid: 27, target: iqn.2006-01.com.openfiler:tsn.1fabffb54f6a, portal: 10.35.64.20,3260]
Logging out of session [sid: 28, target: iqn.2006-01.com.openfiler:tsn.663da70ec864, portal: 10.35.64.20,3260]
Logging out of session [sid: 29, target: iqn.2006-01.com.openfiler:tsn.f12aba6ae753, portal: 10.35.64.20,3260]
Logging out of session [sid: 30, target: iqn.2006-01.com.openfiler:tsn.03636a2469ae, portal: 10.35.64.20,3260]
Logging out of session [sid: 31, target: iqn.2006-01.com.openfiler:tsn.83f984c89548, portal: 10.35.64.20,3260]
Logging out of session [sid: 32, target: iqn.2006-01.com.openfiler:tsn.2dec765d38c1, portal: 10.35.64.20,3260]
Logging out of session [sid: 33, target: iqn.2006-01.com.openfiler:tsn.ykaul, portal: 10.35.64.20,3260]
Logging out of session [sid: 8, target: iqn.2006-01.com.openfiler:tsn.ad4357202999, portal: 10.35.64.20,3260]
Logging out of session [sid: 9, target: iqn.2006-01.com.openfiler:tsn.17239a735f17, portal: 10.35.64.20,3260]


(stuck here until ctrl-c):
iscsiadm: caught SIGINT, exiting...
Stopping iSCSI daemon: iscsiadm: caught SIGINT, exiting...

iscsid (pid  19267) is running...                          [  OK  ]
Setting up iSCSI targets: iscsiadm: No records found!
                                                           [  OK  ]

Comment 1 Mike Christie 2010-05-25 20:33:39 UTC
Do you mean it gets stuck on the Logging out of session part, or on the starting up of sessions part?

For the former how long did you wait before hitting the ctrl-C?

What target is this with?

Did you delete the targets on the target?
- Are there any errors in /var/log/messages?
- What is the logout timeout (grep -F 'node.conn[0].timeo.logout_timeout' /etc/iscsi/iscsid.conf or iscsiadm -m node -T ... | grep -F 'node.conn[0].timeo.logout_timeout')?
- When you run the restart command, what is the state of the sessions and LUNs/devices/hosts (iscsiadm -m session -P 3)?


Or when you say you deleted the targets, do you mean you did it in the node db (in /var/lib/iscsi/nodes, using iscsiadm -m node -o delete or rm -f -r)?
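
Editorial sketch for the logout timeout question above: the setting name contains literal brackets ("conn[0]"), so an unquoted regular-expression match treats "[0]" as a character class and misses the line. The minimal Python sketch below compares the key as a plain string instead. The helper name read_setting and the sample text are hypothetical, and the values are illustrative rather than taken from the reporter's system.

```python
def read_setting(conf_text, key):
    """Return the value of `key` from iscsid.conf-style text, or None.

    Lines are expected to look like `name = value`; blank lines and
    lines starting with '#' are skipped. The key is compared as a plain
    string, so the brackets in "conn[0]" need no escaping.
    """
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Illustrative sample only; not copied from any real host.
SAMPLE_CONF = """
# timeouts
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.logout_timeout = 15
"""

if __name__ == "__main__":
    print(read_setting(SAMPLE_CONF, "node.conn[0].timeo.logout_timeout"))
```

On a real initiator the text would come from /etc/iscsi/iscsid.conf or from the iscsiadm -m node output mentioned in the question.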

Comment 2 Yaniv Kaul 2010-05-25 20:53:14 UTC
(In reply to comment #1)
> Do you mean it gets stuck on the Logging out of session part, or on the
> starting up of sessions part?

The logging out.

> 
> For the former how long did you wait before hitting the ctrl-C?

More than a minute.

> 
> What target is this with?

OpenFiler 2.3. (sorry, gave up on F12 as a target due to a different bug)

> 
> Did you delete the targets on the target?

Yes.

> - Are there any errors in /var/log/messages?
> - What is the logout timeout (grep -F 'node.conn[0].timeo.logout_timeout'
> /etc/iscsi/iscsid.conf or iscsiadm -m node -T ... | grep -F
> 'node.conn[0].timeo.logout_timeout')?
> - When you run the restart command, what is the state of the sessions and
> LUNs/devices/hosts (iscsiadm -m session -P 3)?

I'll have to reproduce and look at all those.

> 
> 
> Or when you say you deleted the targets, do you mean you did it in the node db
> (in /var/lib/iscsi/nodes, using iscsiadm -m node -o delete or rm -f -r)?

Using Openfiler's web UI.

Comment 3 Mike Christie 2010-05-25 21:47:43 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Do you mean it gets stuck on the Logging out of session part, or on the
> > starting up of sessions part?
> 
> The logging out.
> 
> > 
> > For the former how long did you wait before hitting the ctrl-C?
> 
> More than a minute.
> 
> > 
> > What target is this with?
> 
> OpenFiler 2.3. (sorry, gave up on F12 as a target due to a different bug)
> 
> > 
> > Did you delete the targets on the target?
> 
> Yes.
> 

What are the write cache settings on the target?

There could be two issues.
1. There was a bug that was fixed here
https://bugzilla.redhat.com/show_bug.cgi?id=580840, so make sure the kernel is 2.6.18-194.3.1.el5.

The bug would cause shutdown to take several minutes.

2. It can take a while to log out if the target does not send some indication that the target has been deleted, which I do not think Openfiler does (the target in Fedora/RHEL does not either).

If there is still I/O in an iSCSI queue when the target is deleted, we could end up waiting for that I/O to fail, or for some timer to break us out of the wait for the command to complete or fail. Which timer applies depends on where the command is when the target is deleted and on your iSCSI and SCSI timer settings, but it could take several minutes. It could be:

node.session.timeo.replacement_timeout + (/sys/block/sdX/device/timeout * number of commands you see in /sys/class/scsi_host/hostN/host_busy) + node.session.err_timeo.abort_timeout + node.conn[0].timeo.noop_out_timeout + node.conn[0].timeo.noop_out_interval seconds.
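
Editorial sketch of the sum above, for readers who want to plug in their own numbers. The function name worst_case_logout_wait is hypothetical, and the values in the example call are illustrative assumptions, not measurements from the reporter's host; the real inputs come from the iscsid.conf or node record settings and the sysfs files named in the formula.

```python
def worst_case_logout_wait(replacement_timeout, device_timeout, host_busy,
                           abort_timeout, noop_out_timeout, noop_out_interval):
    """Rough estimate, in seconds, of how long a logout may block.

    Mirrors the sum in the comment above:
      node.session.timeo.replacement_timeout
      + /sys/block/sdX/device/timeout * host_busy (commands in flight)
      + node.session.err_timeo.abort_timeout
      + node.conn[0].timeo.noop_out_timeout
      + node.conn[0].timeo.noop_out_interval
    """
    return (replacement_timeout
            + device_timeout * host_busy
            + abort_timeout
            + noop_out_timeout
            + noop_out_interval)


if __name__ == "__main__":
    # Illustrative values only (assumed, not read from a real system).
    estimate = worst_case_logout_wait(
        replacement_timeout=120,
        device_timeout=60,
        host_busy=4,
        abort_timeout=15,
        noop_out_timeout=5,
        noop_out_interval=5,
    )
    print(f"worst-case wait: {estimate} s")  # 385 s with these inputs
```

With these assumed inputs the estimate is over six minutes per session, which is consistent with a restart that still appears stuck after "more than a minute".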

Comment 5 RHEL Program Management 2014-03-07 13:45:51 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 6 RHEL Program Management 2014-06-02 13:20:19 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

