Bug 1502740 - [iSCSI]: windows BSOD seen when OSD addition done during IO from windows
Summary: [iSCSI]: windows BSOD seen when OSD addition done during IO from windows
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: iSCSI
Version: 3.0
Hardware: Unspecified
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 3.*
Assignee: Mike Christie
QA Contact: Tejas
Docs Contact: Erin Donnelly
URL:
Whiteboard:
Depends On:
Blocks: 1494421
 
Reported: 2017-10-16 14:58 UTC by Tejas
Modified: 2019-02-16 22:08 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Having more than one path from an initiator to an iSCSI gateway is not supported.
In the iSCSI gateway, `tcmu-runner` might return the same inquiry and Asymmetric Logical Unit Access (ALUA) information for all iSCSI sessions to a target port group. This can cause the initiator or multipath layer to use incorrect port information when referencing its internal structures for paths and devices, which can result in I/O failures, failed failover and failback, or incorrect output in multipath and SCSI logs and tools. Therefore, having more than one iSCSI session from an initiator to an iSCSI gateway is not supported.
Clone Of:
Environment:
Last Closed: 2019-02-16 22:08:26 UTC
Embargoed:



Description Tejas 2017-10-16 14:58:30 UTC
Description of problem:
  
Scenario:
1. Started IO from Windows on a fresh config (logged in and LUNs mapped).
2. Started IO from a Windows VM on ESX for additional load.
3. After a while of IO, and a Windows initiator reboot, I am seeing 8 sessions for 4 TPGs. Not sure where the 4 extra sessions came from. Seeing exactly 2 of each path.
4. IO continued, and I started adding OSDs to the Ceph cluster.
5. A lot of PG remapping was happening, and IO stopped.
6. Windows initiator crashed.

Minidump and MEMORY.DMP file present.

Version-Release number of selected component (if applicable):
windows 2016
ceph version 12.2.1-14.el7cp
3.10.0-714.el7.test.x86_64 (kernel on ceph cluster)

Comment 4 Jason Dillaman 2017-10-16 15:35:11 UTC
@Tejas: What does "pg remapping happening" mean? Are OSDs crashing or are you manually doing things in the background to slow down the OSDs' responsiveness?

Comment 5 Tejas 2017-10-16 16:26:36 UTC
Hi Jason,
   I added a new OSD node with 8 OSDs, so the object redistribution is happening to the new OSDs. I did not manually do anything except add the OSDs.

Comment 6 Mike Christie 2017-10-16 16:38:35 UTC
> 3. After a while of IOs, also a windows initiator reboot, I am seeing 8 sessions for 4 TPGs. Not sure from where 4 extra sessions got created.
> Seeing exactly 2 of each path.

Where do you see the extra sessions? The target side or initiator side or both? If on the target side is it in gwcli or the configfs interface?

Comment 7 Mike Christie 2017-10-16 18:15:22 UTC
Tejas,

For the extra sessions, you have 4 extra sessions defined in the "Favorite Targets", so whenever you reboot or restart the iSCSI service you will get the extra sessions. Did you by any chance set up iSCSI targets and forget that you had already set up some Favorite Targets?

We do not support multiple sessions to the same target port group from the same initiator, because tcmu-runner returns incorrect inquiry data. This will cause windows failover/failback issues, but I am not sure if it would cause a crash. It could cause the wrong paths to be referenced and it looks like during the test IO timed out and failovers were attempted.

Do you want me to fix up the Favorites? We should fix that then rerun the test.
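
The condition described above (more than one session from the same initiator to the same target port group) can be checked with a few lines of Python. This is only a sketch: the record layout, the `duplicate_sessions` helper, and the example IQN are assumptions made for illustration, not output from gwcli, configfs, or the Windows initiator.

```python
from collections import Counter


def duplicate_sessions(sessions):
    """Return {(initiator, tpg): count} for every pair with more than one session."""
    counts = Counter((s["initiator"], s["tpg"]) for s in sessions)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical data mirroring the report: one initiator, 4 TPGs, 2 sessions each.
sessions = [
    {"initiator": "iqn.example:win-initiator", "tpg": tpg}
    for tpg in (1, 2, 3, 4)
    for _ in range(2)
]

for (initiator, tpg), n in sorted(duplicate_sessions(sessions).items()):
    print(f"{initiator} has {n} sessions to TPG {tpg} (unsupported)")
```

With 8 sessions spread over 4 TPGs, every TPG is reported, which matches the "exactly 2 of each path" observation in the description.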

Comment 8 Tejas 2017-10-16 18:33:19 UTC
Mike,
  Okay, let me try the same run tomorrow with just 4 sessions defined, and we can confirm if the crash was due to that.

Comment 9 Mike Christie 2017-10-16 18:46:44 UTC
Tejas,

Ok. Just FYI, I looked at the dmp and it looks like the multiple sessions and bad inquiry data might be the cause of the crash. Here is the trace from the dmp. Of course we do not have the source, but going by the function names, it seems like it was trying to update the ALUA TPG info, so we probably hit the bug I mentioned:

nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x247
msdsm!DsmpUpdateTargetPortGroupEntry+0x3d9
msdsm!DsmpParseTargetPortGroupsInformation+0x18b
msdsm!DsmInquire+0xdc7
mpio!DsmPrx_INQUIRE_DRIVER+0x84
mpio!MPIOAddSingleDevice+0x1a6
mpio!MPIODeviceRegistration+0x94
mpio!MPIOFdoInternalDeviceControl+0xd4
mpio!MPIOFdoDispatch+0xa6
CLASSPNP!ClassSendIrpSynchronous+0x4d
CLASSPNP!ClassSendDeviceIoControlSynchronous+0xd9
CLASSPNP!ClasspMpdevStartDevice+0x165
CLASSPNP!ClassMpdevPnPDispatch+0x34e
nt!IoSynchronousCallDriver+0x51
nt!IoForwardIrpSynchronously+0x41
partmgr!PmStartDevice+0x70
partmgr!PmPnp+0x112
partmgr!PmGlobalDispatch+0x63
nt!PnpAsynchronousCall+0xe5
nt!PnpSendIrp+0x92
nt!PnpStartDevice+0x88
nt!PnpStartDeviceNode+0xdb
nt!PipProcessStartPhase1+0x53
nt!PipProcessDevNodeTree+0x401
nt!PiProcessReenumeration+0xa6
nt!PnpDeviceActionWorker+0x166
nt!ExpWorkerThread+0xe9
nt!PspSystemThreadStartup+0x41
nt!KiStartSystemThread+0x16
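
Reading the trace by function name, as done above, amounts to skipping the kernel's own bugcheck and page-fault frames and looking at the first frame that belongs to another module. A minimal Python sketch of that reading follows; the `first_driver_frame` helper and the choice to skip only the `nt` module are assumptions for illustration, and the sample is just the top of the trace above.

```python
def first_driver_frame(trace, skip_modules=("nt",)):
    """Return (module, function) of the first frame outside skip_modules, or None."""
    for line in trace.splitlines():
        line = line.strip()
        if "!" not in line:
            continue  # not a module!function+offset frame
        module, _, rest = line.partition("!")
        function = rest.split("+", 1)[0]  # drop the +0x... offset
        if module not in skip_modules:
            return module, function
    return None


trace = """\
nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x247
msdsm!DsmpUpdateTargetPortGroupEntry+0x3d9
msdsm!DsmpParseTargetPortGroupsInformation+0x18b
msdsm!DsmInquire+0xdc7
mpio!DsmPrx_INQUIRE_DRIVER+0x84
"""

print(first_driver_frame(trace))  # ('msdsm', 'DsmpUpdateTargetPortGroupEntry')
```

The first non-kernel frame is in msdsm, in a function whose name refers to target port group entries, which is consistent with the ALUA TPG update suspected above.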


For the command timeout issue that started this, I think we might have to increase the command timers on the initiators.

Comment 10 Jason Dillaman 2017-10-16 19:02:49 UTC
@Mike: I thought the 25 second initiator timeout was chosen based upon ESX hard-coded limitations? Are you just suggesting increasing the timeout for Linux/Windows initiators?

Comment 14 Jason Dillaman 2017-10-25 13:32:34 UTC
OK -- so it sounds like we can close this as NOTABUG if it only occurs when Windows connects to the same target portal multiple times.

Comment 15 Tejas 2017-10-25 13:38:38 UTC
We can keep this open till MCS is implemented and then verify it.

Comment 23 Mike Christie 2019-02-16 22:08:26 UTC
Closing since it was a config issue that we have documented.

