Bug 1975368 - Pure Storage iSCSI Driver: LUN with id >255 can't be connected properly when flat addressing is used
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Gorka Eguileor
QA Contact: Evelina Shames
Docs Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-23 14:10 UTC by Takashi Kajinami
Modified: 2024-01-09 15:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Launchpad 2006960 0 None None None 2023-02-21 19:29:59 UTC
OpenStack gerrit 874689 0 None NEW WIP: Support non SAM LUN addressing 2023-02-21 19:29:59 UTC
OpenStack gerrit 874690 0 None NEW Pure: Report SAM-2 addressing mode for LUNs 2023-02-21 19:29:59 UTC
OpenStack gerrit 905127 0 None NEW SCSI: Support non SAM LUN addressing 2024-01-09 15:52:50 UTC
OpenStack gerrit 905129 0 None NEW Pure: Report SAM-2 addressing mode for LUNs 2024-01-09 15:52:50 UTC
Red Hat Issue Tracker OSP-5414 0 None None None 2022-11-24 08:44:23 UTC

Description Takashi Kajinami 2021-06-23 14:10:58 UTC
Description of problem:

By default, Pure Storage uses peripheral addressing for LUN IDs < 256 and flat addressing for LUN IDs >= 256.

When peripheral addressing is used, the LUN IDs presented by the iSCSI portals on the storage array look exactly the same as the raw LUN IDs.
 0x0001 > LUN  1
 0x000a > LUN 10
 0x0021 > LUN 33

On the other hand, when flat addressing is used, the LUN IDs appear to have an additional 0x4000 added.
This is because Pure Storage uses 0x4000 as a flag to indicate that it is using flat addressing.
 0x4100 > LUN 256
 0x4104 > LUN 260
 0x4110 > LUN 272

The problem is that RHEL doesn't treat that 0x4000 separately; it treats it as part of the raw LUN ID.
This means that when RHEL scans a volume with LUN ID >= 256, it can't use the raw LUN ID but needs to increase it by 16384 (0x4000) so that the device is detected as expected.

$ echo '0 0 220' | sudo tee -a /sys/class/scsi_host/host<host>/scan
 -> This works because LUN ID < 256

$ echo '0 0 261' | sudo tee -a /sys/class/scsi_host/host<host>/scan
 -> This doesn't work because LUN ID >= 256

$ echo '0 0 16645' | sudo tee -a /sys/class/scsi_host/host<host>/scan
 -> This works because 261 + 16384 = 16645
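
The arithmetic behind the three scans above can be sketched in Python. This is only an illustration under the assumptions stated in this report (flat addressing starts at LUN 256 and is flagged with 0x4000); `scan_lun` is a hypothetical helper name and is not part of cinder or os-brick.

```python
FLAT_FLAG = 0x4000       # 01b in the two high bits of a 16-bit LUN field (16384)
FLAT_THRESHOLD = 256     # array default described above: flat addressing from LUN 256


def scan_lun(raw_lun: int) -> int:
    """Hypothetical helper: value to write to the scsi_host scan file."""
    if not 0 <= raw_lun < FLAT_FLAG:
        raise ValueError(f"LUN {raw_lun} does not fit in 14-bit flat addressing")
    if raw_lun < FLAT_THRESHOLD:
        return raw_lun           # peripheral addressing: value is unchanged
    return raw_lun | FLAT_FLAG   # flat addressing: set the 0x4000 flag


for lun in (220, 261):
    print(f"raw LUN {lun} -> scan with '0 0 {scan_lun(lun)}'")
```

With these assumptions, LUN 220 is scanned as 220 and LUN 261 is scanned as 16645, matching the manual commands above.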

Currently the Pure Storage driver and os-brick are not aware of this behavior.
The Pure Storage driver returns the raw LUN ID (like 261) even when the LUN ID is greater than 255, os-brick uses that raw LUN ID for the scan, and RHEL can't detect the SCSI device properly.

As per discussion with Pure Storage support, it is likely that this problem can be solved by setting the host personality to oracle-vm-server (this makes all presented LUNs use the peripheral addressing method), but it would be useful if cinder or os-brick could detect flat addressing automatically and use the proper LUN ID when scanning SCSI devices.


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create a volume
2. Attach the volume to an instance

Actual results:
Volume attachment fails because no iSCSI device is found if LUN ID > 255 and flat addressing is used.

Expected results:
Volume attachment succeeds even if LUN ID > 255 and flat addressing is used.


Additional info:

Comment 3 Alan Bishop 2021-06-24 17:54:38 UTC
I realize a customer is tripping over this issue, and from reading the links in comment #2 it seems to be somewhat controversial. I found another post from two and a half years ago with comments from the same people: https://github.com/hreinecke/sg3_utils/issues/31

The sticking point for me is that the firm statement from the subject matter expert (SME) in the third link suggests the SCSI standards really don't lend themselves to inferring the target's addressing mode. It seems like many factors can be involved:

- The target's behavior
- The initiator's HBA
- The Linux kernel
- The userspace tools (e.g. sg3_utils)

It doesn't seem a good idea to me for cinder and os-brick to jump into the middle and magically resolve a problem the other communities continue to struggle with. If the issue can be handled by tuning the cinder Pure driver's configuration (setting the host personality to oracle-vm-server) then that seems prudent. This is my own opinion, and other RH cinder team members may have a different view.

Comment 5 Gorka Eguileor 2023-02-10 16:35:57 UTC
As was said in previous comments, the device doesn't appear because the scan is failing; since we use the manual scan feature in Open-iSCSI to prevent races, the device doesn't appear without a successful scan.

Just making os-brick detect that LUN > 255 and then assuming that the storage system is using flat space addressing is probably not the right thing to do.

If the storage system is using SAM-2 commands, then os-brick needs to pass 256 to the scan command, whereas if it's using SCSI-3, then it needs to pass 16640.

The solution would be to add a parameter to the connection information that tells the addressing mode when a conversion is needed (defaults to no conversion), and add the code to handle different values in os-brick.

Comment 6 Gorka Eguileor 2023-02-21 19:30:00 UTC
I've looked a bit more into it, and my last comment is not correct.

SAM: Transparent 64 bits
SAM-2:
 - LUN < 256 uses peripheral addressing, which is equivalent to transparent since the two high bits of the MSB are 00b
 - LUN >= 256 uses flat addressing, which has an offset of 16384 because the two high bits of the MSB are 01b
SAM-3:
 - LUN < 256: the storage array can choose between peripheral addressing and flat addressing (same offset)
 - LUN >= 256: flat addressing

And these are just some of the addressing modes, since peripheral can do multi-level.
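
As a rough illustration of the modes listed above, the conversion could look like the following Python sketch. The enum, its values, and the `lun_for_scan` function are hypothetical names chosen for this example; they are not taken from the proposed os-brick patch, and multi-level peripheral addressing is not covered.

```python
from enum import Enum

FLAT_OFFSET = 0x4000  # 16384: flat addressing sets the two high bits to 01b


class Addressing(Enum):
    """Hypothetical mode names for the cases listed above."""
    TRANSPARENT = "transparent"  # SAM: LUN passed through unchanged
    SAM2 = "sam2"                # peripheral below 256, flat from 256
    SAM3_FLAT = "sam3-flat"      # array chose flat addressing for every LUN


def lun_for_scan(lun: int, mode: Addressing = Addressing.TRANSPARENT) -> int:
    """Return the LUN value to use when scanning (default: no conversion)."""
    if mode is Addressing.TRANSPARENT:
        return lun
    if not 0 <= lun < FLAT_OFFSET:
        raise ValueError(f"LUN {lun} does not fit in 14-bit flat addressing")
    if mode is Addressing.SAM2:
        return lun if lun < 256 else lun + FLAT_OFFSET
    if mode is Addressing.SAM3_FLAT:
        return lun + FLAT_OFFSET
    raise ValueError(f"unsupported addressing mode: {mode}")
```

For example, `lun_for_scan(261, Addressing.SAM2)` gives 16645, while the default mode leaves 261 untouched so that backends needing no conversion behave as before.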

I have proposed a WIP patch to add support for some of the basic addressing modes in os-brick, as well as another patch to the Pure iSCSI and FC drivers that leverages it.
Pure will be testing this manually upstream since the usual testing doesn't have that many LUNs mapped to the same host.

