Bug 1387994

Summary: persistent reservation: PR out (Register): Not ready
Product: [Fedora] Fedora Reporter: Prasanna Kumar Kalever <prasanna.kalever>
Component: tcmu-runner Assignee: Maurizio Lombardi <mlombard>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 26 CC: agrover, mchristi, pkarampu, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-29 11:36:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Prasanna Kumar Kalever 2016-10-24 07:17:56 UTC
Description of problem:

[root@dhcp43-174 ~]# iscsiadm -m discovery -t st -p 10.70.42.88 -l                                                                 
10.70.42.88:3260,1 iqn.2016-08.org.gluster:10.70.42.88
Logging in to [iface: default, target: iqn.2016-08.org.gluster:10.70.42.88, portal: 10.70.42.88,3260] (multiple)
Login to [iface: default, target: iqn.2016-08.org.gluster:10.70.42.88, portal: 10.70.42.88,3260] successful.

[root@dhcp43-174 ~]# lsblk
NAME                        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                           8:0    0    8G  0 disk 
[...]

[root@dhcp43-174 ~]# dmesg 
[324420.506809] scsi host3: iSCSI Initiator over TCP/IP
[324420.515311] scsi 3:0:0:0: Direct-Access     LIO-ORG  TCMU device      0002 PQ: 0 ANSI: 5
[324420.520934] sd 3:0:0:0: Attached scsi generic sg1 type 0
[324420.521883] sd 3:0:0:0: [sda] 16777216 512-byte logical blocks: (8.59 GB/8.00 GiB)
[324420.524148] sd 3:0:0:0: [sda] Write Protect is off
[324420.524156] sd 3:0:0:0: [sda] Mode Sense: 03 00 00 00
[324420.524545] sd 3:0:0:0: [sda] Asking for cache data failed
[324420.524642] sd 3:0:0:0: [sda] Assuming drive cache: write through
[324420.541886] sd 3:0:0:0: [sda] Attached SCSI disk

[root@dhcp43-174 ~]# sg_persist -n -v --out --register --param-sark=0x123ABC --device=/dev/sda                                     
    Persistent Reservation Out cmd: 5f 00 00 00 00 00 00 00 18 00 
persistent reserve out:  Fixed format, current;  Sense key: Not Ready
 Additional sense: Logical unit communication failure
PR out (Register): Not ready sense key

[root@dhcp43-174 ~]# sg_persist --out --register --param-sark=0x123ABC --device=/dev/sda                                           
  LIO-ORG   TCMU device       0002
  Peripheral device type: disk
PR out (Register): Not ready


[root@dhcp43-174 ~]# sg_persist -n -v --read-keys --device=/dev/sda                                                                
    Persistent Reservation In cmd: 5e 00 00 00 00 00 00 20 00 00 
persistent reservation in:  Fixed format, current;  Sense key: Not Ready
 Additional sense: Logical unit communication failure
PR in (Read keys): Not ready sense key

[root@dhcp43-174 ~]# sg_persist --read-keys --device=/dev/sda                                                                      
  LIO-ORG   TCMU device       0002
  Peripheral device type: disk
PR in (Read keys): Not ready

[root@dhcp43-174 ~]# dmesg 
[324420.506809] scsi host3: iSCSI Initiator over TCP/IP
[324420.515311] scsi 3:0:0:0: Direct-Access     LIO-ORG  TCMU device      0002 PQ: 0 ANSI: 5
[324420.520934] sd 3:0:0:0: Attached scsi generic sg1 type 0
[324420.521883] sd 3:0:0:0: [sda] 16777216 512-byte logical blocks: (8.59 GB/8.00 GiB)
[324420.524148] sd 3:0:0:0: [sda] Write Protect is off
[324420.524156] sd 3:0:0:0: [sda] Mode Sense: 03 00 00 00
[324420.524545] sd 3:0:0:0: [sda] Asking for cache data failed
[324420.524642] sd 3:0:0:0: [sda] Assuming drive cache: write through
[324420.541886] sd 3:0:0:0: [sda] Attached SCSI disk
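
For readers unfamiliar with the raw bytes that sg_persist -v prints above, the following is a minimal Python sketch that decodes the two 10-byte CDBs from the transcript and builds the 24-byte REGISTER parameter list. The field offsets follow the SPC layout for PERSISTENT RESERVE IN/OUT as I understand it; the function names decode_pr_cdb and build_register_params are hypothetical and are not part of sg3_utils or tcmu-runner.

```python
import struct

PR_IN, PR_OUT = 0x5E, 0x5F

# Service action codes (low 5 bits of CDB byte 1).
PR_OUT_ACTIONS = {
    0x00: "REGISTER", 0x01: "RESERVE", 0x02: "RELEASE", 0x03: "CLEAR",
    0x04: "PREEMPT", 0x05: "PREEMPT AND ABORT",
    0x06: "REGISTER AND IGNORE EXISTING KEY", 0x07: "REGISTER AND MOVE",
}
PR_IN_ACTIONS = {
    0x00: "READ KEYS", 0x01: "READ RESERVATION",
    0x02: "REPORT CAPABILITIES", 0x03: "READ FULL STATUS",
}


def decode_pr_cdb(hex_bytes):
    """Decode a 10-byte PERSISTENT RESERVE IN/OUT CDB given as hex text."""
    cdb = bytes.fromhex(hex_bytes)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcode, action = cdb[0], cdb[1] & 0x1F
    if opcode == PR_OUT:
        # Bytes 5..8 hold the big-endian parameter list length.
        (length,) = struct.unpack(">I", cdb[5:9])
        return {
            "command": "PERSISTENT RESERVE OUT",
            "service_action": PR_OUT_ACTIONS.get(action, hex(action)),
            "scope": cdb[2] >> 4,
            "type": cdb[2] & 0x0F,
            "parameter_list_length": length,
        }
    if opcode == PR_IN:
        # Bytes 7..8 hold the big-endian allocation length.
        (length,) = struct.unpack(">H", cdb[7:9])
        return {
            "command": "PERSISTENT RESERVE IN",
            "service_action": PR_IN_ACTIONS.get(action, hex(action)),
            "allocation_length": length,
        }
    raise ValueError("not a persistent reservation CDB")


def build_register_params(sark, reservation_key=0, flags=0):
    """Build the 24-byte basic PR OUT parameter list.

    Layout: reservation key (8), service action reservation key (8),
    4 obsolete bytes, 1 flags byte, 3 reserved/obsolete bytes.
    """
    return struct.pack(">QQ4xB3x", reservation_key, sark, flags)


if __name__ == "__main__":
    print(decode_pr_cdb("5f 00 00 00 00 00 00 00 18 00"))
    print(decode_pr_cdb("5e 00 00 00 00 00 00 20 00 00"))
    print(build_register_params(0x123ABC).hex())
```

Under these assumptions the first CDB is a REGISTER with a 24-byte parameter list and the second is a READ KEYS with an 8192-byte allocation length, so the initiator is sending well-formed requests and the failure originates on the target side.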

How reproducible:
100%

Steps to Reproduce:
1. Log in to an iSCSI target exposed using the tcmu-runner glfs (gluster API) interface
2. Run lsblk and note the device name, for example /dev/sda
3. sg_persist --out --register --param-sark=0x123ABC --device=/dev/sda 

Actual results:
The command fails with "PR out (Register): Not ready"

Expected results:
The key is registered

Additional info:
After searching online, I followed a suggestion to create the missing directories 'alua' and 'pr' under /var/target/, but that did not help.

# ls /var/target/
alua  pr

Comment 1 Andy Grover 2016-11-08 17:01:17 UTC
TCMU is passing through all SCSI commands. Support for PR would need to be implemented in a tcmu-runner helper function or the glfs handler, rather than relying on PR support in the kernel.
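
To illustrate what such a userspace implementation would have to track, here is a minimal, single-node Python sketch of the registration bookkeeping behind REGISTER and READ KEYS. It is an assumption-laden illustration, not tcmu-runner code: the class PRRegistrations, the exception ReservationConflict, and the string used to identify an I_T nexus are all hypothetical, and the rules are simplified (no reservation holder, no APTPL persistence, no other service actions).

```python
import struct


class ReservationConflict(Exception):
    """Stands in for a RESERVATION CONFLICT status (hypothetical name)."""


class PRRegistrations:
    """In-memory registration table for one logical unit (single node only)."""

    def __init__(self):
        self.keys = {}        # I_T nexus identifier -> registered 64-bit key
        self.generation = 0   # PRgeneration counter reported by READ KEYS

    def register(self, nexus, reservation_key, sark):
        """Simplified REGISTER service action."""
        current = self.keys.get(nexus)
        if current is None:
            # An unregistered nexus must present a zero reservation key.
            if reservation_key != 0:
                raise ReservationConflict(nexus)
            if sark == 0:
                return  # nothing to register, nothing changes
            self.keys[nexus] = sark
        else:
            # A registered nexus must present its current key.
            if reservation_key != current:
                raise ReservationConflict(nexus)
            if sark == 0:
                del self.keys[nexus]      # zero SARK unregisters
            else:
                self.keys[nexus] = sark   # non-zero SARK replaces the key
        self.generation = (self.generation + 1) & 0xFFFFFFFF

    def read_keys(self):
        """Build READ KEYS data: PRgeneration, additional length, then keys."""
        keys = list(self.keys.values())
        header = struct.pack(">II", self.generation, 8 * len(keys))
        return header + b"".join(struct.pack(">Q", k) for k in keys)


if __name__ == "__main__":
    table = PRRegistrations()
    table.register("initiator-a", 0, 0x123ABC)
    print(table.read_keys().hex())
```

The table lives in the memory of one process, which is why this only covers the single-node case; sharing it between several target nodes needs a separate mechanism.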

Comment 2 Prasanna Kumar Kalever 2016-11-11 08:49:35 UTC
Could you please let us know your plans for implementing the iSCSI PR commands
in tcmu-runner?

Comment 3 Andy Grover 2016-11-11 16:40:05 UTC
I don't have any near-term plans to implement these myself, but would certainly merge an implementation if someone contributed it. CCing mchristi, I think he might want these too -- collaborate?

Comment 4 Mike Christie 2016-11-11 20:02:44 UTC
Yeah, if/when we go with tcmu we would need the same functionality.

Single node support should not be too difficult.

We need HA support though, so the PR info needs to be distributed and kept in sync across multiple nodes running LIO. Something like DLM (look at SCST's PR DLM module for an in-kernel example, but it could be done in userspace too) or a device-specific lock/API would need to be used.

Prasanna, are you guys also needing HA?

Comment 5 Prasanna Kumar Kalever 2016-11-15 06:32:19 UTC
Yes Mike, we may actually have to run tcmu-runner on multiple nodes with the same WWN to achieve multipathing on the client side.

The challenges here are:
1. Locking the block device to one initiator at a time: only one initiator can have write access, while the others can read the block device.

2. If a client that has gained write access mounts the block device, the applications (nowadays, Kubernetes) should make sure that only a single writer is allowed through the mount points.

Mike, any plans?
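
The single-writer, many-readers behaviour in item 1 resembles the SCSI "Write Exclusive" persistent reservation type. The following Python sketch shows the kind of per-command check a target would apply under that assumption. The function command_allowed and the nexus identifiers are hypothetical, and the opcode set is deliberately incomplete (a real implementation would follow the full command table in SPC, which also covers commands such as WRITE SAME and UNMAP).

```python
# Opcodes for WRITE(6), WRITE(10), WRITE(12) and WRITE(16).
WRITE_OPCODES = frozenset({0x0A, 0x2A, 0xAA, 0x8A})
READ_10 = 0x28
WRITE_10 = 0x2A


def command_allowed(opcode, nexus, holder):
    """Return True if `nexus` may run `opcode` under a Write Exclusive
    reservation held by `holder` (None means no reservation is held)."""
    if holder is None or nexus == holder:
        return True
    # Non-holders may still read, but medium writes are rejected.
    return opcode not in WRITE_OPCODES


if __name__ == "__main__":
    holder = "initiator-a"
    for nexus in ("initiator-a", "initiator-b"):
        for name, op in (("READ(10)", READ_10), ("WRITE(10)", WRITE_10)):
            print(nexus, name, command_allowed(op, nexus, holder))
```

This only models the decision on one target node. With several tcmu-runner instances exporting the same WWN, every node has to agree on who the holder is, which is the distribution problem raised in comment 4.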

Comment 6 Mike Christie 2016-11-15 18:52:36 UTC
(In reply to Prasanna Kumar Kalever from comment #5)
> Yes Mike, we actually may have to run the tcmu-runner in multiple nodes with
> same wwn to achieve multipathing at the client side.
> 
> The challenges here are,
> 1. Locking the block device to one initiator at a time, only one
> Initiator can have write access, while other can read the block.
> 
> 2. In case if a client (which gains write access) mounts the block device,
> applications (in the modern case Kubernetes) should make sure to allow
> single writer from the mount points.
> 
> Mike, any plans ?

I will send you an email with info and links, so we can discuss again.

Comment 7 Fedora End Of Life 2017-02-28 10:30:10 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 9 Fedora End Of Life 2018-05-03 08:23:40 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 10 Fedora End Of Life 2018-05-29 11:36:23 UTC
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26
is no longer maintained, which means that it will not receive any
further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.