Bug 1637422

Summary: 4.0: APP pod unable to start after target port failure in cases where single paths are mounted on APP pods (BZ#1599742)
Product: OpenShift Container Platform
Reporter: Jan Safranek <jsafrane>
Component: Storage
Assignee: Jan Safranek <jsafrane>
Status: CLOSED ERRATA
QA Contact: Liang Xia <lxia>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.1.0
CC: aos-bugs, aos-storage-staff, apanagio, bchilds, hchiramm, jhou, jsafrane, madam, nberry
Target Milestone: ---
Keywords: Reopened
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: atomic-openshift-3.11.24-1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1609703
Environment:
Last Closed: 2019-06-04 10:40:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1609703
Bug Blocks: 1596021, 1598740, 1609788, 1637413

Comment 1 Jan Safranek 2018-10-09 09:08:16 UTC
4.0 PR: https://github.com/openshift/origin/pull/21197

Comment 5 Liang Xia 2019-02-21 09:15:46 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-20-194410   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-20-194410


$ oc get pv iscsi ; oc get pvc iscsic ; oc get pods iscsi
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM         STORAGECLASS   REASON   AGE
iscsi   5Gi        RWO            Retain           Bound    lxia/iscsic                           15m
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
iscsic   Bound    iscsi    5Gi        RWO                           15m
NAME    READY   STATUS              RESTARTS   AGE
iscsi   0/1     ContainerCreating   0          14m


$ oc describe pod iscsi
Name:               iscsi
Namespace:          lxia
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-131-97.ap-northeast-1.compute.internal/10.0.131.97
Start Time:         Thu, 21 Feb 2019 16:59:17 +0800
Labels:             name=iscsi
Annotations:        openshift.io/scc: privileged
Status:             Pending
IP:                 
Containers:
  myfrontend:
    Container ID:   
    Image:          aosqe/hello-openshift
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/iscsi from iscsi (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-hchbw (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  iscsi:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  iscsic
    ReadOnly:   false
  default-token-hchbw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-hchbw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                 From                                                     Message
  ----     ------                  ----                ----                                                     -------
  Normal   Scheduled               14m                 default-scheduler                                        Successfully assigned lxia/iscsi to ip-10-0-131-97.ap-northeast-1.compute.internal
  Normal   SuccessfulAttachVolume  14m                 attachdetach-controller                                  AttachVolume.Attach succeeded for volume "iscsi"
  Warning  FailedMount             76s (x6 over 12m)   kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal  Unable to mount volumes for pod "iscsi_lxia(f8515d6a-35b6-11e9-b36d-060b2a863566)": timeout expired waiting for volumes to attach or mount for pod "lxia"/"iscsi". list of unmounted volumes=[iscsi]. list of unattached volumes=[iscsi default-token-hchbw]
  Warning  FailedMount             15s (x15 over 14m)  kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal  MountVolume.WaitForAttach failed for volume "iscsi" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to attach disk: Error: iscsiadm: No records found
 (exit status 21)

Comment 7 Jan Safranek 2019-02-25 11:14:12 UTC
I talked to Liang on IRC and debugged this issue. It turned out to be a misconfiguration of the IQN in the PV on the Kubernetes side and of the ACLs on the target side.
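
For anyone hitting the same "No records found" symptom, an IQN mismatch of this kind can be checked by comparing the IQN stored in the PV (the `spec.iscsi.iqn` field) with what the target advertises in `iscsiadm -m discovery -t sendtargets -p <portal>`. The Python sketch below is illustrative only and is not what was run for this bug: the helper names, the loose IQN regex, and the sample portal and IQN values are all assumptions. It does not cover the ACL side, which has to be checked on the target itself.

```python
import re

# Loose approximation of the RFC 3720 IQN shape:
# iqn.<yyyy-mm>.<reversed-domain>[:<identifier>]
IQN_RE = re.compile(r"^iqn\.\d{4}-\d{2}\.[a-z0-9][a-z0-9.\-]*(:\S+)?$")


def parse_discovery(output):
    """Return the set of target IQNs found in sendtargets discovery output.

    Each record is expected to look like '<ip>:<port>,<tpgt> <iqn>'.
    """
    targets = set()
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and "," in parts[0]:
            targets.add(parts[1])
    return targets


def check_pv_iqn(pv_iqn, discovery_output):
    """Return a list of problems; an empty list means no mismatch was found."""
    problems = []
    if not IQN_RE.match(pv_iqn):
        problems.append("PV IQN is not a well-formed iqn. name: %r" % pv_iqn)
    targets = parse_discovery(discovery_output)
    if pv_iqn not in targets:
        problems.append(
            "PV IQN %r is not advertised by the target; saw %s"
            % (pv_iqn, sorted(targets))
        )
    return problems


# Hypothetical sample data, not taken from this bug.
SAMPLE_DISCOVERY = "192.0.2.10:3260,1 iqn.2016-04.com.example:storage.target00\n"

if __name__ == "__main__":
    # IQN that matches the advertised target
    print(check_pv_iqn("iqn.2016-04.com.example:storage.target00", SAMPLE_DISCOVERY))
    # IQN with a typo in the identifier part
    print(check_pv_iqn("iqn.2016-04.com.example:storage.target01", SAMPLE_DISCOVERY))
```

An empty result only says that the PV and the target agree on the IQN. The login can still be rejected if the initiator name of the node is missing from the target's ACLs.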

Comment 8 Jan Safranek 2019-02-25 11:14:54 UTC
Heh, moving back to ON_QA to test the rest of the bug :-)

Comment 9 Liang Xia 2019-02-27 08:26:12 UTC
QE did a quick check and iSCSI works on OCP 4.0 now, so I am removing the testblocker keyword.

Will verify the bug in detail later.

Comment 10 Liang Xia 2019-02-28 02:54:21 UTC
Verified that the issue is fixed with the version below:

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-25-234632   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-25-234632

Comment 13 errata-xmlrpc 2019-06-04 10:40:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758