Bug 1637422
Summary: | 4.0: APP pod unable to start after target port failure in cases where single paths are mounted on APP pods (BZ#1599742) | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Jan Safranek <jsafrane>
Component: | Storage | Assignee: | Jan Safranek <jsafrane>
Status: | CLOSED ERRATA | QA Contact: | Liang Xia <lxia>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 4.1.0 | CC: | aos-bugs, aos-storage-staff, apanagio, bchilds, hchiramm, jhou, jsafrane, madam, nberry
Target Milestone: | --- | Keywords: | Reopened
Target Release: | 4.1.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | atomic-openshift-3.11.24-1 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1609703 | Environment: |
Last Closed: | 2019-06-04 10:40:35 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Bug Depends On: | 1609703 | |
Bug Blocks: | 1596021, 1598740, 1609788, 1637413 | |
Comment 1
Jan Safranek
2018-10-09 09:08:16 UTC
```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-20-194410   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-20-194410

$ oc get pv iscsi ; oc get pvc iscsic ; oc get pods iscsi
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM         STORAGECLASS   REASON   AGE
iscsi   5Gi        RWO            Retain           Bound    lxia/iscsic                           15m
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
iscsic   Bound    iscsi    5Gi        RWO                           15m
NAME    READY   STATUS              RESTARTS   AGE
iscsi   0/1     ContainerCreating   0          14m

$ oc describe pod iscsi
Name:               iscsi
Namespace:          lxia
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-131-97.ap-northeast-1.compute.internal/10.0.131.97
Start Time:         Thu, 21 Feb 2019 16:59:17 +0800
Labels:             name=iscsi
Annotations:        openshift.io/scc: privileged
Status:             Pending
IP:
Containers:
  myfrontend:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/iscsi from iscsi (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-hchbw (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  iscsi:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  iscsic
    ReadOnly:   false
  default-token-hchbw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-hchbw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                 From                                                      Message
  ----     ------                  ----                ----                                                      -------
  Normal   Scheduled               14m                 default-scheduler                                         Successfully assigned lxia/iscsi to ip-10-0-131-97.ap-northeast-1.compute.internal
  Normal   SuccessfulAttachVolume  14m                 attachdetach-controller                                   AttachVolume.Attach succeeded for volume "iscsi"
  Warning  FailedMount             76s (x6 over 12m)   kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal   Unable to mount volumes for pod "iscsi_lxia(f8515d6a-35b6-11e9-b36d-060b2a863566)": timeout expired waiting for volumes to attach or mount for pod "lxia"/"iscsi". list of unmounted volumes=[iscsi]. list of unattached volumes=[iscsi default-token-hchbw]
  Warning  FailedMount             15s (x15 over 14m)  kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal   MountVolume.WaitForAttach failed for volume "iscsi" : failed to get any path for iscsi disk, last err seen: iscsi: failed to attach disk: Error: iscsiadm: No records found (exit status 21)
```

I talked to Liang on IRC and debugged this issue. It turned out to be a misconfiguration of the IQN in the PV on the Kubernetes side and of the ACLs on the target side. Heh, moving back to ON_QA to test the rest of the bug :-)

QE did a quick check; iSCSI works on OCP 4.0 now. Thus removing the testblocker keyword. Will verify the bug in detail later.

Verified the issue is fixed with the version below:

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-25-234632   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-25-234632
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
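The "iscsiadm: No records found (exit status 21)" error in the events above is what the kubelet reports when open-iscsi on the node has no record for the target it is asked to attach. A minimal sketch of how this can be checked from the affected node follows; the portal address and IQN are hypothetical placeholders, and the real values come from the PV definition and the iSCSI server in the actual cluster.

```sh
# Run on the node that failed to mount the volume (here ip-10-0-131-97).
# All addresses and IQNs below are placeholders.

# Which initiator name does this node present? It must be allowed by the
# target's ACL, otherwise discovery/login is rejected.
cat /etc/iscsi/initiatorname.iscsi

# Ask the portal which targets it exposes to this initiator. If the IQN
# configured in the PV is not listed here, the PV or the target ACL is wrong.
iscsiadm -m discovery -t sendtargets -p 10.0.2.15:3260

# List the node records open-iscsi has after discovery; "No records found"
# from the kubelet means this list did not contain the requested target.
iscsiadm -m node

# Show any sessions that are already logged in.
iscsiadm -m session -P 1
```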
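For reference, the Kubernetes-side misconfiguration described above lives in the `iscsi` section of the PersistentVolume spec. Below is a minimal sketch of such a PV matching the `iscsic` claim used in this report; the `targetPortal` and `iqn` values are hypothetical and must be replaced with the real target details, and the target's ACL must in turn allow the node's initiator IQN.

```sh
# Hypothetical iSCSI PV; placeholder portal and IQN values.
oc apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  iscsi:
    targetPortal: 10.0.2.15:3260          # iSCSI server address (placeholder)
    iqn: iqn.2019-02.com.example:storage  # must be the IQN exported by the target (placeholder)
    lun: 0
    fsType: ext4
    readOnly: false
EOF
```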