Bug 1637422
| Summary: | 4.0: APP pod unable to start after target port failure in cases where single paths are mounted on APP pods(BZ#1599742) | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jan Safranek <jsafrane> |
| Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
| Status: | CLOSED ERRATA | QA Contact: | Liang Xia <lxia> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.1.0 | CC: | aos-bugs, aos-storage-staff, apanagio, bchilds, hchiramm, jhou, jsafrane, madam, nberry |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.1.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | atomic-openshift-3.11.24-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1609703 | Environment: | |
| Last Closed: | 2019-06-04 10:40:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1609703 | ||
| Bug Blocks: | 1596021, 1598740, 1609788, 1637413 | ||
Comment 1
Jan Safranek
2018-10-09 09:08:16 UTC
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-20-194410   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-20-194410
$ oc get pv iscsi ; oc get pvc iscsic ; oc get pods iscsi
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM         STORAGECLASS   REASON   AGE
iscsi   5Gi        RWO            Retain           Bound    lxia/iscsic                           15m
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
iscsic   Bound    iscsi    5Gi        RWO                           15m
NAME    READY   STATUS              RESTARTS   AGE
iscsi   0/1     ContainerCreating   0          14m
$ oc describe pod iscsi
Name:               iscsi
Namespace:          lxia
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-131-97.ap-northeast-1.compute.internal/10.0.131.97
Start Time:         Thu, 21 Feb 2019 16:59:17 +0800
Labels:             name=iscsi
Annotations:        openshift.io/scc: privileged
Status:             Pending
IP:
Containers:
  myfrontend:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/iscsi from iscsi (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-hchbw (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  iscsi:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  iscsic
    ReadOnly:   false
  default-token-hchbw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-hchbw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned lxia/iscsi to ip-10-0-131-97.ap-northeast-1.compute.internal
Normal SuccessfulAttachVolume 14m attachdetach-controller AttachVolume.Attach succeeded for volume "iscsi"
Warning FailedMount 76s (x6 over 12m) kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal Unable to mount volumes for pod "iscsi_lxia(f8515d6a-35b6-11e9-b36d-060b2a863566)": timeout expired waiting for volumes to attach or mount for pod "lxia"/"iscsi". list of unmounted volumes=[iscsi]. list of unattached volumes=[iscsi default-token-hchbw]
Warning FailedMount 15s (x15 over 14m) kubelet, ip-10-0-131-97.ap-northeast-1.compute.internal MountVolume.WaitForAttach failed for volume "iscsi" : failed to get any path for iscsi disk, last err seen:
iscsi: failed to attach disk: Error: iscsiadm: No records found
(exit status 21)
I talked to Liang on IRC and debugged this issue. It turned out to be a misconfiguration of the IQN in the PV on the Kubernetes side and of the ACLs on the target side. Heh, moving back to ON_QA to test the rest of the bug :-)

QE had a quick check; iSCSI works on OCP 4.0 now, so the testblocker keyword is removed. Will verify the bug in detail later.

Verified that the issue is fixed with the version below:
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-25-234632   True        False         5h47m   Cluster version is 4.0.0-0.nightly-2019-02-25-234632

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
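
For anyone who hits the same "iscsiadm: No records found" error, one plausible first check is whether the IQN written in the PV is actually advertised by the target. The sketch below is only an illustration, not the procedure used in this bug: `iqn_in_discovery` is a hypothetical helper, and the portal address and IQNs are made-up example values. It assumes the usual sendtargets discovery line format of `<ip>:<port>,<tpgt> <iqn>`.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or iscsiadm): succeeds if the IQN
# passed as $1 appears as the target name (second field) in sendtargets
# discovery output read from stdin.
iqn_in_discovery() {
    awk -v want="$1" '$2 == want { found = 1 } END { exit !found }'
}

# Example values only; they are assumptions, not data from this bug report.
discovery_output='192.0.2.10:3260,1 iqn.2016-04.com.example:storage.target00'
pv_iqn='iqn.2016-04.com.example:storage.target01'

if printf '%s\n' "$discovery_output" | iqn_in_discovery "$pv_iqn"; then
    echo "PV IQN is advertised by the target"
else
    echo "PV IQN is not advertised by the target"
fi
```

On a real node the discovery text would come from running a sendtargets discovery against the portal, and the PV value from the `iqn` field of the PV's iSCSI volume source. A match here says nothing about target-side ACLs, which have to be checked on the target itself.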