Bug 1943496
| Summary: | [Manila CSI driver] could not mount volume in one node while other nodes work fine | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Wei Duan <wduan> |
| Component: | Storage | Assignee: | Mike Fedosin <mfedosin> |
| Storage sub component: | OpenStack CSI Drivers | QA Contact: | Jon Uriarte <juriarte> |
| Status: | CLOSED WORKSFORME | Docs Contact: | |
| Severity: | high | ||
| Priority: | medium | CC: | adduarte, aos-bugs, emacchi, mbooth, mfedosin, piqin, pprinett, tbarron |
| Version: | 4.7 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-05-06 11:22:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Wei Duan
2021-03-26 09:07:31 UTC
Hit this issue in 4.6.0-0.nightly-2021-03-25-230637 too. One of the pods is stuck in "ContainerCreating" status.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2021-03-25-230637   True        False         75m     Cluster version is 4.6.0-0.nightly-2021-03-25-230637

$ oc get pod
NAME         READY   STATUS              RESTARTS   AGE
ds-5-4jrjf   1/1     Running             0          3m27s
ds-5-6qmd5   0/1     ContainerCreating   0          3m27s
ds-5-m74xf   1/1     Running             0          3m27s

$ oc describe pod ds-5-6qmd5
<skip>
Events:
  Type     Reason       Age        From               Message
  ----     ------       ----       ----               -------
  Normal   Scheduled    35s        default-scheduler  Successfully assigned default/ds-5-6qmd5 to piqin-0326-1-nbxx6-worker-0-st7m5
  Warning  FailedMount  <invalid>  kubelet            MountVolume.SetUp failed for volume "pvc-c32cdf6e-7803-45ce-bf46-a00eb013a5f2" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning  FailedMount  <invalid>  kubelet            Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[local default-token-slcrx]: timed out waiting for the condition

The first must-gather doesn't contain the Manila CSI driver logs, so it's hard to understand what happened there. The Manila operator didn't report any errors.

The second one, from Qin Ping, has the required logs, and there is only one error message:

2021-03-26T11:50:44.707506583Z Mounting command: mount
2021-03-26T11:50:44.707506583Z Mounting arguments: -t nfs 172.16.32.1:/volumes/_nogroup/93174795-380d-4331-9437-e18de9014c86 /var/lib/kubelet/pods/cb38e6bd-2f6e-4472-aac4-a87c1d5d9297/volumes/kubernetes.io~csi/pvc-c32cdf6e-7803-45ce-bf46-a00eb013a5f2/mount
2021-03-26T11:50:44.707506583Z Output: mount.nfs: Connection timed out

Could it be a network issue? Maybe.

We tried the "mount -t nfs" command on the problematic worker node, and it returned the same error.
I checked the network config of the problematic worker node and it looks fine to me. Could this be an issue with the PSI cluster itself?
Additional info:
1. Network config for piqin-0326-txhx4-worker-0-8n2t5 node
sh-4.4# ip addr|grep 172
inet 172.16.34.116/20 brd 172.16.47.255 scope global dynamic noprefixroute ens4
172: veth65d6dca8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
sh-4.4# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         host-192-168-0- 0.0.0.0         UG    100    0        0 ens3
10.128.0.0      0.0.0.0         255.252.0.0     U     0      0        0 tun0
169.254.169.254 host-192-168-0- 255.255.255.255 UGH   100    0        0 ens3
169.254.169.254 172.16.34.1     255.255.255.255 UGH   101    0        0 ens4
172.16.32.0     0.0.0.0         255.255.240.0   U     101    0        0 ens4
172.30.0.0      0.0.0.0         255.255.0.0     U     0      0        0 tun0
192.168.0.0     0.0.0.0         255.255.192.0   U     100    0        0 ens3
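
The routing table puts the NFS server 172.16.32.1 on the directly connected 172.16.32.0/20 network on ens4, so one way to separate "no route" from "server not answering" is to probe the NFS TCP port directly from the affected node. The following is only a sketch: `probe_tcp` is a hypothetical helper, not something shipped with the Manila CSI driver, and it assumes the share is served on the standard NFS port 2049 and that `bash` and coreutils `timeout` are available on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Manila CSI driver): try to open a TCP
# connection to HOST:PORT and give up after TIMEOUT seconds (default 5).
# Returns 0 if the connection opens, non-zero otherwise; 124 means the
# attempt timed out, which matches the symptom "mount.nfs: Connection timed out".
probe_tcp() {
    host=$1
    port=$2
    secs=${3:-5}
    # bash's /dev/tcp pseudo-device opens a TCP socket; `timeout` bounds the wait.
    timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Example, run on the affected worker node (address taken from the mount log;
# port 2049 is an assumption):
#   probe_tcp 172.16.32.1 2049 && echo reachable || echo "unreachable (rc=$?)"
```

Running the same probe on a healthy worker gives a baseline: if it succeeds there and times out only on the problematic node, the difference is more likely in that node's network path than in the Manila share itself.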
I tried to reproduce it several times on PSI but I couldn't, so I think this issue was caused by an unstable environment. I'm going to close this bz now. Please reopen if the issue happens again.