Bug 1776773
| Summary: | [Vsphere][4.3] Volume cannot mount to node after upgrade and in the from-scratch case | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Wei Duan <wduan> |
| Component: | Machine Config Operator | Assignee: | Erica von Buelow <evb> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 4.3.0 | CC: | aos-bugs, bbennett, chaoyang, evb, hekumar, jcallen, kgarriso, lxia, wduan |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | 4.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1777082 (view as bug list) | Environment: | |
| Last Closed: | 2020-01-23 11:14:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1777082 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Description (Wei Duan, 2019-11-26 11:01:49 UTC)
Some more information about one of the PVs/volumes:
$ oc get pv pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902 1Gi RWO Delete Bound wduan/sc-resourcegroup-04 thin 166m
$ oc get pv pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902 -o yaml | grep volumePath
volumePath: '[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk'
In the vSphere events:
Reconfigured compute-3 on vsphere-qe.vmware.devcluster.openshift.com in dc1. Modified: config.hardware.device(1000).device: (2000, 2002, 2001) -> (2000, 2002, 2001, 2003); Added: config.hardware.device(2003): (key = 2003, deviceInfo = (label = "Hard disk 4", summary = "1,048,576 KB"), backing = (fileName = "ds:///vmfs/volumes/5c9ce559-d9430ec0-e8d5-506b4bb49f6a/kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk", datastore = 'vim.Datastore:c95eb2db-783e-4b6a-b867-01da64d6716e:datastore-266', backingObjectId = "", diskMode = "independent_persistent", split = false, writeThrough = false, thinProvisioned = true, eagerlyScrub = <unset>, uuid = "6000C299-4015-7f24-5b5a-7a735746b2d5", contentId = "6fee0501c61825b198c93044fffffffe", changeId = <unset>, parent = null, deltaDiskFormat = <unset>, digestEnabled = false, deltaGrainSize = <unset>, deltaDiskFormatVariant = <unset>, sharing = "sharingNone", keyId = null), connectable = null, slotInfo = null, controllerKey = 1000, unitNumber = 3, capacityInKB = 1048576, capacityInBytes = 1073741824, shares = (shares = 1000, level = "normal"), storageIOAllocation = (limit = -1, shares = (shares = 1000, level = "normal"), reservation = 0), diskObjectId = "3889-2003", vFlashCacheConfigInfo = null, iofilter = <unset>, vDiskId = null); config.extraConfig("scsi0:3.redo"): (key = "scsi0:3.redo", value = ""); Deleted:
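The event above records the new disk's backing uuid (6000C299-4015-7f24-5b5a-7a735746b2d5), and the devicePath later reported on the node (see volumesAttached below) is exactly that uuid lowercased, with the dashes stripped and a wwn-0x prefix added. A minimal sketch of that mapping, useful for matching vSphere events to node device paths (run in a shell on the node):

```
# Sketch: map the vSphere disk backing uuid from the event above to the
# wwn-style /dev/disk/by-id name that shows up in volumesAttached.
UUID="6000C299-4015-7f24-5b5a-7a735746b2d5"
WWN="wwn-0x$(echo "$UUID" | tr -d '-' | tr '[:upper:]' '[:lower:]')"
echo "$WWN"                    # wwn-0x6000c29940157f245b5a7a735746b2d5
ls -l "/dev/disk/by-id/$WWN"   # should resolve to the attached disk (e.g. /dev/sdd)
```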
$ oc get pod -n wduan
NAME READY STATUS RESTARTS AGE
sc-resourcegroup-04 0/1 ContainerCreating 0 170m
$ oc get pod sc-resourcegroup-04 -n wduan -o yaml | grep nodeName
nodeName: compute-3
$ oc get node compute-3 -o yaml | tail -11
volumesAttached:
- devicePath: /dev/disk/by-id/wwn-0x6000c2994512ce66e7773f4366e9bb8a
name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-1ae9daea-101c-11ea-91dc-0050568b94af.vmdk
- devicePath: /dev/disk/by-id/wwn-0x6000c29ac292917e9d030724cb6b45e4
name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-1aec1a38-101c-11ea-91dc-0050568b94af.vmdk
- devicePath: /dev/disk/by-id/wwn-0x6000c29940157f245b5a7a735746b2d5
name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk
volumesInUse:
- kubernetes.io/vsphere-volume/1bc67be5-5c6f-44e7-ae96-73bc93fc198c-pvc-1aec1a38-101c-11ea-91dc-0050568b94af
- kubernetes.io/vsphere-volume/d6e9b986-807b-4b44-b077-50a8d1103255-pvc-1ae9daea-101c-11ea-91dc-0050568b94af
- kubernetes.io/vsphere-volume/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
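The node object claims all three volumes are attached. A quick way to check whether each reported devicePath actually exists is a loop like the following (a diagnostic sketch; run in a debug shell on compute-3):

```
# Sketch: verify every devicePath from volumesAttached resolves on the node.
# A missing by-id symlink would stall mounting even though the vSphere
# attach itself succeeded.
for dev in \
  /dev/disk/by-id/wwn-0x6000c2994512ce66e7773f4366e9bb8a \
  /dev/disk/by-id/wwn-0x6000c29ac292917e9d030724cb6b45e4 \
  /dev/disk/by-id/wwn-0x6000c29940157f245b5a7a735746b2d5; do
  if [ -e "$dev" ]; then
    echo "OK      $dev -> $(readlink -f "$dev")"
  else
    echo "MISSING $dev"
  fi
done
```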
On the node compute-3:
sh-4.4# /usr/lib/udev/scsi_id -g -u -d /dev/sdd
36000c29940157f245b5a7a735746b2d5
sh-4.4# mount | grep sdd
sh-4.4# df -h | grep sdd
sh-4.4# ls -lh /dev/sdd
brw-rw----. 1 root disk 8, 48 Nov 26 10:00 /dev/sdd
sh-4.4# fdisk -l /dev/sdd
Disk /dev/sdd: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
sh-4.4# ls -lh /dev/disk/by-uuid/
total 0
lrwxrwxrwx. 1 root root 10 Nov 26 08:02 477c3d77-20c6-4ff3-8bb3-dc2543eedfbd -> ../../sda3
lrwxrwxrwx. 1 root root 9 Nov 26 09:58 544b95be-c9f0-4fc9-92f3-3942ea0fc81d -> ../../sdb
lrwxrwxrwx. 1 root root 10 Nov 26 08:03 91de875e-af22-4585-91cb-e74437f6af68 -> ../../sda2
lrwxrwxrwx. 1 root root 9 Nov 26 08:04 d9ebc209-83dc-4bac-88dd-cb3eaeabbdda -> ../../sdc
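Note that /dev/disk/by-uuid only lists block devices that carry a filesystem, so a never-formatted volume such as sdd is expected to be absent there. The more telling check is /dev/disk/by-id, since that is where the node's devicePath points (a hedged sketch; symlink naming can vary between udev builds):

```
# Sketch: check whether udev created the wwn-style by-id symlink for sdd.
# scsi_id printed 36000c299... above; the leading "3" is the identifier-type
# digit scsi_id prepends for NAA ids, while the by-id link uses the bare id
# with a wwn-0x prefix.
ls -l /dev/disk/by-id/ | grep -i 6000c29940157f245b5a7a735746b2d5 \
  || echo "no by-id symlink for sdd"
# Ask udev which symlinks it would assign to the device:
udevadm info --query=symlink --name=/dev/sdd
```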
$ oc get pods -n wduan -o yaml | grep -w uid
uid: e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
sh-4.4# mount | grep e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
tmpfs on /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~secret/default-token-5vkxf type tmpfs (rw,relatime,seclabel)
sh-4.4# df -h | grep e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
tmpfs 3.9G 24K 3.9G 1% /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~secret/default-token-5vkxf
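So only the pod's secret tmpfs is mounted; there is no trace of the PV itself. The other thing worth inspecting on the node is the plugin's global mount directory, which the failing bind mount in the events below uses as its source (a sketch; note the datastore component "[nvme-ds1] kubevols/..." contains a literal space, so the path must be quoted):

```
# Sketch: inspect the vsphere-volume global mount directory that the
# bind mount for the pod would use as its source.
ls -l "/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/"
ls -l "/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk"
```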
$ oc describe pod -n wduan
Name: sc-resourcegroup-04
Namespace: wduan
Priority: 0
PriorityClassName: <none>
Node: compute-3/139.178.76.25
Start Time: Tue, 26 Nov 2019 18:00:10 +0800
Labels: name=frontendhttp
Annotations: openshift.io/scc: anyuid
Status: Pending
IP:
Containers:
myfrontend:
Container ID:
Image: docker.io/aosqe/hello-openshift
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/mnt/local from local (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5vkxf (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
local:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: sc-resourcegroup-04
ReadOnly: false
default-token-5vkxf:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5vkxf
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 29m (x10 over 151m) kubelet, compute-3 Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[default-token-5vkxf local]: timed out waiting for the condition
Warning FailedMount 9m48s (x92 over 3h2m) kubelet, compute-3 (combined from similar events): MountVolume.SetUp failed for volume "pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902 --scope -- mount -o bind /var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
Output: Running scope as unit: run-rd4c939ccc1564752b1995d2c19c3affb.scope
mount: /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902: special device /var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk does not exist.
Warning FailedMount 4m9s (x61 over 3h2m) kubelet, compute-3 Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[local default-token-5vkxf]: timed out waiting for the condition
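The error says the bind-mount source (the global mount point) does not exist, i.e. the device-level mount for the PV was never set up even though the disk is attached and visible as /dev/sdd. A hedged sketch to separate the two failure modes by hand, using the paths and device identified above:

```
# Sketch: replay the failing steps manually to see which one breaks.
SRC="/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk"
# 1) Does the global mount point exist at all?
stat "$SRC" 2>/dev/null || echo "global mount dir missing: device mount never completed"
# 2) Is the attached disk itself usable?
blkid /dev/sdd || echo "no filesystem yet: kubelet would format it on first device mount"
```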
*** Bug 1775685 has been marked as a duplicate of this bug. ***

*** Bug 1777195 has been marked as a duplicate of this bug. ***

For an update: some CI issues were blocking the merge of this PR. That problem has been resolved, and the fix should be merged into 4.3 within a couple of hours: https://github.com/openshift/machine-config-operator/pull/1293

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062