Bug 1776773
Summary: [Vsphere][4.3] Volume cannot mount to node after upgrade and in the from-scratch case

| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Product | OpenShift Container Platform | Reporter | Wei Duan <wduan> |
| Component | Machine Config Operator | Assignee | Erica von Buelow <evb> |
| Status | CLOSED ERRATA | QA Contact | Michael Nguyen <mnguyen> |
| Severity | urgent | Priority | urgent |
| Version | 4.3.0 | Target Release | 4.3.0 |
| Keywords | TestBlocker | Type | Bug |
| Hardware | Unspecified | OS | Unspecified |
| Clones | 1777082 (view as bug list) | Bug Depends On | 1777082 |
| Last Closed | 2020-01-23 11:14:34 UTC | CC | aos-bugs, bbennett, chaoyang, evb, hekumar, jcallen, kgarriso, lxia, wduan |
Description (Wei Duan, 2019-11-26 11:01:49 UTC)
Some more info about one of the PVs/volumes:

```
$ oc get pv pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902   1Gi        RWO            Delete           Bound    wduan/sc-resourcegroup-04   thin                    166m

$ oc get pv pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902 -o yaml | grep volumePath
    volumePath: '[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk'
```

In the vSphere events, the disk was attached to the VM:

```
Reconfigured compute-3 on vsphere-qe.vmware.devcluster.openshift.com in dc1.
Modified: config.hardware.device(1000).device: (2000, 2002, 2001) -> (2000, 2002, 2001, 2003);
Added: config.hardware.device(2003): (key = 2003,
  deviceInfo = (label = "Hard disk 4", summary = "1,048,576 KB"),
  backing = (fileName = "ds:///vmfs/volumes/5c9ce559-d9430ec0-e8d5-506b4bb49f6a/kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk",
    datastore = 'vim.Datastore:c95eb2db-783e-4b6a-b867-01da64d6716e:datastore-266',
    backingObjectId = "", diskMode = "independent_persistent", split = false,
    writeThrough = false, thinProvisioned = true, eagerlyScrub = <unset>,
    uuid = "6000C299-4015-7f24-5b5a-7a735746b2d5",
    contentId = "6fee0501c61825b198c93044fffffffe", changeId = <unset>, parent = null,
    deltaDiskFormat = <unset>, digestEnabled = false, deltaGrainSize = <unset>,
    deltaDiskFormatVariant = <unset>, sharing = "sharingNone", keyId = null),
  connectable = null, slotInfo = null, controllerKey = 1000, unitNumber = 3,
  capacityInKB = 1048576, capacityInBytes = 1073741824,
  shares = (shares = 1000, level = "normal"),
  storageIOAllocation = (limit = -1, shares = (shares = 1000, level = "normal"), reservation = 0),
  diskObjectId = "3889-2003", vFlashCacheConfigInfo = null, iofilter = <unset>, vDiskId = null);
config.extraConfig("scsi0:3.redo"): (key = "scsi0:3.redo", value = "");
Deleted:
```

The pod is stuck in ContainerCreating on compute-3, and the node reports the volume as attached and in use:

```
$ oc get pod -n wduan
NAME                  READY   STATUS              RESTARTS   AGE
sc-resourcegroup-04   0/1     ContainerCreating   0          170m

$ oc get pod sc-resourcegroup-04 -n wduan -o yaml | grep nodeName
  nodeName: compute-3

$ oc get node compute-3 -o yaml | tail -11
  volumesAttached:
  - devicePath: /dev/disk/by-id/wwn-0x6000c2994512ce66e7773f4366e9bb8a
    name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-1ae9daea-101c-11ea-91dc-0050568b94af.vmdk
  - devicePath: /dev/disk/by-id/wwn-0x6000c29ac292917e9d030724cb6b45e4
    name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-1aec1a38-101c-11ea-91dc-0050568b94af.vmdk
  - devicePath: /dev/disk/by-id/wwn-0x6000c29940157f245b5a7a735746b2d5
    name: kubernetes.io/vsphere-volume/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk
  volumesInUse:
  - kubernetes.io/vsphere-volume/1bc67be5-5c6f-44e7-ae96-73bc93fc198c-pvc-1aec1a38-101c-11ea-91dc-0050568b94af
  - kubernetes.io/vsphere-volume/d6e9b986-807b-4b44-b077-50a8d1103255-pvc-1ae9daea-101c-11ea-91dc-0050568b94af
  - kubernetes.io/vsphere-volume/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
```
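A note on the devicePath entries above: they are derived from the disk UUID that vCenter reports for the backing VMDK (uuid = "6000C299-4015-7f24-5b5a-7a735746b2d5" in the reconfigure event), which udev is expected to publish as a /dev/disk/by-id/wwn-0x... symlink on the node. A minimal sketch of that cross-check, assuming only standard coreutils on the node (variable names are illustrative):

```bash
# Illustrative cross-check: derive the by-id link name from the vCenter disk
# uuid (strip dashes, lowercase), then see whether udev actually created it.
UUID='6000C299-4015-7f24-5b5a-7a735746b2d5'   # from the vSphere event above
WWN="wwn-0x$(echo "$UUID" | tr -d '-' | tr '[:upper:]' '[:lower:]')"
ls -l "/dev/disk/by-id/$WWN" \
  || echo "disk attached but by-id link missing on the node"
```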
On the node compute-3 the disk is present, and its SCSI serial matches the disk UUID above, but nothing is mounted from it and it has no filesystem UUID entry:

```
sh-4.4# /usr/lib/udev/scsi_id -g -u -d /dev/sdd
36000c29940157f245b5a7a735746b2d5
sh-4.4# mount | grep sdd
sh-4.4# df -h | grep sdd
sh-4.4# ls -lh /dev/sdd
brw-rw----. 1 root disk 8, 48 Nov 26 10:00 /dev/sdd
sh-4.4# fdisk -l /dev/sdd
Disk /dev/sdd: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
sh-4.4# ls -lh /dev/disk/by-uuid/
total 0
lrwxrwxrwx. 1 root root 10 Nov 26 08:02 477c3d77-20c6-4ff3-8bb3-dc2543eedfbd -> ../../sda3
lrwxrwxrwx. 1 root root  9 Nov 26 09:58 544b95be-c9f0-4fc9-92f3-3942ea0fc81d -> ../../sdb
lrwxrwxrwx. 1 root root 10 Nov 26 08:03 91de875e-af22-4585-91cb-e74437f6af68 -> ../../sda2
lrwxrwxrwx. 1 root root  9 Nov 26 08:04 d9ebc209-83dc-4bac-88dd-cb3eaeabbdda -> ../../sdc
```

For the pod's UID, only the secret volume is mounted; the vSphere volume is not:

```
$ oc get pods -n wduan -o yaml | grep -w uid
    uid: e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
sh-4.4# mount | grep e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
tmpfs on /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~secret/default-token-5vkxf type tmpfs (rw,relatime,seclabel)
sh-4.4# df -h | grep e36bc1da-d27b-4caf-ae1b-e4fbfeec0808
tmpfs  3.9G  24K  3.9G  1%  /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~secret/default-token-5vkxf
```

The pod description shows the repeating mount failure:

```
$ oc describe pod -n wduan
Name:               sc-resourcegroup-04
Namespace:          wduan
Priority:           0
PriorityClassName:  <none>
Node:               compute-3/139.178.76.25
Start Time:         Tue, 26 Nov 2019 18:00:10 +0800
Labels:             name=frontendhttp
Annotations:        openshift.io/scc: anyuid
Status:             Pending
IP:
Containers:
  myfrontend:
    Container ID:
    Image:          docker.io/aosqe/hello-openshift
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt/local from local (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5vkxf (ro)
Conditions:
  Type             Status
  Initialized      True
  Ready            False
  ContainersReady  False
  PodScheduled     True
Volumes:
  local:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  sc-resourcegroup-04
    ReadOnly:   false
  default-token-5vkxf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5vkxf
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                     From                Message
  ----     ------       ----                    ----                -------
  Warning  FailedMount  29m (x10 over 151m)     kubelet, compute-3  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[default-token-5vkxf local]: timed out waiting for the condition
  Warning  FailedMount  9m48s (x92 over 3h2m)   kubelet, compute-3  (combined from similar events): MountVolume.SetUp failed for volume "pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902 --scope -- mount -o bind /var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902
Output: Running scope as unit: run-rd4c939ccc1564752b1995d2c19c3affb.scope
mount: /var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902: special device /var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk does not exist.
  Warning  FailedMount  4m9s (x61 over 3h2m)    kubelet, compute-3  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[local default-token-5vkxf]: timed out waiting for the condition
```
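For context on the failure mode: kubelet mounts an in-tree vSphere volume in two steps, first mounting the block device at a global path under /var/lib/kubelet/plugins/..., then bind-mounting that global path into each pod's volume directory. The "special device ... does not exist" error means the first step never completed, so the bind source is absent. A minimal sketch of the two steps using the exact paths from the events above (an illustration of the sequence, not kubelet's actual code; do not run against a healthy node):

```bash
# Device, global mount path, and pod volume path from the events above.
DEV=/dev/disk/by-id/wwn-0x6000c29940157f245b5a7a735746b2d5
GLOBAL='/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[nvme-ds1] kubevols/qe-minmli-428-xzwsj-dynamic-pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902.vmdk'
POD='/var/lib/kubelet/pods/e36bc1da-d27b-4caf-ae1b-e4fbfeec0808/volumes/kubernetes.io~vsphere-volume/pvc-91bd1cbc-9e92-41d0-ae14-960084f3d902'

# Step 1: mount the attached device at the plugin's global path.
# In this bug, this step never ran, which is why the bind source is missing.
mkdir -p "$GLOBAL"
mount "$DEV" "$GLOBAL"

# Step 2: bind-mount the global path into the pod. This is the command from
# the FailedMount event, failing with "special device ... does not exist".
mount -o bind "$GLOBAL" "$POD"
```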
*** Bug 1775685 has been marked as a duplicate of this bug. ***

*** Bug 1777195 has been marked as a duplicate of this bug. ***

For an update: some CI issues were blocking the merge of this PR. That problem is now resolved, and the fix should merge into 4.3 within a couple of hours: https://github.com/openshift/machine-config-operator/pull/1293

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
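Since the fix shipped through the machine-config-operator, a node only picks it up once the Machine Config Operator has rolled out the new rendered config. A quick way to confirm a node has converged, using the standard MCO pool status and node annotations (nothing specific to this fix):

```bash
# Pool shows UPDATED=True once every worker runs the new rendered config.
oc get machineconfigpool worker

# Per-node check: current and desired rendered-config annotations should match.
oc get node compute-3 -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig}{"\n"}'
oc get node compute-3 -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig}{"\n"}'
```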