Bug 1845853

Summary: Pods consuming azuredisk volume are stuck in ContainerCreating status on RHEL78 node
Product: OpenShift Container Platform
Component: Storage
Version: 4.3.z
Target Release: 4.6.0
Target Milestone: ---
Hardware: Unspecified
OS: Unspecified
Status: CLOSED NOTABUG
Severity: urgent
Priority: urgent
Keywords: TestBlocker
Reporter: Qin Ping <piqin>
Assignee: Jan Safranek <jsafrane>
QA Contact: Qin Ping <piqin>
CC: aos-bugs, jsafrane, scuppett
Type: Bug
Doc Type: If docs needed, set a value
Last Closed: 2020-06-15 18:45:18 UTC
Clones: 1847085
Bug Blocks: 1847085, 1847087

Description Qin Ping 2020-06-10 09:13:17 UTC
Description of problem:
Pods consuming azuredisk volume are stuck in ContainerCreating status on RHEL78 node

Version-Release number of selected component (if applicable):
4.3.0-0.nightly-2020-06-08-205315

How reproducible:
Always

Steps to Reproduce:
1. Deploy a cluster with RHEL78 nodes only
2. Create a PVC with default storageclass
3. Create a Pod using this PVC
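For reference, a PVC/Pod pair along these lines is enough to hit it; the claim and volume names below match the events later in this report, while the pod name and image are just illustrative:

# cat <<EOF | oc create -n wduan -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mydep-pvc03
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mypod03
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi7/ubi
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: local
      mountPath: /mnt/local
  volumes:
  - name: local
    persistentVolumeClaim:
      claimName: mydep-pvc03
EOF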


Actual results:
# oc get pod -n wduan
NAME                              READY   STATUS              RESTARTS   AGE
mydeploy03-85dd4c8d4d-zvp2w       0/1     ContainerCreating   0          132m

# oc describe pod mydeploy03-85dd4c8d4d-zvp2w -n wduan
<skip>
Events:
  Type     Reason                  Age                    From                              Message
  ----     ------                  ----                   ----                              -------
  Warning  FailedScheduling        <unknown>              default-scheduler                 persistentvolumeclaim "mydep-pvc03" not found
  Warning  FailedScheduling        <unknown>              default-scheduler                 persistentvolumeclaim "mydep-pvc03" not found
  Normal   Scheduled               <unknown>              default-scheduler                 Successfully assigned wduan/mydeploy03-85dd4c8d4d-zvp2w to qe-43debug-kjqrr-rhel-2
  Normal   SuccessfulAttachVolume  133m                   attachdetach-controller           AttachVolume.Attach succeeded for volume "pvc-b757615d-39c0-4f81-96b2-4efa447de13e"
  Warning  FailedMount             56m (x8 over 122m)     kubelet, qe-43debug-kjqrr-rhel-2  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[default-token-lhwjc local]: timed out waiting for the condition
  Warning  FailedMount             17m (x39 over 131m)    kubelet, qe-43debug-kjqrr-rhel-2  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[local default-token-lhwjc]: timed out waiting for the condition
  Warning  FailedMount             2m22s (x71 over 133m)  kubelet, qe-43debug-kjqrr-rhel-2  MountVolume.WaitForAttach failed for volume "pvc-b757615d-39c0-4f81-96b2-4efa447de13e" : azureDisk - WaitForAttach failed within timeout node (qe-43debug-kjqrr-rhel-2) diskId:(qe-43debug-kjqrr-dynamic-pvc-b757615d-39c0-4f81-96b2-4efa447de13e) lun:(1)


Expected results:
Pod can be run successfully.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
1. If the pod is rescheduled to an RHCOS node, it runs successfully.
2. Checked from the Azure web console: the volume is attached to the RHEL node successfully.
3. The kubelet builds differ between the RHCOS node and the RHEL node:
RHCOS Node:
openshift-hyperkube-4.3.24-202006011358.git.1.8aed31d.el8.x86_64
hyperkube kubelet --version
Kubernetes v1.16.2+18cfcc9

RHEL78 node:
openshift-hyperkube-4.3.25-202006060952.git.1.96c30f6.el7.x86_64
hyperkube kubelet --version
Kubernetes v1.16.2+5e2117a

related kubelet log:
"Jun 10 06:12:28 qe-43debug-kjqrr-rhel-2 hyperkube[1830]: E0610 06:12:28.684097    1830 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/azure-disk//subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/qe-43debug-kjqrr-rg/providers/Microsoft.Compute/disks/qe-43debug-kjqrr-dynamic-pvc-b757615d-39c0-4f81-96b2-4efa447de13e\"" failed. No retries permitted until 2020-06-10 06:12:36.684049841 +0000 UTC m=+9062.589858658 (durationBeforeRetry 8s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-b757615d-39c0-4f81-96b2-4efa447de13e\" (UniqueName: \"kubernetes.io/azure-disk//subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/qe-43debug-kjqrr-rg/providers/Microsoft.Compute/disks/qe-43debug-kjqrr-dynamic-pvc-b757615d-39c0-4f81-96b2-4efa447de13e\") pod \"mydeploy03-85dd4c8d4d-zvp2w\" (UID: \"e0ad8d8f-8dbe-4fde-9389-1b0de8215d9a\") ""

Comment 4 Stephen Cuppett 2020-06-10 15:56:50 UTC
This is not a blocker for 4.5.0 GA. Setting target release to current development branch (4.6.0). For fixes (if any) requested/required on earlier versions, clones will be created for 4.5.z or earlier as appropriate.

Comment 5 Jan Safranek 2020-06-10 19:47:55 UTC
The root cause is that the udev rules in RHEL7 do not create symlinks directly in /dev/disk/azure, only under /dev/disk/azure/scsi<n>/.

RHEL8, with an Azure Disk attached as /dev/sdc:
/dev/disk/azure/resource -> ../../sdb
/dev/disk/azure/resource-part1 -> ../../sdb1
/dev/disk/azure/root -> ../../sda
/dev/disk/azure/root-part1 -> ../../sda1
/dev/disk/azure/root-part2 -> ../../sda2
/dev/disk/azure/root-part3 -> ../../sda3
/dev/disk/azure/root-part4 -> ../../sda4
/dev/disk/azure/scsi1
/dev/disk/azure/scsi1/lun0 -> ../../../sdc

RHEL7, with an Azure Disk attached as /dev/sdd:

/dev/disk/azure/scsi0
/dev/disk/azure/scsi0/lun0 -> ../../../sda
/dev/disk/azure/scsi0/lun0-part1 -> ../../../sda1
/dev/disk/azure/scsi0/lun0-part2 -> ../../../sda2
/dev/disk/azure/scsi0/lun0-part3 -> ../../../sda3
/dev/disk/azure/scsi0/lun0-part4 -> ../../../sda4
/dev/disk/azure/scsi0/lun1 -> ../../../sdb
/dev/disk/azure/scsi0/lun1-part1 -> ../../../sdb1
/dev/disk/azure/scsi1
/dev/disk/azure/scsi1/lun1 -> ../../../sdd

Notice the /dev/disk/azure/resource and /dev/disk/azure/root symlinks on RHEL8. On RHEL7, /dev/disk/azure itself is not populated (it contains only subdirectories), and because of that I can see this in the kubelet logs:
Jun 10 15:54:26.846816 qe-43debug-kjqrr-rhel-2 hyperkube[1716]: I0610 15:54:26.846388    1716 azure_common_linux.go:105] /dev/disk/azure is not populated, now try to parse 0:0:0:0 directly
Jun 10 15:54:26.847285 qe-43debug-kjqrr-rhel-2 hyperkube[1716]: I0610 15:54:26.846433    1716 azure_common_linux.go:105] /dev/disk/azure is not populated, now try to parse 0:0:0:1 directly

Because of that, the kubelet cannot find the attached disks.
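A minimal way to confirm this on the affected node (paths as in the listings above; LUN and device names depend on how the disk was attached):

# find /dev/disk/azure -maxdepth 1
# ls -l /dev/disk/azure/scsi*/lun*
# journalctl --no-pager | grep 'is not populated'

On RHEL7 the first command shows only the scsi<n> subdirectories, and the last one returns the kubelet messages quoted above.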

We can either fix the udev rules or backport https://github.com/kubernetes/kubernetes/pull/86612. That PR has been backported to Kubernetes 1.15-1.18; I am not sure whether it was backported to OCP as well.
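If we go the udev route, the symlinks under /dev/disk/azure are created by whatever Azure storage rules file is shipped on the node (the exact file name varies by image and agent version, so treat this only as a sketch for locating and re-triggering them):

# grep -rl 'disk/azure' /etc/udev/rules.d /usr/lib/udev/rules.d
# udevadm trigger --subsystem-match=block
# ls -l /dev/disk/azure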

Comment 6 Jan Safranek 2020-06-15 12:32:43 UTC
I checked that 4.4 - 4.6 already have the aforementioned patch and we need to fix only 4.3.

Comment 9 Jan Safranek 2020-06-15 18:45:18 UTC
According to Eric, only the 4.4 BZ is really needed.