Bug 1467970
| Summary: | Timeouts with Azure Disk attaching volumes to pods | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Eduardo Minguez <eminguez> |
| Component: | Storage | Assignee: | Hemant Kumar <hekumar> |
| Status: | CLOSED ERRATA | QA Contact: | Wenqi He <wehe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.5.0 | CC: | aos-bugs, bchilds, mhepburn |
| Target Milestone: | --- | | |
| Target Release: | 3.5.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: |
Cause:
The mkfs arguments included the options lazy_itable_init=0,lazy_journal_init=0, which made ext4 formatting very slow on Azure.
Consequence:
If formatting the disk takes too long, the pod can remain in the Pending state or the mount operation can time out.
Fix:
The lazy_itable_init=0,lazy_journal_init=0 options were removed from the mkfs command.
Result:
Formatting an ext4 disk on Azure now takes less than half the time it did before.
|
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-31 17:00:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
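The Fix field in the Doc Text above amounts to dropping one `-E` option from the mkfs argument list. The following is a minimal sketch of that change, not the actual node code: the helper name `mkfs_ext4_args` and its `eager_init` parameter are hypothetical, and the "after" form (only `-F` plus the device) is an assumption derived from the Doc Text wording. The device path `/dev/sde` is only an example value.

```python
from typing import List

# Options named in the Doc Text. Setting both to 0 makes mkfs fully
# initialize the inode tables and the journal at format time.
EAGER_INIT_OPTS = "lazy_itable_init=0,lazy_journal_init=0"


def mkfs_ext4_args(device: str, eager_init: bool) -> List[str]:
    """Hypothetical helper: build the argument list passed to mkfs.ext4."""
    args: List[str] = []
    if eager_init:
        # Behaviour described under "Cause": slow formatting on Azure.
        args += ["-E", EAGER_INIT_OPTS]
    # -F forces formatting of the target device.
    args += ["-F", device]
    return args


before_fix = mkfs_ext4_args("/dev/sde", eager_init=True)
after_fix = mkfs_ext4_args("/dev/sde", eager_init=False)
print("before:", " ".join(before_fix))
print("after: ", " ".join(after_fix))
```

With the options removed, mkfs falls back to its default lazy initialization, so less work is done up front while the pod is waiting for the mount.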
Description
Eduardo Minguez
2017-07-05 17:07:42 UTC
Created attachment 1294674 [details]
pv
Created attachment 1294675 [details]
pvc
Can you also post the problematic PVC that is not attaching? Please add the PV and PVC info as before, along with the master-controller logs. Now that attach/detach has moved to the controller, the controller logs should have clues about the error.
Created attachment 1295006 [details]
pv & pvc
Created attachment 1295008 [details]
controllers
I found this to still be a problem for dynamic provisioning, using the ref.arch ansible installer and a deployment on Azure from a day ago, with OCP:

    oc v3.5.5.31
    kubernetes v1.5.2+43a9be4
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://ocpdemo.australiasoutheast.cloudapp.azure.com:8443
    openshift v3.5.5.31
    kubernetes v1.5.2+43a9be4

I fixed it by running:

    oc annotate $(oc get node --no-headers -o name) volumes.kubernetes.io/controller-managed-attach-detach="true"

and restarting all node processes in the cluster. I assume this ansible arg would also have sufficed:

    [OSEv3:vars]
    openshift_node_kubelet_args={'enable-controller-attach-detach': ['true'] }

Also, it took some time (~7 min) to format a 10Gi disk partition, as raised elsewhere:

    Jul 17 06:49:48 infranode3 atomic-openshift-node: I0717 06:49:48.563819 1583 mount_linux.go:364] Disk "/dev/sde" appears to be unformatted, attempting to format as type: "ext4" with options: [-E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sde]
    Jul 17 06:56:45 infranode3 atomic-openshift-node: I0717 06:56:45.753545 1583 mount_linux.go:369] Disk successfully formatted (mkfs): ext4 - /dev/sde /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/azure-disk/mounts/kubernetes-dynamic-pvc-03559b41-6abc-11e7-b9f2-000d3ae0ec80.vhd

but it seems to work OK after that. Hope that helps.

The PR for changing the mkfs arguments has been merged: https://github.com/openshift/ose/pull/799
We can expect mkfs timings to go down once a new build is released.

Tested this on the version below:

    openshift v3.5.5.31.6
    kubernetes v1.5.2+43a9be4

Created a 10Gi dynamic PVC and a pod to consume it; the pod was running within 3 minutes.

    $ oc get pvc
    NAME    STATUS   VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
    azpvc   Bound    pvc-3ee2d76b-780f-11e7-a47b-000d3a1b87c9   10Gi       RWO           sc-qb6s8       1m
    $ oc get pods
    NAME      READY   STATUS    RESTARTS   AGE
    azpvcpo   1/1     Running   0          3m

Thanks.
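The format time reported in the comments can be read directly off the two mount_linux.go log lines quoted above. The sketch below shows one way to compute it from the syslog prefixes. The function names are hypothetical, the log lines are shortened copies of the ones in this report, and the year 2017 is an assumption taken from the bug's dates, since syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Shortened copies of the two node log lines quoted in this report.
LOG_LINES = [
    'Jul 17 06:49:48 infranode3 atomic-openshift-node: I0717 06:49:48.563819 '
    '1583 mount_linux.go:364] Disk "/dev/sde" appears to be unformatted',
    'Jul 17 06:56:45 infranode3 atomic-openshift-node: I0717 06:56:45.753545 '
    '1583 mount_linux.go:369] Disk successfully formatted (mkfs): ext4 - /dev/sde',
]

# Assumption: syslog omits the year, so it is supplied from the bug's dates.
ASSUMED_YEAR = 2017

# Matches the leading syslog timestamp, e.g. "Jul 17 06:49:48".
STAMP = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ")


def syslog_time(line: str) -> datetime:
    """Parse the syslog timestamp at the start of a log line."""
    match = STAMP.match(line)
    if match is None:
        raise ValueError(f"no syslog timestamp in line: {line!r}")
    return datetime.strptime(
        f"{ASSUMED_YEAR} {match.group(1)}", "%Y %b %d %H:%M:%S"
    )


def format_duration_seconds(start_line: str, end_line: str) -> float:
    """Seconds elapsed between the 'unformatted' and 'formatted' lines."""
    return (syslog_time(end_line) - syslog_time(start_line)).total_seconds()


elapsed = format_duration_seconds(LOG_LINES[0], LOG_LINES[1])
minutes, seconds = divmod(int(elapsed), 60)
print(f"mkfs took {minutes} min {seconds} s")  # prints "mkfs took 6 min 57 s"
```

The result agrees with the roughly 7 minutes mentioned in the comment. Running the same calculation on node logs from a fixed build would be one way to confirm the improvement on a given cluster.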
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:1828