Bug 1438474 - Cannot attach Azure Disk
Summary: Cannot attach Azure Disk
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 3.4.z
Assignee: hchen
QA Contact: Wenqi He
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-04-03 13:29 UTC by Vladislav Walek
Modified: 2017-04-19 19:43 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1440892
Environment:
Last Closed: 2017-04-19 19:43:17 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHBA-2017:0989 (SHIPPED_LIVE): OpenShift Container Platform 3.4, 3.3, 3.2, and 3.1 bug fix update (last updated 2017-04-19 23:42:19 UTC)

Description Vladislav Walek 2017-04-03 13:29:23 UTC
Description of problem:

The customer is trying to attach an Azure VHD disk to OpenShift as storage for the Docker registry. Unfortunately, OpenShift shows the following errors:

Mar 30 14:12:47 <node> atomic-openshift-node[2994]: E0330 14:12:47.066793    2994 kubelet_node_status.go:69] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: compute.VirtualMachinesClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<client_id>' with object id '<client_id>' does not have authorization to perform action 'Microsoft.Compute/virtualMachines/read' over scope '/subscriptions/<subscription_id>/resourceGroups/<LOCATION_ID>-SVT/providers/Microsoft.Compute/virtualMachines/<node>'."
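
The 403 indicates that the service principal the node uses for the Azure cloud provider (typically the aadClientId in the node's Azure cloud-provider configuration file) has no read access to the VMs in that resource group. As a rough sketch only, not verified against this customer's environment, granting that principal a role containing Microsoft.Compute/virtualMachines/read (for example Reader, or Contributor so that disk attach/detach also works) with the Azure CLI would look like the following; <client_id>, <subscription_id> and <LOCATION_ID>-SVT are the placeholders from the log above:

$ az role assignment create \
    --assignee <client_id> \
    --role Contributor \
    --scope /subscriptions/<subscription_id>/resourceGroups/<LOCATION_ID>-SVT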

This error later disappears and the following error is shown instead:

Mar 30 15:49:13 <node> atomic-openshift-node[15682]: E0330 15:49:13.665327   15682 nestedpendingoperations.go:253] Operation for "\"kubernetes.io/azure-disk/dockerregistry01\"" failed. No retries permitted until 2017-03-30 15:51:13.665304822 +0000 UTC (durationBeforeRetry 2m0s). Error: recovered from panic "runtime error: invalid memory address or nil pointer dereference". (err=<nil>) Call stack: 

The latter error still occurs.

Version-Release number of selected component (if applicable):

OpenShift Container Platform 3.4.0

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 10 Wenqi He 2017-04-07 07:25:51 UTC
Not quite sure about the panic issue, but I found that if a pod is being created with an invalid disk, then creating a pod with a valid disk always fails.

$ oc version
openshift v3.4.1.15
kubernetes v1.4.0+776c994


$ oc get pods 
NAME      READY     STATUS              RESTARTS   AGE
azcaro    0/1       ContainerCreating   0          19m
azrarw    0/1       ContainerCreating   0          20m
azrwro    0/1       ContainerCreating   0          20m

Here azrwro and azrarw use invalid disks, while azcaro uses a valid one.
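
For reference, the pods above mount the disk through an azureDisk volume. A minimal manifest sketch is below; the pod name, image, disk name and storage account are placeholders, not the ones used in this test. An "invalid disk" in the sense above is a diskName/diskURI that does not point to an existing VHD blob.

$ oc create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: azdisk-example
spec:
  containers:
  - name: tester
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: azure
      mountPath: /mnt/azure
  volumes:
  - name: azure
    azureDisk:
      # diskName/diskURI must reference an existing VHD blob in the
      # storage account; otherwise this is an "invalid disk".
      diskName: <disk_name>.vhd
      diskURI: https://<storage_account>.blob.core.windows.net/vhds/<disk_name>.vhd
      cachingMode: ReadOnly
      fsType: ext4
      readOnly: false
EOF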

Comment 14 Wenqi He 2017-04-07 07:33:45 UTC
After deleting the pods with invalid disks and creating a new pod with a valid one, the new pod runs (a command sketch follows the session below):

$ oc get pods
NAME      READY     STATUS    RESTARTS   AGE
azcaro    1/1       Running   0          4m

$ oc exec -it azcaro sh
/ $ ls /mnt/azure/
20170309    20170310    20170313    lost+found
/ $ exit
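
The workaround above amounts to removing the pods that reference the invalid disks and then recreating the pod that uses the valid disk. Roughly, with the pod names from comment 10 and a placeholder manifest file name:

$ oc delete pod azrarw azrwro        # pods referencing invalid disks
$ oc delete pod azcaro               # stuck pod from comment 10
$ oc create -f azcaro-pod.yaml       # recreate the pod that uses the valid disk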

Comment 20 Wenqi He 2017-04-11 01:39:23 UTC
Based on comment 15 and the change described in comment 14, this bug is fixed. Thanks.

Comment 22 errata-xmlrpc 2017-04-19 19:43:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0989

