Bug 1564976

Summary: No write permission in a directory mounted as a PVC with Azure File
Product: OpenShift Container Platform
Reporter: Takayoshi Tanaka <tatanaka>
Component: Storage
Assignee: hchen
Status: CLOSED ERRATA
QA Contact: Wenqi He <wehe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.7.1
CC: ansverma, aos-bugs, aos-storage-staff, bchilds, bleanhar, joe.madden, jupierce
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-18 20:53:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Takayoshi Tanaka 2018-04-09 05:26:23 UTC
Description of problem:
No write permission in a directory mounted as a PVC with Azure File

Version-Release number of selected component (if applicable):
openshift v3.7.42
kubernetes v1.7.6+a08f5eeb62

(It doesn't happen in v3.7.23)

How reproducible:
Always

Steps to Reproduce:
1. Install OpenShift 3.7.42 or upgrade to it.
2. Run a pod that mounts a PVC backed by Azure File.

Actual results:
The pod fails to write a file.

Expected results:
The pod can write a file.

In the Pod (PVC is mounted at /data2):
sh-4.2$ ls -al /data2/
total 0
sh-4.2$ touch /data2/ocp37
touch: cannot touch '/data2/ocp37': Permission denied
sh-4.2$ mkdir /data2/ocp37
mkdir: cannot create directory '/data2/ocp37': Permission denied

In the node:
//xxx.file.core.windows.net/ocp-filetest-fileshare01 on /var/lib/origin/openshift.local.volumes/pods/uuid/volumes/kubernetes.io~azure-file/azure-file-pv01 type cifs (rw,relatime,vers=3.0,sec=ntlmssp,cache=strict,username=user,domain=X,uid=0,noforceuid,gid=0,noforcegid,addr=x.x.x.x,file_mode=0755,dir_mode=0755,persistenthandles,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)


PV Dump:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: 2018-04-09T05:02:04Z
  name: azure-file-pv01
  resourceVersion: "15904"
  selfLink: /api/v1/persistentvolumes/azure-file-pv01
  uid: 25721c83-3bb3-11e8-a924-000d3a929c29
spec:
  accessModes:
  - ReadWriteMany
  azureFile:
    secretName: azure-secret
    shareName: ocp-filetest-fileshare01
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvcfile02
    namespace: test
    resourceVersion: "15902"
    uid: 37a4981d-3bb3-11e8-a924-000d3a929c29
  persistentVolumeReclaimPolicy: Retain
status:
  phase: Bound


PVC Dump:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: 2018-04-09T05:02:35Z
  name: pvcfile02
  namespace: test
  resourceVersion: "15906"
  selfLink: /api/v1/namespaces/test/persistentvolumeclaims/pvcfile02
  uid: 37a4981d-3bb3-11e8-a924-000d3a929c29
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  volumeName: azure-file-pv01
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Gi
  phase: Bound

Additional info:
The same problem also occurs in OpenShift 3.9, but there it can be worked around by setting mountOptions in the PV or StorageClass.
In OpenShift 3.7, the dir_mode and file_mode mount options are hard-coded.
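
As a rough illustration of the 3.9 workaround mentioned above, the sketch below builds a PersistentVolume manifest that carries a mountOptions list and prints it as JSON. The PV name, share name, secret name, and size are taken from the PV dump in this report. The helper name azure_file_pv and the option values dir_mode=0777 and file_mode=0777 are assumptions chosen for illustration, not values confirmed in this bug.

```python
import json


def azure_file_pv(name, share_name, secret_name, size, mount_options):
    """Build a PersistentVolume manifest (as a dict) for an Azure File share.

    Hypothetical helper for illustration; it only assembles the manifest
    and does not talk to a cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "capacity": {"storage": size},
            "persistentVolumeReclaimPolicy": "Retain",
            "azureFile": {"secretName": secret_name, "shareName": share_name},
            # Per this report, mountOptions is usable from OpenShift 3.9 on.
            "mountOptions": list(mount_options),
        },
    }


# Assumed option values: world-writable modes, chosen only as an example.
pv = azure_file_pv(
    name="azure-file-pv01",
    share_name="ocp-filetest-fileshare01",
    secret_name="azure-secret",
    size="5Gi",
    mount_options=["dir_mode=0777", "file_mode=0777"],
)

print(json.dumps(pv, indent=2))
```

The printed JSON could then be passed to `oc create -f -` on a 3.9 cluster. Whether 0777 is appropriate depends on who else can reach the share, so treat the modes as a placeholder.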

Comment 2 Wenqi He 2018-04-09 05:35:11 UTC
I reported a similar bug before; please check my comment at https://bugzilla.redhat.com/show_bug.cgi?id=1543229#c4

Comment 3 Takayoshi Tanaka 2018-04-09 05:48:20 UTC
@Wenqi

Thank you for the comment. However, I have two concerns.

- The issue is fixed only in OpenShift 3.9. Do you have an idea for a workaround in OpenShift 3.7.42?
I'm afraid we need to fix the issue in v3.7.42 as well.

- The issue was fixed by introducing fsGroup. Since user action seems to be required when upgrading from 3.7 to 3.9, do we need documentation?
https://github.com/openshift/origin/pull/18526

Also, I'm writing a KCS article about the Azure File permission issue.

Comment 5 Wenqi He 2018-04-09 05:55:37 UTC
(In reply to Takayoshi Tanaka from comment #3)
> @Wenqi
> 
> Thank you for the comment. However, I have two concerns.
> 
> - The issue is fixed only in OpenShift 3.9. Do you have an idea for a
> workaround in OpenShift 3.7.42?
> I'm afraid we need to fix the issue in v3.7.42 as well.


Could you please try my solution in your 3.7 environment to check whether it works? If it does not, I think we need to backport some PRs to resolve this issue. Thanks.

Comment 6 Takayoshi Tanaka 2018-04-09 06:29:20 UTC
"mountOptions" is not available at 3.7 because it's introduced at kubernetes 1.9 (and OpenShift 3.9).

Comment 7 hchen 2018-04-10 15:20:22 UTC
The regression was caused by [1], and the file/dir mode was (partially) reverted to 0755 in [2]; both changes were backported to 3.7.42. However, there is no upstream consensus to go back to 0777.


1. https://github.com/kubernetes/kubernetes/pull/48460
2. https://github.com/kubernetes/kubernetes/pull/56551

Comment 9 hchen 2018-04-24 15:11:28 UTC
@Anshul, 3.6 doesn't have this issue. In 3.6 the file and dir modes are 0777.

Comment 12 hchen 2018-05-03 16:10:36 UTC
The file/dir mode regression happens if the pod's uid/gid are not the same as those of the Azure File mount. Customers can either upgrade to 3.9 and use mount options, or stay at 3.7.23 (the version before the regression).
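
To make the uid/gid reasoning concrete, here is a minimal sketch that applies the usual owner/group/other permission check to the mount options shown in the description (uid=0, gid=0, dir_mode=0755). The function names and the pod uid 1000110000 are hypothetical, and the check is a simplified model that ignores capabilities and server-side ACLs.

```python
def parse_mount_options(option_string):
    """Split a comma-separated mount option string into a dict."""
    options = {}
    for item in option_string.split(","):
        key, _, value = item.partition("=")
        options[key] = value
    return options


def can_write_dir(options, uid, gids):
    """Simplified owner/group/other write check against dir_mode."""
    mode = int(options["dir_mode"], 8)
    if uid == int(options["uid"]):
        return bool(mode & 0o200)  # owner write bit
    if int(options["gid"]) in gids:
        return bool(mode & 0o020)  # group write bit
    return bool(mode & 0o002)      # other write bit


# Subset of the options reported on the node in the description.
reported = parse_mount_options(
    "rw,relatime,vers=3.0,uid=0,noforceuid,gid=0,noforcegid,"
    "file_mode=0755,dir_mode=0755"
)
# Same mount, but with the 0777 modes that comment 9 attributes to 3.6.
permissive = dict(reported, file_mode="0777", dir_mode="0777")

pod_uid = 1000110000  # hypothetical non-root uid assigned to the pod
pod_gids = {0}        # assumption: the pod process is in the root group

print(can_write_dir(reported, pod_uid, pod_gids))    # False
print(can_write_dir(permissive, pod_uid, pod_gids))  # True
print(can_write_dir(reported, 0, {0}))               # True
```

Under these assumptions, a non-root pod gets only the group bits r-x from 0755 and cannot create files, which matches the "Permission denied" errors in the description.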

Comment 13 hchen 2018-05-03 17:01:01 UTC
The 3.7 fix is at https://github.com/openshift/ose/pull/1244

Comment 14 hchen 2018-05-03 17:36:30 UTC
Backport merged.

Comment 18 Wenqi He 2018-05-17 10:26:30 UTC
I have successfully installed OCP manually with the latest 3.7, at the version below:
openshift v3.7.48
kubernetes v1.7.6+a08f5eeb62

This bug is fixed in 3.7:
$ oc get pods
NAME      READY     STATUS    RESTARTS   AGE
azfpod    1/1       Running   0          9m
$ oc exec -it azfpod sh
/ $ ls /mnt/azure/
/ $ touch /mnt/azure/wehe
/ $ ls /mnt/azure/
wehe
/ $ exit

BTW, because of the image tag issue, the installation took a lot of effort: I had to update local packages and update the image tag before the latest OCP 3.7 finally deployed successfully.

Comment 20 Joe Madden 2018-06-04 10:21:53 UTC
Hi All,

We have updated to OpenShift 3.7.46 and are now able to write to Azure File. Can someone confirm that this was backported to 3.7.46? The changelog does not list this Bugzilla.

Thanks.

Comment 21 Brenton Leanhardt 2018-06-04 12:07:51 UTC
I can confirm that https://github.com/openshift/ose/pull/1244 is merged into all versions after v3.7.45-1.