Bug 1826420 - When a default Azure File storage class is created, there is a bug that is causing the UID/GID to be flipped when the volume gets mounted to the pod.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: unspecified
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Hemant Kumar
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-21 16:05 UTC by mhockelb
Modified: 2024-01-06 04:28 UTC
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-22 03:57:31 UTC
Target Upstream Version:
Embargoed:



Description mhockelb 2020-04-21 16:05:20 UTC
Related JIRA: https://issues.redhat.com/browse/THREESCALE-4996

------------------------------------

We identified the root cause of the Azure File (RWX) storage issue that we have been seeing with 3scale - it is a Kubernetes Azure File storage bug.

We also identified a short-term workaround that may be used until the underlying bug is fixed. We tested the workaround with a templated deployment of 3scale 2.8 on OCP 3.11 running on Azure Commercial. We believe, based on the nature of this storage bug, that it may also be present in deployments on OCP 4.x.

Details are below.

------------------------------------

If you create a default Azure File storage class, a bug causes the UID/GID to be flipped when the volume is mounted to the pod.

For example, if you create the following storage class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata: 
  name: azure-file-default
mountOptions: 
- mfsymlinks
- cache=strict
parameters: 
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: Immediate

Create a PVC using the storage class and map the claim to a Pod. Once that pod starts, you will see that the directory you chose to mount has the UID and GID swapped.
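
For reference, a minimal PVC and Pod of the kind described might look like the following; the claim name, pod name, image, and mount path are illustrative, not taken from the actual deployment:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-file-claim
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: azure-file-default
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: azure-file-test
spec:
  containers:
  - name: test
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      # mounted at the root so it shows up in `ls -lah /` as in the output below
      mountPath: /azure-file-default
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: azure-file-claim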

sh-4.2$ ls -lah /
drwxr-xr-x.   2 root       1000620000    0 Apr 16 15:00 azure-file-default

You can see in the above output that UID=root and GID=1000620000. This is backwards; it should be UID=1000620000 and GID=root.

This is a bug that needs to be fixed.

To work around the bug shown above, we were able to get the following to work: instead of relying on the default behavior, we force the UID and GID in the storage class. The major downside to this workaround is that UIDs are specific to a given project, meaning that the storage class we create can ONLY be used for a single project.

First, let's find out the UID being used for our project:

$ oc get namespace 3scale -o yaml
apiVersion: v1
kind: Namespace
metadata: 
  annotations: 
    openshift.io/description: ""
    openshift.io/display-name: ""
    openshift.io/requester: clusteradmin
    openshift.io/sa.scc.mcs: s0:c24,c4
    openshift.io/sa.scc.supplemental-groups: 1000560000/10000
    openshift.io/sa.scc.uid-range: 1000560000/10000
  creationTimestamp: 2020-04-15T18:03:09Z
  name: 3scale
  resourceVersion: "1322750"
  selfLink: /api/v1/namespaces/3scale-test3
  uid: 5d209470-7f43-11ea-ae16-000d3a75996f
spec: 
  finalizers: 
  - kubernetes
status: 
  phase: Active

We can see in this case that our UID is 1000560000.
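
(If you only need that value, a jsonpath query along these lines should also work; the only subtlety is escaping the dots in the annotation key:)

$ oc get namespace 3scale -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}'
1000560000/10000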

Next, we can create a new storage class and force it to use the correct UID and GID to work around the bug.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata: 
  name: azure-file-3scale
mountOptions: 
- uid=1000560000
- gid=0
- mfsymlinks
- cache=strict
parameters: 
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: Immediate

Then we can see that the UID and GID mapping is correct, and we are able to read, write, chmod, symlink, etc. as we expect to be able to.

sh-4.2$ ls -lah /
drwxr-xr-x.   2 1000620000 root          0 Apr 16 15:13 azure-file-3scale

To reiterate, the bug described up top needs to be fixed ASAP, as it blocks the use of Azure File in OpenShift. While we are able to work around it in an emergency scenario for now, the limitation that we would need a separate storage class for each project that requires Azure File is not sustainable in the long term.

Comment 1 Hemant Kumar 2020-04-21 21:17:40 UTC
What is the version of OpenShift? Can you post the pod's YAML from when this happens?

Part of this is reasonable, because the UID and GID of an Azure File volume can only be set at mount time and hence must be specified either in the StorageClass or in the PV.
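
(For reference, the same mount-time options can be carried on a statically created PV; the share and secret names in this sketch are placeholders, not values from this environment:)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: azure-file-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
  - uid=1000560000
  - gid=0
  - mfsymlinks
  - cache=strict
  azureFile:
    # secret holding the storage account name and key
    secretName: azure-file-secret
    shareName: example-share
    readOnly: false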

Comment 3 mhockelb 2020-04-21 22:01:35 UTC
OCP 3.11.  I have a login to this OCP environment that you can use to access/debug.  Contact me directly for the login info.  This is a test environment provided to us by MSFT, but it will not be available to us for much longer.

Comment 7 mhockelb 2020-04-22 12:29:06 UTC
No, we do not agree that this is "not a bug".  The UID is an auto-generated value for each OCP namespace and when it is not specified in the PVC explicitly, the plug-in is mounting the group ID as the owner and the owner ID as the group.  That is a bug.

Comment 8 mhockelb 2020-04-22 12:32:01 UTC
(In reply to mhockelb from comment #7)
> No, we do not agree that this is "not a bug".  The UID is an auto-generated
> value for each OCP namespace and when it is not specified in the Storage Class
> explicitly, the plug-in is mounting the group ID as the owner and the owner
> ID as the group.  That is a bug.

Comment 9 Hemant Kumar 2020-04-22 14:45:05 UTC
> No, we do not agree that this is "not a bug".  The UID is an auto-generated value for each OCP namespace and when it is not specified in the PVC explicitly, the plug-in is mounting the group ID as the owner and the owner ID as the group.  That is a bug.

But UIDs are not specified in the PVC at all.

I think there is some confusion going on here. I still think this is not a bug, but what *might* have happened is that there was a StorageClass which had a gid specified in it, and hence the volume was mounted with the gid from that StorageClass. That is why we need the pod and PV YAML from when this happened. I just don't see how a gid can be applied to the volume without the PV or StorageClass having it.

Comment 10 Hemant Kumar 2020-04-22 15:13:00 UTC
The other thing that could have happened is that the Pod that was created had an fsGroup in it, and that is why the volume was mounted with the gid set to the fsGroup at mount time. But that is still not the UID and GID being swapped; it is working as expected. If you want a UID to be set for Azure volumes, the only place to specify it is in the PV spec or the StorageClass. This is working as intended.

If you don't want to statically create PVs that map to specific UIDs, then use fsGroup or a supplemental group with 0770 permissions specified in the StorageClass, as Ryan suggested.

Comment 11 Jared Hocutt 2020-04-22 15:35:43 UTC
Please do not close a BZ as NOTABUG because you do not understand the problem we are describing. It's completely fair to point out if we did not provide adequate information or were not clear, but please give us the time to provide that information to you.

As described in the original description for this bug, if you create a StorageClass for AzureFile as follows:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata: 
  name: azure-file-default
mountOptions: 
- mfsymlinks
- cache=strict
parameters: 
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: Immediate

You will note that there is NOT a uid or gid defined in this StorageClass. That is *intentional*, as we would expect the StorageClass to use the uid and gid from the project where the PV is being used, like other StorageClasses do in OpenShift.

However, when we use that storage class on a Pod, we *expect* the mount to have the uid defined as 1000620000 (or whatever the specific one is for that project) and the gid to be 0.

sh-4.2$ ls -lah /
drwxr-xr-x.   2 root       1000620000    0 Apr 16 15:00 azure-file-default

Instead, as you can see in the output from inside the Pod above, the values are swapped. The uid is set to 0 and the gid is set to 1000620000.

THIS IS THE BUG WE ARE REFERRING TO. THIS IS NOT CORRECT BEHAVIOR.

As an example to reproduce this, deploy the default HTTPD example from the OpenShift service catalog, where no special uid or gid are being specified (i.e. it is not running as root, but as the standard random uid provided by the namespace). Then create a PV using the StorageClass defined above and mount it to the directory of your choice in the HTTPD deployment.
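
(A rough sketch of those reproduction steps with the oc CLI; the claim size, mount path, and resource names here are illustrative:)

$ oc new-app --template=httpd-example
$ oc set volume dc/httpd-example --add --name=azure-file \
    --type=persistentVolumeClaim --claim-name=azure-file-claim \
    --claim-size=1Gi --claim-class=azure-file-default \
    --mount-path=/opt/app-root/src/azure-file
$ oc rsh dc/httpd-example ls -lan /opt/app-root/src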

The remaining detail in the initial post of this BZ is to show how we were able to work around the problem *for now* so that customers who are blocked can move forward, but the HUGE caveat to the workaround is that you would need an individual StorageClass per project, which is not a long-term solution as it only works around the current bug.

If it's helpful for me to jump on a screenshare and show you the behavior we're seeing, I'm happy to do that.

Comment 12 Hemant Kumar 2020-04-22 15:58:53 UTC
First off, I am sorry for closing this bug a bit too soon, and I agree we should have given you more time. My apologies.

> THIS IS THE BUG WE ARE REFERRING TO. THIS IS NOT CORRECT BEHAVIOR.

OpenShift/Kubernetes does not apply UID-level file permissions to volumes. The fact that Azure File "kind of" allows this behaviour by specifying it in the PV/StorageClass is a driver detail that has leaked into Kubernetes/OpenShift. For example, what happens when you have 2 containers in a pod running with different UIDs? One container can read/write the volume and the other can't. UID-based volume permissions are strongly discouraged in Kubernetes, and that is by design.

> Please do not close a BZ as NOTABUG because you do not understand the problem we are describing.

I think I do understand the bug, but I am afraid this is not something we can fix easily (or even want to fix, because of the potential to break other users).


If you want to use azure-file volumes in containers not running as root, the OpenShift storage team's recommendation is to use group permissions.

Comment 13 Jared Hocutt 2020-04-22 16:17:42 UTC
> Openshift/Kubernetes does not apply UID level file permission to volumes.
> The fact that, Azure-File "kindof" allows this behaviour by specifying in
> PV/StorageClass is kind of driver detail that has leaked into
> Kubernetes/Openshift. For example - what happens when you have 2 containers
> in a pod running with different UID? One container can read/write the volume
> and other can't. UID based volume permissions are strongly discouraged in
> Kubernetes and it is by design.

I agree, this is a behavior that is expected to be handled by the driver for any shared storage solution, since shared storage works on UID/GID permissions (e.g. the same is true for NFS). By default, the containers running in the same namespace would have the same UID/GID, so this would work as expected. If a user were modifying that behavior for their containers, then they would be expected to solve that problem themselves.

The fact here is that the Azure File storage driver *does* attempt to handle this problem but is doing so incorrectly. So the fact that the logic is there shows that this is behavior they want to work, but it is working incorrectly at the moment.

At Red Hat, the large majority of our product teams try their best to follow best practices when it comes to building containers. One of those best practices is not running containers as root. Therefore, any Red Hat product that follows this best practice and requires shared storage is broken when running on Azure and using Azure File for shared storage.

IMO, it's not worth debating whether the Azure File driver should or should not be handling this behavior. The fact is that it *does* try to handle that behavior. So that decision was already made. However, that functionality has a bug in it and is breaking deployments of Red Hat products.

> If you want to use azure-file volumes in containers not running as root,
> openshift-storage team's recommendation is to use group permissions.

That's partially what we're trying to do here, but the Azure File driver is not setting the GID to 0 as expected (and as other storage drivers do).

But this is also not a completely valid solution, since the mount options being provided to CIFS are also *forcing* the UID/GID on the created files/directories. Therefore, once a file or directory gets created, it cannot be chown'd or chmod'd, since it's not owned by the user who created it.

Comment 14 Hemant Kumar 2020-04-22 18:02:16 UTC
Here is a solution that I have tested and verified to work.

1. Create an SC that allows group RW permissions:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata: 
  name: redhat-sc-debug
mountOptions:
- dir_mode=0770
- file_mode=0770
- mfsymlinks
- cache=strict
parameters:
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: Immediate


2. Now any user who is authenticated but not an admin will land in the restricted SCC. Pods created under the restricted SCC will automatically be assigned an fsGroup. A pod that has an fsGroup in its securityContext and uses an Azure File volume will automatically have the volume mounted with the fsGroup gid, and hence the volume will be readable/writable by the pod even if it is not running as root.
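
(For illustration, after admission the pod ends up with a pod-level securityContext roughly like the following, with the value drawn from the project's supplemental-groups annotation shown earlier; the exact number varies per namespace:)

spec:
  securityContext:
    # assigned automatically by the restricted SCC from the namespace range
    fsGroup: 1000560000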

^ Does this work for you? I have tested this and confirmed it works.

> That's partially what we're trying to do here, but the Azure File driver is not setting the GID to 0 as expected (and as other storage drivers do).

This actually is working as intended. For *most* storage drivers, if the pod has an fsGroup present then the volume's GID is set to the fsGroup and is not 0. I don't know why you would expect the GID to be 0 if your pod had an fsGroup. My point is that the GID isn't magically being set to 1000620000. Since I don't have the pod YAML that you used, I can only assume that your pod's spec had an fsGroup (which is highly likely, because in OpenShift all pods in the restricted SCC are assigned an fsGroup).

Comment 16 Hemant Kumar 2020-04-22 20:12:31 UTC
We discussed this in a call, and it looks like what the user/customer wants is not merely the ability to read/write files but the ability to perform file/directory permission changes, which may require UID-based permissions too.

With an Azure File pod with the following YAML:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-pod
  labels:
    app: busybox
spec:
  securityContext:
    runAsUser: 50000
    fsGroup: 50000
  containers:
    - name: busybox
      image: gcr.io/google_containers/busybox
      command:
        - "/bin/sh"
        - "-c"
        - "while true; do date; date >>/mnt/test/date; sleep 1; done"
      volumeMounts:
        - name: vol
          mountPath: /mnt/test
  volumes:
    - name: vol
      persistentVolumeClaim:
        claimName: "myclaim"


The volume gets mounted as:

drwxrwx---    2 root     50000          0 Apr 22 17:51 test

and if the pod writes any file, it is not written with the UID of the pod user but rather as root. The GID of the volume is still the same as the fsGroup of the pod, though, so the user can read/write files. However, there are certain operations they can't perform, because the file is owned via the group and not the user.

-rwxrwx---    1 root     50000     218.4K Apr 22 20:05 date

This is generally how azure-file behaves. 

If the UID mount option is not specified in the StorageClass, then it seems to default to uid=0, but this is not something OpenShift/Kubernetes sets (because of the noforceuid option, I suspect it is coming from the server). Here are the mount options used by the above volume:

//<blah> on /mnt/test type cifs (rw,relatime,vers=3.0,cache=strict,username=foo,domain=X,uid=0,noforceuid,gid=50000,forcegid,addr=20.150.39.8,file_mode=0770,dir_mode=0770,soft,persistenthandles,nounix,serverino,mapposix,mfsymlinks,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)


and I suspect that is the reason a file written by user 50000 (and group 50000) does not get a UID of 50000.
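
(For anyone checking their own environment, the effective CIFS options can be read from inside the pod itself; the pod name here is just the one from the example above:)

$ oc rsh busybox-pod
/ # grep cifs /proc/mounts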

Comment 17 Hemant Kumar 2020-04-22 21:11:20 UTC
So, it looks like Azure File does not support CIFS Unix extensions - https://github.com/MicrosoftDocs/azure-docs/issues/17765 . In the absence of that, the uid will keep defaulting to 0 (if not specified at mount time) and there is no way a user-created file will get the user's id as its UID.


Unix extensions support is needed for setuid support for files/directories created by a local process. More information: https://linux.die.net/man/8/mount.cifs (check the setuid mount option). I do not think OpenShift can do anything to fix this issue.
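
(In practice, this is the wall users hit: without Unix extensions, ownership changes on the share simply fail. The path below is illustrative, but the failure mode is typically along these lines:)

sh-4.2$ chown 1000620000 /azure-file-default/somefile
chown: changing ownership of '/azure-file-default/somefile': Operation not permitted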

Comment 18 Jared Hocutt 2020-04-22 21:15:38 UTC
Thank you for digging into that more, Hemant. I did some digging myself and came to the same conclusion regarding the need for Unix extensions.

Given that information, I think users will have to work around this by either a) running the pod as root, or b) creating a storage class with the UID and GID set to match the UID for the namespace where the application that needs Azure File is located, and accepting that the storage class will only be valid for that given project.

Thank you for your time in getting to the bottom of that.

Comment 19 mhockelb 2020-04-23 00:47:34 UTC
Are we done using the OCP 3.11 test environment we've been using to troubleshoot this issue?

Comment 22 Red Hat Bugzilla 2024-01-06 04:28:58 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

