1459106 – Deployments timing out if storage contains lots of small files

Bug 1459106 - Deployments timing out if storage contains lots of small files

Summary: Deployments timing out if storage contains lots of small files

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	RFE
Sub Component:
Version:	3.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Eric Paris
QA Contact:	Xiaoli Tian
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	1597867 1677362 (view as bug list)
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2017-06-06 10:22 UTC by Alexander Koksharov
Modified:	2024-12-20 18:43 UTC (History)
CC List:	26 users (show)
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-12 11:54:41 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1503906	unspecified	CLOSED	fsGroup (chown+chmod) causes time out if storage contains lots of files	2023-09-07 18:58:36 UTC
Red Hat Knowledge Base (Solution)	3200662	None	None	None	2019-06-12 12:53:54 UTC
Red Hat Knowledge Base (Solution)	4035941	None	None	None	2021-06-24 11:41:44 UTC
Red Hat Knowledge Base (Solution)	4900491	None	None	None	2021-06-24 11:41:44 UTC

Internal Links: 1503906

Description Alexander Koksharov 2017-06-06 10:22:52 UTC

Description of problem:
Persistent volume which contains lots of files (>6'000'000) which formatted with ext4 (-T option). OSE is not able to launch pods with this volume attached.

Issue appears even if application is not accessing this storage - it is just mounted.


Version-Release number of selected component (if applicable):
3.4

How reproducible:

Steps to Reproduce:
1. create a PV similar to:
# oc export pv pv-vol-05fad0544281b9a79 
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: null
  labels:
    failure-domain.beta.kubernetes.io/region: eu-west-1
    failure-domain.beta.kubernetes.io/zone: eu-west-1c
  name: pv-vol-05fad0544281b9a79
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://eu-west-1c/vol-05fad0544281b9a79
  capacity:
    storage: 400Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-vol-05fad0544281b9a79
    namespace: small-files
    resourceVersion: "23371"
    uid: 3be1c9c1-49f1-11e7-a3dc-06578a1a613c
  persistentVolumeReclaimPolicy: Retain
status: {}

2. create files on the storage:
# for i in {1..10 000 000}; do echo $i > ./$i.txt; done

3. scale up deployment with this volume attached. system will take several minutes to run an application. The following is reported in the events:
Events:
  FirstSeen	LastSeen	Count	From							SubobjectPath			Type		Reason		Message
  ---------	--------	-----	----							-------------			--------	------		-------
  13m		13m		1	{default-scheduler }									Normal		Scheduled	Successfully assigned test-app-3-eyhvf to ip-172-31-27-242.eu-west-1.compute.internal
  11m		6m		3	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}					Warning		FailedMount	Unable to mount volumes for pod "test-app-3-eyhvf_small-files(edc7aab8-49f9-11e7-a3dc-06578a1a613c)": timeout expired waiting for volumes to attach/mount for pod "test-app-3-eyhvf"/"small-files". list of unattached/unmounted volumes=[small-files]
  11m		6m		3	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}					Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "test-app-3-eyhvf"/"small-files". list of unattached/unmounted volumes=[small-files]
  6m		6m		1	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}	spec.containers{test-app}	Normal		Pulling		pulling image "172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4"
  6m		6m		1	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}	spec.containers{test-app}	Normal		Pulled		Successfully pulled image "172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4"
  6m		6m		1	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}	spec.containers{test-app}	Normal		Created		Created container with docker id 3b229a3c2095; Security:[seccomp=unconfined]
  4m		4m		1	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}	spec.containers{test-app}	Warning		Failed		Failed to start container with docker id 3b229a3c2095 with error: operation timeout: context deadline exceeded
  4m		4m		1	{kubelet ip-172-31-27-242.eu-west-1.compute.internal}					Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "test-app" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"


Actual results:


Expected results:

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Matthew Wong 2017-06-06 22:14:43 UTC

Most likely IMO it is the recursive chown + chmod fsGroup that ensures the pod can write to the volume causing this slowdown; can you please provide the output of 'oc get pod -o yaml' to confirm if it the pod has an fsGroup set?

Comment 2 Alexander Koksharov 2017-06-07 07:49:49 UTC

 === In Red Hat Customer Portal Case 01862362 ===
--- Comment by Koksharov, Alexander on 6/7/2017 9:49 AM ---

yes, fsGroup is set for a pod, but is is not set in dc.

# oc get dc test-app | grep -i fsGroup
# 


# oc get pod test-app-3-2qb48 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/created-by: |
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"small-files","name":"test-app-3","uid":"601c90f0-49f1-11e7-a3dc-06578a1a613c","apiVersion":"v1","resourceVersion":"66439"}}
    openshift.io/container.test-app.image.entrypoint: '["container-entrypoint","/bin/sh","-c","$STI_SCRIPTS_PATH/usage"]'
    openshift.io/deployment-config.latest-version: "3"
    openshift.io/deployment-config.name: test-app
    openshift.io/deployment.name: test-app-3
    openshift.io/generated-by: OpenShiftNewApp
    openshift.io/scc: restricted
  creationTimestamp: 2017-06-07T07:44:59Z
  generateName: test-app-3-
  labels:
    app: test-app
    deployment: test-app-3
    deploymentconfig: test-app
  name: test-app-3-2qb48
  namespace: small-files
  resourceVersion: "66506"
  selfLink: /api/v1/namespaces/small-files/pods/test-app-3-2qb48
  uid: 34f92c81-4b55-11e7-a116-06578a1a613c
spec:
  containers:
  - image: 172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4
    imagePullPolicy: Always
    name: test-app
    ports:
    - containerPort: 8080
      protocol: TCP
    resources: {}
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
        - SYS_CHROOT
      privileged: false
      runAsUser: 1000070000
      seLinuxOptions:
        level: s0:c8,c7
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /data
      name: small-files
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-ooelj
      readOnly: true
  dnsPolicy: ClusterFirst
  imagePullSecrets:
  - name: default-dockercfg-j0w2s
  nodeName: ip-172-31-27-242.eu-west-1.compute.internal
  restartPolicy: Always
  securityContext:
    fsGroup: 1000070000
    seLinuxOptions:
      level: s0:c8,c7
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - name: small-files
    persistentVolumeClaim:
      claimName: pvc-vol-05fad0544281b9a79
  - name: default-token-ooelj
    secret:
      defaultMode: 420
      secretName: default-token-ooelj
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2017-06-07T07:44:59Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2017-06-07T07:46:52Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2017-06-07T07:44:59Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://bcf9fcce7bee45654210c96fd34dbc71a6c67274a0457a64520472e1adbdfa2b
    image: 172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4
    imageID: docker-pullable://172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4
    lastState: {}
    name: test-app
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2017-06-07T07:46:51Z
  hostIP: 172.31.27.242
  phase: Running
  podIP: 10.131.0.20
  startTime: 2017-06-07T07:44:59Z

Comment 3 Matthew Wong 2017-06-07 21:36:32 UTC

OK, the easiest way to test if fsGroup is responsible for the delay is to prevent OpenShift from automatically setting the pod's fsGroup and observe. Here's one way to do it:

1. Create a serviceAccount fstest and specify it in the deployment at spec.template.spec.serviceAccount
2. Add an SCC to fstest that *doesn't* automatically set pods' fsGroups (one with fsgroup: RunAsAny) such as SCC nonroot
$ oadm policy add-scc-to-user nonroot system:serviceaccount:default:fstest
3. Recreate the deployment if needed

Then the pod should not have an fsGroup automatically set and we can observe if there is a long delay before ContainerStarted

 
In my brief testing, fsGroup is indeed responsible for a bulk of the time it takes to mount a volume. I only tested with about 1M inodes though, not 6M so I don't know if 6M could cause minutes-long delays; regardless, the difference is clear with and without fsGroup.

Avg time between "MountVolume.MountDevice succeeded" and "MountVolume.Setup succeeded"
without fsGroup: 0s
with fsGroup: 45s

Comment 4 Alexander Koksharov 2017-06-08 14:30:36 UTC

Hello Matthew,

There is some improvement if I set my deployment to use nonroot scc. Events about timeouts while attaching storage disappear. 

However, deployment still takes 3-4 minutes. Here is some info from my last test:

- with 6M files
# oc describe pod test-app-8-eyhix | grep "Security Policy:"
Security Policy:	nonroot

# oc get pods -w
NAME               READY     STATUS              RESTARTS   AGE
test-app-3-build   0/1       Completed           0          2d
test-app-8-eyhix   0/1       ContainerCreating   0          2m
NAME               READY     STATUS    RESTARTS   AGE
test-app-8-eyhix   1/1       Running   0          4m

# oc get event -o go-template="{{ range  .items }} {{.metadata.creationTimestamp}} : {{.message }} ; {{ end }}" | tr ';' '\n' | sort
...
  2017-06-08T11:39:33Z : Created pod: test-app-8-eyhix 
  2017-06-08T11:39:33Z : Successfully assigned test-app-8-eyhix to ip-172-31-27-242.eu-west-1.compute.internal 
  2017-06-08T11:39:52Z : Created container with docker id 17d6353592e3
  2017-06-08T11:39:52Z : pulling image "172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4" 
  2017-06-08T11:39:52Z : Successfully pulled image "172.30.139.143:5000/small-files/test-app@sha256:34a69513e8551d069d1cd63e56982c247b3b7f3d81f54e04a7b1c167d6b624e4" 
  2017-06-08T11:41:52Z : Error syncing pod, skipping: failed to "StartContainer" for "test-app" with RunContainerError: "runContainer: operation timeout: context deadline exceeded"
  2017-06-08T11:41:52Z : Failed to start container with docker id 17d6353592e3 with error: operation timeout: context deadline exceeded 

# oc rsh test-app-8-eyhix
sh-4.2$ cd /data/
sh-4.2$ ls -la | wc -l
6000004


- with 2,7M files
sh-4.2$ ls -la /data/ | wc -l
2742013

# oc get pods -w
NAME               READY     STATUS              RESTARTS   AGE
test-app-3-build   0/1       Completed           0          3d
test-app-8-ooay4   0/1       ContainerCreating   0          10s
NAME               READY     STATUS    RESTARTS   AGE
test-app-8-ooay4   1/1       Running   0          2m


- with no files only 45s
sh-4.2$ perl -e 'chdir("/data") or die; opendir(D, "."); while($f = readdir D) {unlink($f);}'
# oc get pods -w
NAME               READY     STATUS              RESTARTS   AGE
test-app-3-build   0/1       Completed           0          3d
test-app-8-6wisk   0/1       ContainerCreating   0          4s
NAME               READY     STATUS    RESTARTS   AGE
test-app-8-6wisk   1/1       Running   0          45s



looks like there is something else what does scanning of filesystem

Comment 5 Matthew Wong 2017-06-08 21:21:46 UTC

I'm certain now that the non-fsGroup slowdown is due to SELinux relabeling (recursive chcon on the entire contents of the disk).

From observation/profiling, openshift more or less idles during the time between pod creation and container start time. Meanwhile, Docker daemon does significant  reading from the EBS disk & uses a lot of CPU.

So I did a basic test with and without SELinux relabeling (outside of openshift) and the difference was clear.

$ docker run -v /ebs-mountpoint:/test:Z ... takes minutes to start where
$ docker run -v /ebs-mountpoint:/test ... takes a second.

Unfortunately, at the moment disabling openshift's (docker's) recursive chconing is not possible. All EBS disks will undergo it.


Should also note re: the fsGroup slowdown that removing fsGroup from pods that use the volume is somewhat of a workaround in that: without openshift to automatically chown+chmod the disk, it becomes the adminstrator's responsibility to ensure pods that use the volume have the right permissions to read/write it. e.g.: ensure every file on that disk is owned by group 1234 and ensure every pod that uses that disk has supplemental group 1234. Hope that's acceptable.

Comment 6 Hemant Kumar 2017-06-08 23:31:10 UTC

As I was saying in IRC - perhaps we can mitigate this little bit by using :z (small case) and hence avoid recursive relabelling.

Comment 7 Bradley Childs 2017-06-21 20:21:05 UTC

Docker recursively sets the SELINUX label on the volume and this will take time.. if its 1 file it probably wont take much time.. if its a million directories of a million files each, it will take a lot of time.  There is no way around this except to disable selinux or buy faster hardware..

In this case the POD does eventually start.  There's nothing we can do to optimize security, performance, and usefulness of the volume.. if we don't perform a recursive label on the volume, some files wont be accessible.  If we do, it may take time to perform the operation.

All that to say, this turns into an expectation problem..  The events indicates that things aren't happening as fast as usual, but the POD eventually comes up (after 14 mins).  The fix here-- if possible, is to give better Event messages and document that starting a POD may take some time for volumes with a lot of files.

If there is a feature request around this (allow customize or switch of selinux labels) we can take that into consideration as an RFE.

Lowering priority for now.

Comment 8 Matthew Wong 2017-06-22 14:29:09 UTC

Docker takes a long time to recursively apply the selinux label (:Z). The pod does appear to start correctly eventually despite "Failed to start container with docker id" but there are two possible consequences of this delay between CreateContainer and StartContainer

"Symlink container logs to the legacy container log location for cluster logging support." does not occur.

Any pod container post-start lifecycle hooks, if any are specified, are never run.

because SyncPod appears to only run once.

Comment 9 Bradley Childs 2017-06-22 18:05:18 UTC

Transferring this to the kubernetes component as the storage portion is working as intended but kubelet is behaving weird due to the longer startup time of the docker container..

Summary:

- Docker container takes a long time to perform operation & start container, but does not error out.  Operation is expected to be slow/take long time (using a volume with lots of files and applying selinux label recursively to the volume is expected to take some time)

- Kube warns of errors during this 'long wait'  Events are marked as "Warning" but the message has language of "Failure".. This isn't ideal, but not blocker problem.

- Docker container eventually starts and so does POD, but POD behavior isn't correct.  Post-start lifecycle hooks aren't run and container logs aren't linked (as matthew mentions in comment #2)

Also upping priority for node triage.

Comment 10 Seth Jennings 2017-06-23 03:09:57 UTC

From a cursory look at the situation, this will not be easy to address.

We have the basic issue of how long do we wait for a container to start before we consider it a failure.  There is no way to detect that something in the runtime is not hung, but actually making forward progress.  With a sufficiently slow disk, it could take hours.

Post-start hooks and creation of the log symlink are synchronous with StartContainer. So if that errors out, they will not be done.

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_container.go#L121-L154

When the container is started later, IIRC it just gets pulled in like a pod would on node restart as a container that we don't have state about but should be running so we just reconcile the state and keep the pod running.  However, we assume the post-start hooks and log symlinks have already been done by the node process in a previous life.

tl;dr I don't think this is something we can address for 3.6.  It will take some significant reworking.

Comment 11 Seth Jennings 2017-09-06 21:01:01 UTC

The only thing that can really be done here is manually adjust the runtime-request-timeout node option.

This controls the timeout on the StartContainer call and every other runtime related call.

Comment 12 Seth Jennings 2017-09-06 21:25:59 UTC

This option can be set in the /etc/origin/node/node-config.yaml file.

For example, to set the timeout to 10 minutes:

kubeletArguments:
  ...
  runtime-request-timeout:
  - "10m"

Comment 13 Alexander Koksharov 2017-10-02 07:21:47 UTC

Is it possible to create a configuration switch which controls automatic selinux relabeling of attached storages?
 
I think it can be a solution to the subject issue if we can add a possibility to switch off automatic relabeling and delegate this responsibility to the administrator of an application.

Comment 14 Seth Jennings 2017-10-02 13:33:25 UTC

afaik, there is no such switch today.  I imagine it can be created but would involve plumbing through the stack.

If you would like to suggest the introduction of such a feature, please open an RFE bug requesting it.

Comment 15 Alexander Koksharov 2017-10-04 08:17:55 UTC

I am reopening this case as an RFE instead of creating a new one because it already has all the context about the issue.

To summarize:
- pod is using block storage which contains a lot for files (several millions)
- every time container is created system runs selinux relabelling task, which takes tens of minutes.
- such a delay from docker, causing timeout on openshift side.
- there is no way to disable relabelling at the moment.

The RFE is to create such a possibility to allow administrator to control selinux settings and disabel automatic actions.

Customer also did some research at the area and found the interesting article, which potentially may help to resolve this issue without 'plumbing through the stack' [1].

[1] https://danwalsh.livejournal.com/38157.html

Comment 16 hgomes 2018-03-21 21:14:00 UTC

(In reply to Seth Jennings from comment #11)
> The only thing that can really be done here is manually adjust the
> runtime-request-timeout node option.
> 
> This controls the timeout on the StartContainer call and every other runtime
> related call.


Is this achievable with OCP 3.5?
As the below one, is just for 3.6 and later.


(In reply to Seth Jennings from comment #12)
> This option can be set in the /etc/origin/node/node-config.yaml file.
> 
> For example, to set the timeout to 10 minutes:
> 
> kubeletArguments:
>   ...
>   runtime-request-timeout:
>   - "10m"

Comment 17 Mrunal Patel 2018-07-03 19:06:59 UTC

*** Bug 1597867 has been marked as a duplicate of this bug. ***

Comment 18 Mike McLane 2018-07-27 16:24:16 UTC

Where does this issue stand? In 3.9, when I have a GlusterFS PVC of 8k files it might take <1min to attach.. but if I have a GlusterFS pvc of >=30k files, it may take several minutes to attach.. this can cause timeouts for an RC to deploy a pod if the container is taking too long.

Comment 24 Mario Loriedo 2018-08-29 09:44:19 UTC

I think it's important to mention that this issue happens even when a volume empty subpath is mounted [1] and thousands of files exist in different subpaths of the same PV (but are not accessible from within the container).

[1] https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath

Comment 25 Mike McLane 2018-08-29 23:58:57 UTC

To mloriedo's point, shouldn't the flexvol [1a] code (which gluster-subvol uses) prevent repeat chown operations [1b] when it doesn't need to execute? it looks like there is a gid check performed (release-1.9 & release-1.10) before the gid change + pvc attachment, but only for local file system types?

[mmclane@mmclane volume]$ git diff origin/master volume_linux.go
diff --git a/pkg/volume/volume_linux.go b/pkg/volume/volume_linux.go
index ef1f452..d67ee4a 100644
--- a/pkg/volume/volume_linux.go
+++ b/pkg/volume/volume_linux.go
@@ -89,3 +89,17 @@ func SetVolumeOwnership(mounter Mounter, fsGroup *int64) error {
                return nil
        })
 }
+
+// IsSameFSGroup is called only for requests to mount an already mounted
+// volume. It checks if fsGroup of new mount request is the same or not.
+// It returns false if it not the same. It also returns current Gid of a path
+// provided for dir variable.
+func IsSameFSGroup(dir string, fsGroup int64) (bool, int, error) {
+       info, err := os.Stat(dir)
+       if err != nil {
+               glog.Errorf("Error getting stats for %s (%v)", dir, err)
+               return false, 0, err
+       }
+       s := info.Sys().(*syscall.Stat_t)
+       return int(s.Gid) == int(fsGroup), int(s.Gid), nil
+}

[mmclane@mmclane volume]$ find . -type f -exec egrep -Hi issamefsgroup {} \;
./volume_unsupported.go:func IsSameFSGroup(dir string, fsGroup int64) (bool, int, error) {
./volume_linux.go:// IsSameFSGroup is called only for requests to mount an already mounted
./volume_linux.go:func IsSameFSGroup(dir string, fsGroup int64) (bool, int, error) {
./local/local.go:			fsGroupSame, fsGroupOld, err := volume.IsSameFSGroup(m.globalPath, fsGroupNew)
[mmclane@mmclane volume]$ git status
# On branch release-1.9
nothing to commit, working directory clean


Looks like fsGroup bits were initially added here [2]. Of note, the chown is skipped if fsGroup isn't set -- which is probably why this isn't as visible/problematic for Jenkins.

[1a] https://github.com/kubernetes/kubernetes/blob/release-1.10/pkg/volume/flexvolume/mounter.go#L96 
[1b] https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/volume/volume_linux.go#L38
[2] https://github.com/kubernetes/kubernetes/pull/49988/

Comment 26 Oleksandr Garagatyi 2018-08-31 07:16:01 UTC

Am I correct that changing of flexvolume code will affect only chown operation, but not selinux relabeling? Do we know what part of the time is taken by the relabeling and what by chown operation?

As far as I understand PR https://github.com/kubernetes/kubernetes/pull/49988/ that flag was added to prevent repeated remappings when a volume is already mounted on a node. It wasn't supposed to cover the situation when we need to change GID of files and when not. And, actually, I think that it is not that straightforward operation because we would have to respect chown execution failures and interruptions. 
Here is what I have in mind:
1. On volume mount get the last GID for the volume from some attribute (label or so) for this particular volume
2. If last GID is the same as the current one do not chown.
3. Otherwise (last GID is unknown or not the same):
3.1. clear last GID attribute
3.2. chown files
3.3. set new last GID attribute
In this case interruption of chown, need to change GID, no need to change GID should be respected. 

I'm not sure whether we can do that inside of a script. We might need to implement this in the code because interaction with k8s entities would be needed.
I also don't know where we can store this last GID info about the volume. Can we do that in some volume metadata? This line in git volume looks similar to that https://github.com/kubernetes/kubernetes/blob/7f23a743e8c23ac6489340bbb34fa6f1d392db9d/pkg/volume/git_repo/git_repo.go#L242

Comment 28 Florent Benoit 2018-09-14 11:37:58 UTC

@Oleksandr 

> Am I correct that changing of flexvolume code will affect only chown operation,
> but not selinux relabeling? Do we know what part of the time is taken by the 
> relabeling and what by chown operation?


FYI SELinux relabeling is not performed as driver is explicitly disabling it.

Comment 31 Jan Safranek 2019-02-19 09:24:31 UTC

*** Bug 1677362 has been marked as a duplicate of this bug. ***

Comment 32 Kirsten Newcomer 2019-06-12 11:54:41 UTC

With the introduction of OpenShift 4, Red Hat has delivered or roadmapped a substantial number of features based on feedback by our customers.  Many of the enhancements encompass specific RFEs which have been requested, or deliver a comparable solution to a customer problem, rendering an RFE redundant.

This bz (RFE) has been identified as a feature request not yet planned or scheduled for an OpenShift release and is being closed. 

If this feature is still an active request that needs to be tracked, Red Hat Support can assist in filing a request in the new JIRA RFE system, as well as provide you with updates as the RFE progress within our planning processes. Please open a new support case: https://access.redhat.com/support/cases/#/case/new 

Opening a New Support Case: https://access.redhat.com/support/cases/#/case/new 

As the new Jira RFE system is not yet public, Red Hat Support can help answer your questions about your RFEs via the same support case system.

Comment 37 Red Hat Bugzilla 2023-09-15 00:02:28 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

Note You need to log in before you can comment on or make changes to this bug.

annelson
aos-bugs
bchilds
chaoyang
clichybi
eparis
fbenoit
fche
fgiloux
hekumar
hgomes
jchevret
jokerman
jstrunk
kbsingh
mbenitez
mmccomas
mmclane
oarribas
ocasalsa
pdwyer
pweil
rkant
rpalathi
sankarshan
sgarciam