Description of problem:
Multiple pods (routers) get scheduled to a node.

Version-Release number of selected component (if applicable):
3.3.0.35

How reproducible:
Unable to reproduce; happens periodically.

Actual results:
The secret gets mounted with no rootcontext.

Expected results:
rootcontext gets set to what is set for the OpenShift volume dir: system_u:object_r:svirt_sandbox_file_t:s0

Additional info:
We have seen this issue on a node when scheduling the routers. 2-4 routers (all set up with different ports) will be scheduled to a node at once, and sometimes the following happens. The output below is from a single node running two routers at the same time: one has its secret mounted without the context, the other with it.

$ ls -ladZ /var/lib/openshift/openshift.local.volumes/
drwxr-x---. root root system_u:object_r:svirt_sandbox_file_t:s0 /var/lib/openshift/openshift.local.volumes/

First router (secret mounted without the context):

sh-4.2# ls -lZ /etc/pki/tls/private
lrwxrwxrwx. root root system_u:object_r:tmpfs_t:s0 tls.crt -> ..data/tls.crt
lrwxrwxrwx. root root system_u:object_r:tmpfs_t:s0 tls.key -> ..data/tls.key

tmpfs on /var/lib/openshift/openshift.local.volumes/pods/324d6157-b2f3-11e6-8547-fa163efa8a03/volumes/kubernetes.io~secret/server-certificate type tmpfs (rw,relatime,seclabel)

Second router (secret mounted with the context):

sh-4.2# ls -lZ /etc/pki/tls/private
lrwxrwxrwx. root root system_u:object_r:svirt_sandbox_file_t:s0 tls.crt -> ..data/tls.crt
lrwxrwxrwx. root root system_u:object_r:svirt_sandbox_file_t:s0 tls.key -> ..data/tls.key

tmpfs on /var/lib/openshift/openshift.local.volumes/pods/32e81e2f-b2f3-11e6-b515-fa163e19579b/volumes/kubernetes.io~secret/server-certificate type tmpfs (rw,relatime,rootcontext=system_u:object_r:svirt_sandbox_file_t:s0,seclabel)

We set the context here:
https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/pkg/volume/empty_dir/empty_dir.go#L207
Paul and I were chatting today. He said offhand, "This is impossible unless SelinuxEnabled() is lying to us." So I went to see whether SelinuxEnabled() could be lying to us. Lo and behold, the function did this:

```
1. if valueInitialized
2.     return value
3. valueInitialized = true
4. value = determineValue()
5. return value
```

So another caller racing while this call was between lines 3 and 4 could see valueInitialized already true and return the uninitialized (wrong) value!

https://github.com/opencontainers/runc/pull/1216

This could also explain why it happens only on startup: once the first caller finishes line 4, things will work correctly. There are still a lot of steps to get this fixed code propagated everywhere it needs to go for a test build, but it could explain the "impossible" problem.
We need patches for origin/master and kube. @decarr, can you track that?
ASSIGNED until it's merged into ose/enterprise-3.3.
I think we need patches for the following:
- ose/enterprise-3.3 (https://github.com/openshift/ose/pull/501)
- origin/release-1.4
- origin/master
- kubernetes/release-1.5
- kubernetes/master
@chezhang, please help verify this bug on 3.3.1.
Verified on openshift v3.3.1.7.

Steps:
1. Create an rc:
oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/rc-with-emptdir.yaml

2. Scale the rc to replicas=5 and wait until all pods are running:
[root@ip-172-18-5-182 ~]# oc scale rc/hello-pod --replicas=5
replicationcontroller "hello-pod" scaled
[root@ip-172-18-5-182 ~]# oc get pod
NAME              READY     STATUS    RESTARTS   AGE
hello-pod-77io7   1/1       Running   0          13m
hello-pod-gt6r6   1/1       Running   0          13m
hello-pod-gvllb   1/1       Running   0          13m
hello-pod-vfvm7   1/1       Running   0          14m
hello-pod-xu7yq   1/1       Running   0          13m

3. On the node, check that all mounted secrets have the correct context:
[root@ip-172-18-5-182 ~]# mount | grep pods
tmpfs on /var/lib/origin/openshift.local.volumes/pods/e55f13ee-c809-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/builder-token-81dcw type tmpfs (rw,relatime,rootcontext=system_u:object_r:svirt_sandbox_file_t:s0,seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/e55f13ee-c809-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/builder-dockercfg-jo3kj-push type tmpfs (rw,relatime,rootcontext=system_u:object_r:svirt_sandbox_file_t:s0,seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/06dc0b5d-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-sn9do type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c7,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/0b10e794-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-sn9do type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c7,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/0b0f366b-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-sn9do type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c7,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/0b0ee35a-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-sn9do type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c7,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/0b13e92c-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-sn9do type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c7,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/41b2f1c0-c80a-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-5ze2c type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c2,c8",seclabel)
tmpfs on /var/lib/origin/openshift.local.volumes/pods/067587df-c80b-11e6-a7e2-0e942c8fa67e/volumes/kubernetes.io~secret/default-token-5ze2c type tmpfs (rw,relatime,rootcontext="system_u:object_r:svirt_sandbox_file_t:s0:c2,c8",seclabel)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0199