Description of problem:
In a containerized environment, the device-plugin socket "/var/lib/kubelet/device-plugins/kubelet.sock" is created inside a container, so it cannot be accessed by other containers. We need to mount a host path for it.

Version-Release number of selected component (if applicable):
openshift v3.9.3
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16

How reproducible:
Always

Steps to Reproduce:
1. Enable DevicePlugins on the node, then restart the node service.
   # cat /etc/origin/node/node-config.yaml
   ...
   kubeletArguments:
   ...
     feature-gates:
     - DevicePlugins=true
   # systemctl restart atomic-openshift-node
2. Make sure the DevicePlugins socket is created on the host.
   ls -l /var/lib/kubelet/device-plugins/kubelet.sock

Actual results:
2. The socket exists only inside the container.

Expected results:
2. The socket should be exposed on the host.

Additional info:
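For anyone who wants to script the check in step 2 rather than eyeball the `ls -l` output, here is a minimal sketch in Python. The socket path is the one from this report; the helper name `is_unix_socket` is made up for illustration and is not part of any OpenShift or Kubernetes tooling. It only tests that the path exists and is a Unix domain socket; it does not try to talk to the kubelet.

```python
import os
import stat

# Path taken from this bug report.
KUBELET_SOCKET = "/var/lib/kubelet/device-plugins/kubelet.sock"


def is_unix_socket(path):
    """Return True if `path` exists and is a Unix domain socket.

    Hypothetical helper for illustration only.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission: treat as "not exposed here".
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    if is_unix_socket(KUBELET_SOCKET):
        print("socket present:", KUBELET_SOCKET)
    else:
        print("socket missing:", KUBELET_SOCKET)
```

Run on the host, this should report the socket as missing while the bug is present and as present once the path is mounted from the host. Run inside the node container, it should report the socket as present in both cases.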
Device plugins are alpha in 3.9, so this shouldn't block the release, but it should be fixed in 3.10 and potentially 3.9.z. Vikas, can you look at this after we handle https://bugzilla.redhat.com/show_bug.cgi?id=1548358 ?
Origin 3.9 PR to openshift-ansible: https://github.com/openshift/openshift-ansible/pull/7900
FYI, there is no supported containerized install for 3.10. There is a system container install that will only be supported on Atomic Host. The change needed in that case is in this PR: https://github.com/openshift/origin/pull/19308#event-1577516652
Verified in a node system container environment: the node now exposes the socket on the host instance.

[root@ip-172-18-15-219 ~]# runc list
ID                      PID     STATUS    BUNDLE                                               CREATED                          OWNER
atomic-openshift-node   18523   running   /var/lib/containers/atomic/atomic-openshift-node.0   2018-04-28T08:03:01.053400639Z   root

[root@ip-172-18-15-219 ~]# oc version
oc v3.10.0-0.30.0
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-15-219.ec2.internal:443
openshift v3.10.0-0.30.0
kubernetes v1.10.0+b81c8f8

[root@ip-172-18-15-219 ~]# ls /var/lib/kubelet/device-plugins/kubelet.sock
/var/lib/kubelet/device-plugins/kubelet.sock
This also appears to be a problem for 'oc cluster up' scenarios.

# docker ps
686064f1374f registry.access.redhat.com/openshift3/ose:v3.9.25 "/usr/bin/openshift …" 44 seconds ago Up 43 seconds origin

# docker exec -it 686064f1374f /bin/bash
[root@virt origin]# ls -lart /var/lib/kubelet/device-plugins/
total 4
srwxr-xr-x. 1 root root  0 May 11 23:20 kubelet.sock
drwxr-xr-x. 3 root root 28 May 11 23:20 ..
drwxr-xr-x. 2 root root 61 May 11 23:21 .
-rw-r--r--. 1 root root 48 May 11 23:21 kubelet_internal_checkpoint
[root@virt origin]# exit

# oc version
oc v3.9.25
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://127.0.0.1:8443
openshift v3.9.25
kubernetes v1.9.1+a0ce1bc657

Will this be fixed or supported at all? The use case is running OpenShift locally and connecting to a GPU, for example.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816