Description of problem:
We are running our applications on OpenShift 4.6.32, and one of our containers is having issues running its liveness and readiness probes.
The probes fail when running the following command:
netstat -ap | grep rsyslog | grep '/log/log'
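For reference, the same check can be run by hand against an affected pod (a minimal sketch; <namespace> and <pod-name> are placeholders, and the probe command is copied verbatim from above):

# run the readiness check manually inside the affected container
oc exec -n <namespace> <pod-name> -- sh -c "netstat -ap | grep rsyslog | grep '/log/log'"

On the two affected nodes this returns the same "Permission denied" errors shown in the pod events below.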
These probes work on all of our nodes except two (master01 and worker06). We have investigated, and our findings are below:
In the pod events we can see the following log:
Warning Unhealthy 2m1s (x32 over 11m) kubelet Readiness probe failed: netstat: /proc/net/tcp: Permission denied
netstat: /proc/net/tcp6: Permission denied
netstat: /proc/net/udp: Permission denied
netstat: /proc/net/udp6: Permission denied
netstat: /proc/net/raw: Permission denied
netstat: /proc/net/raw6: Permission denied
netstat: /proc/net/unix: Permission denied
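The failing probes and the node involved can be located with standard client commands (a sketch; <namespace> and <pod-name> are placeholders):

# list probe failures recorded as events in the namespace
oc get events -n <namespace> --field-selector reason=Unhealthy

# show which node the affected pod is scheduled on
oc get pod -n <namespace> <pod-name> -o wide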
If we connect to the node running this container, we can see that SELinux is blocking the probe (the log below was found in /var/log/audit/audit.log):
type=AVC msg=audit(1629200144.726:765654): avc: denied { getattr } for pid=3581173 comm="netstat" path="/proc/133/net/tcp" dev="proc" ino=4026545509 scontext=system_u:system_r:container_t:s0:c558,c946 tcontext=system_u:object_r:devtty_t:s0 tclass=file permissive=1
As you can see, the file /proc/133/net/tcp has an invalid context, "devtty_t", where "proc_net_t" is expected.
After rebooting the node, the file got its valid context back and the probes started working on that node.
We still have one node (master01) hitting this issue, and we would like to know what caused these files to be labeled with an invalid SELinux context.
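On the node that still shows the problem (master01), the denial and the unexpected label can be confirmed directly. The sketch below assumes a root shell on the node (for example via "oc debug node/master01" followed by "chroot /host"); <pid> stands for the process ID named in the AVC record:

# search the audit log for AVC denials triggered by netstat
ausearch -m AVC -c netstat

# show the SELinux label on the proc file named in the denial;
# a healthy node shows proc_net_t here, the affected node shows devtty_t
ls -Z /proc/<pid>/net/tcp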
Version-Release number of selected component (if applicable):
4.6.32
How reproducible:
Only reproducible on one node; we don't know the root cause, so we don't have a reproducer at the moment.
Steps to Reproduce:
1. Schedule a pod with the probes from the description on master01 (a minimal pod sketch is included after this list)
2. The probes will fail and the pod will enter CrashLoopBackOff
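A minimal sketch of step 1 is below. It assumes <namespace> and <application-image> as placeholders for the real namespace and the application image whose probe is failing, copies the probe command from the description, and uses nodeName to pin the pod to master01:

oc apply -n <namespace> -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-repro
spec:
  nodeName: master01
  containers:
  - name: repro
    image: <application-image>
    readinessProbe:
      exec:
        command:
        - sh
        - -c
        - "netstat -ap | grep rsyslog | grep '/log/log'"
      periodSeconds: 10
EOF

Once scheduled on master01, the pod's events should show the same "Permission denied" messages as in the description.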
Actual results:
The /proc/<pid>/net files are labeled with an invalid SELinux context, causing the probes to fail.
Expected results:
The /proc/<pid>/net files are labeled with the proper SELinux context (proc_net_t), allowing the probes to work.
Additional info:
Comment 2 Miciah Dashiel Butler Masters
2021-08-24 16:12:28 UTC
(In reply to Mario Vázquez from comment #0)
> We are running our applications on OpenShift 4.6.32 and one of our
> containers is having issues when running their liveness and readiness probes.
This looks like a general issue with the kubelet's probes, not with the router or DNS. Re-assigning to Node/Kubelet for investigation.
Comment 10 Red Hat Bugzilla
2023-09-15 01:13:49 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days