Description of problem:
The IBM Cloud COS (S3) driver creates a memory-mapped tmpfs of the exact size required to hold a password: 4 KB. Because that filesystem is always full, it triggers a NodeFilesystemAlmostOutOfSpace critical alert, and customers have to create a silence.

Version-Release number of selected component (if applicable):
4.6 and higher

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
This alert:
Filesystem on tmpfs at 53.13.174.22 has only 0.00% available space left.
Due to this filesystem:
tmpfs 4.0K 4.0K 0 100% /var/lib/ibmc-s3fs/97efabc827b4c933d1b5d3df035a95b50bb07da92bfbe817b79c42aa3e0484ec

Expected results:
No alert.

Additional info:
Could the NodeFilesystemAlmostOutOfSpace alert query ignore filesystems (or maybe just tmpfs) below some size? I think the expectation is that tmpfs instances are created with no size specified, which defaults to a maximum size of 50% of physical memory. A 4K tmpfs is clearly small enough that being 100% full is probably unavoidable.
@jmcmeek.com Thanks for the report. We're trying to figure out what the best exclusion criterion would be. Can you please paste the /proc/mounts entry or the output of the mount command for this filesystem here?
@jfajersk Is this what you wanted?

sh-4.4# cat /proc/mounts | grep s3fs
tmpfs /var/lib/ibmc-s3fs/99ad9dbdeaf708f1ae4818365b393c8c77f83236baf7bf851e66f79de9900615 tmpfs rw,seclabel,relatime,size=4k 0 0
s3fs /var/data/kubelet/pods/25dec7c1-a572-4667-b5e4-836783c48815/volumes/ibm~ibmc-s3fs/pvc-5b3c4dad-9dd1-4097-a1ce-f38f5a09aae7 fuse.s3fs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
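For anyone checking other nodes for the same pattern, here is a minimal Python sketch that reads /proc/mounts-style text and lists tmpfs mounts whose explicit size= option is below a cutoff. The function names and the 1 MiB cutoff are assumptions made for illustration; they are not taken from the driver or from the alert rules. The sample data is the output pasted above.

```python
import re

# Sample lines copied from the /proc/mounts output in this comment.
SAMPLE_MOUNTS = """\
tmpfs /var/lib/ibmc-s3fs/99ad9dbdeaf708f1ae4818365b393c8c77f83236baf7bf851e66f79de9900615 tmpfs rw,seclabel,relatime,size=4k 0 0
s3fs /var/data/kubelet/pods/25dec7c1-a572-4667-b5e4-836783c48815/volumes/ibm~ibmc-s3fs/pvc-5b3c4dad-9dd1-4097-a1ce-f38f5a09aae7 fuse.s3fs rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
"""

# Multipliers for the size suffixes handled by this sketch.
UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(value):
    """Convert a size option such as '4k' to bytes.

    Returns None for forms this sketch does not handle (for example '50%').
    """
    match = re.fullmatch(r"(\d+)([kmg]?)", value.lower())
    if match is None:
        return None
    return int(match.group(1)) * UNITS[match.group(2)]


def small_tmpfs_mounts(mounts_text, threshold_bytes=1024 * 1024):
    """Return (mountpoint, size_in_bytes) for tmpfs mounts below the cutoff.

    threshold_bytes is a hypothetical cutoff chosen for illustration only.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] != "tmpfs":
            continue
        # Turn 'rw,size=4k' into {'rw': '', 'size': '4k'}.
        options = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        size = parse_size(options.get("size", ""))
        if size is not None and size < threshold_bytes:
            found.append((fields[1], size))
    return found


if __name__ == "__main__":
    for mountpoint, size in small_tmpfs_mounts(SAMPLE_MOUNTS):
        print(f"{mountpoint}: {size} bytes")
```

On a live node the same function could be given the contents of /proc/mounts instead of the sample string. A tmpfs mounted without a size= option is skipped, since its limit is not visible in the options field.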
Thanks, that's the info I was after. I have a pretty good idea how to improve the alert now and will propose a solution upstream.
I proposed a change to the alert generation upstream: https://github.com/prometheus/node_exporter/pull/2446 This would allow us to ignore tmpfs instances under /var/lib/ibmc-s3fs/ for these alerts, while keeping alerts for other tmpfs instances intact. In telemeter, the majority of these alerts are related to /var/lib/ibmc-s3fs/, but there are alerts for /run and /var as well, so we want to keep those.
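The intended effect of that exclusion can be illustrated with a small Python sketch. This is not the content of the upstream pull request, which changes how the alert rules are generated; the regular expression, the function name, and the 3% threshold below are assumptions chosen only to show the behaviour described here: tmpfs mounts under /var/lib/ibmc-s3fs/ are ignored, while tmpfs mounts such as /run and /var still alert.

```python
import re

# Hypothetical exclusion pattern: any mount point below /var/lib/ibmc-s3fs/.
EXCLUDED_MOUNTPOINTS = re.compile(r"^/var/lib/ibmc-s3fs/.+")


def should_alert(fstype, mountpoint, avail_percent, threshold_percent=3.0):
    """Decide whether an almost-out-of-space alert would fire.

    threshold_percent is an assumed value for illustration, not read from
    any real rule file.
    """
    # Skip the fixed-size credential tmpfs created by the s3fs driver.
    if fstype == "tmpfs" and EXCLUDED_MOUNTPOINTS.match(mountpoint):
        return False
    return avail_percent < threshold_percent


if __name__ == "__main__":
    samples = [
        ("tmpfs", "/var/lib/ibmc-s3fs/97efabc827b4c933d1b5d3df035a95b50bb07da92bfbe817b79c42aa3e0484ec", 0.0),
        ("tmpfs", "/run", 0.0),
        ("tmpfs", "/var", 1.0),
    ]
    for fstype, mountpoint, avail in samples:
        print(mountpoint, should_alert(fstype, mountpoint, avail))
```

Matching on the mount point rather than on filesystem size keeps the exclusion narrow, so a full tmpfs anywhere else is still reported.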
Thanks! An ibmc-s3fs specific solution is fine. Hopefully we (or someone else) won't create another one of these.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:1326