Created attachment 1882222 [details]
registry-cephfs-rwx-pvc details page

Description of problem:
--------------------------
After a successful fresh deployment of an OCP + ODF cluster with the versions mentioned below, the management console (Home --> Overview) fires the following alert:

"May 19, 2022, 4:15 PM
The PersistentVolume claimed by registry-cephfs-rwx-pvc in Namespace openshift-image-registry only has 0% free inodes."

Version-Release number of selected component:
-----------------------------------------------
ODF : 4.11.0-75
OCP : 4.11.0-0.nightly-2022-05-18-171831

How reproducible:
------------------
3/3

Steps to Reproduce:
---------------------
1. Deploy an OCP + ODF cluster.
2. Go to the management console, Home --> Overview.

Actual results:
------------------
(a) OCP fires a wrong alert.
(b) There is no actual inode issue, as files can still be created on the volume.

Expected results:
-------------------
The alert should fire only when the volume is actually low on free inodes.

Additional info:
------------------
I do not see any inode issue; I was able to create files.

[root@localhost 11-ocp]# oc get pods -n openshift-image-registry
NAME                                               READY   STATUS      RESTARTS        AGE
cluster-image-registry-operator-78d977c67c-9bgtd   1/1     Running     1 (3d21h ago)   3d21h
image-pruner-27551520-tscxc                        0/1     Completed   0               2d7h
image-pruner-27552960-j922z                        0/1     Completed   0               31h
image-pruner-27554400-76fhf                        0/1     Completed   0               7h33m
image-registry-58f4cfb7f5-ldf4l                    1/1     Running     0               3d20h
node-ca-5gfzd                                      1/1     Running     0               3d21h
node-ca-6xg5k                                      1/1     Running     0               3d21h
node-ca-9sc2c                                      1/1     Running     0               3d21h
node-ca-dljr7                                      1/1     Running     0               3d21h
node-ca-jdxfd                                      1/1     Running     0               3d21h
node-ca-p9xwl                                      1/1     Running     0               3d21h

=================================================

[root@localhost 11-ocp]# oc rsh -n openshift-image-registry image-registry-58f4cfb7f5-ldf4l
sh-4.4$ df -i
Filesystem        Inodes    IUsed     IFree  IUse%  Mounted on
overlay         62651840   115707  62536133     1%  /
tmpfs            8243906       17   8243889     1%  /dev
tmpfs            8243906       17   8243889     1%  /sys/fs/cgroup
shm              8243906        1   8243905     1%  /dev/shm
tmpfs            8243906     4864   8239042     1%  /etc/passwd
172.30.21.130:6789,172.30.130.222:6789,172.30.208.208:6789:/volumes/csi/csi-vol-a5050c60-d760-11ec-b65a-0a580a800213/77564c2a-db11-4ad3-86eb-681acc0f30c1
                       1        -         -      -  /registry
tmpfs            8243906        7   8243899     1%  /etc/secrets
/dev/sda4       62651840   115707  62536133     1%  /etc/hosts
tmpfs            8243906        5   8243901     1%  /var/lib/kubelet
tmpfs            8243906        5   8243901     1%  /run/secrets/openshift/serviceaccount
tmpfs            8243906       11   8243895     1%  /run/secrets/kubernetes.io/serviceaccount
tmpfs            8243906        1   8243905     1%  /proc/acpi
tmpfs            8243906        1   8243905     1%  /proc/scsi
tmpfs            8243906        1   8243905     1%  /sys/firmware
Hmm, I am not sure why I am the assignee here. Trying to reset.
Reassigning to ODF team for further input/triage since the OCP console is only responsible for rendering the alert.
Hi Bipul, I have attached screenshots of the alerts in comment #5 and comment #6. Please let me know if anything else is required.

Thanks,
Mugdha Soni
The mounted cephfs volume does not show any size; df output from the above:

Filesystem  Inodes  IUsed  IFree  IUse%  Mounted on
172.30.21.130:6789,172.30.130.222:6789,172.30.208.208:6789:/volumes/csi/csi-vol-a5050c60-d760-11ec-b65a-0a580a800213/77564c2a-db11-4ad3-86eb-681acc0f30c1
                 1      -      -      -  /registry

Do you know why? It is then reported as zero to Prometheus, which then thinks the volume is full. It looks like a cephfs issue to me. Kubelet should not treat `-` as zero and we should fix that, but the root cause is IMO somewhere else (cephfs? kernel?).
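For reference, the alert text matches the KubePersistentVolumeInodesFillingUp rule from kubernetes-mixin, whose critical expression is roughly

  kubelet_volume_stats_inodes_free{job="kubelet"} / kubelet_volume_stats_inodes{job="kubelet"} < 0.03

so a free-inode count exported as 0 makes the ratio evaluate to 0% free and the alert fires immediately.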
Unfortunately, gRPC / protobuf does not allow kubelet to distinguish between "available is not set" and "available is set to 0", as 0 is the default value of int64 fields:

> Note that for scalar message fields, once a message is parsed there's no way of telling whether
> a field was explicitly set to the default value (for example whether a boolean was set to false)
> or just not set at all: you should bear this in mind when defining your message types.

https://developers.google.com/protocol-buffers/docs/proto3#default

The CSI driver must be fixed to report some value for available inodes. MAXINT64 would probably work.
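To illustrate, here is a minimal sketch using the generated Go types from the CSI spec (github.com/container-storage-interface/spec); the scenario is hypothetical, but the field and enum names are the real ones:

package main

import (
	"fmt"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

func main() {
	// A driver that leaves Available unset on the INODES usage entry...
	unset := &csi.VolumeUsage{Unit: csi.VolumeUsage_INODES}

	// ...produces exactly the same message on the wire as one that
	// explicitly reports 0 free inodes, because 0 is the proto3 default
	// for int64 fields and default values are not serialized.
	zero := &csi.VolumeUsage{Unit: csi.VolumeUsage_INODES, Available: 0}

	fmt.Println(unset.GetAvailable(), zero.GetAvailable()) // prints: 0 0
}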
Alternatively, you can report just the free space and not report any inode counts at all; it seems that cephfs does not really care about them.
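A minimal sketch of that alternative, assuming the usual NodeGetVolumeStats response shape from the CSI spec (the helper and its parameters are hypothetical; total/available/used would come from statfs on the mount):

package driver

import (
	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// volumeStatsWithoutInodes is a hypothetical helper that reports only
// byte usage for the volume.
func volumeStatsWithoutInodes(total, available, used int64) *csi.NodeGetVolumeStatsResponse {
	return &csi.NodeGetVolumeStatsResponse{
		Usage: []*csi.VolumeUsage{
			{
				Unit:      csi.VolumeUsage_BYTES,
				Total:     total,
				Available: available,
				Used:      used,
			},
			// Deliberately no VolumeUsage_INODES entry: kubelet then has
			// no inode numbers to export, rather than a misleading zero.
		},
	}
}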
Not a 4.11 blocker
Bug 2132270 has been reported for this issue as well. CephFS will not report inode information anymore once that BZ is closed.

*** This bug has been marked as a duplicate of bug 2132270 ***