Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
Patrick, if mds_cache_memory_limit is 1GB, the pod memory limit must be 2GB. As far as I can tell, the ocs-op resources have been set to 8GB as far back as release-4.2: https://github.com/openshift/ocs-operator/blob/release-4.2/pkg/controller/defaults/resources.go
So I'm not sure why that pod was allocated so little memory. Rook simply looks up the memory limit and applies a 50% ratio to it. Here are some notes from our code:

// MDS cache memory limit should be set to 50-60% of RAM reserved for the MDS container
// MDS uses approximately 125% of the value of mds_cache_memory_limit in RAM.
// Eventually we will tune this automatically: http://tracker.ceph.com/issues/36663
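To make the arithmetic concrete, here is a minimal sketch of the ratio described above. It is not Rook's actual code: the function and constant names are hypothetical, and only the 50% ratio and the 125% usage estimate come from the notes quoted in this comment. With an 8 GiB container limit it yields 4294967296 bytes, and with a 2 GiB limit it yields 1 GiB.

```go
package main

import "fmt"

const (
	// Assumption taken from the quoted notes: the cache gets 50% of the container limit.
	cacheRatioPercent = 50
	// Assumption taken from the quoted notes: the MDS uses about 125% of the cache limit in RAM.
	mdsOverheadPercent = 125
)

// cacheLimitBytes returns the mds_cache_memory_limit derived from a container memory limit.
// Hypothetical helper, named for illustration only.
func cacheLimitBytes(containerLimit uint64) uint64 {
	return containerLimit * cacheRatioPercent / 100
}

// expectedUsageBytes estimates the RAM the MDS would use for a given cache limit.
func expectedUsageBytes(cacheLimit uint64) uint64 {
	return cacheLimit * mdsOverheadPercent / 100
}

func main() {
	const gib = uint64(1) << 30
	for _, limit := range []uint64{2 * gib, 8 * gib} {
		cache := cacheLimitBytes(limit)
		fmt.Printf("container limit %d -> cache limit %d, estimated usage %d\n",
			limit, cache, expectedUsageBytes(cache))
	}
}
```

Under these assumptions the estimated usage (about 62.5% of the container limit) stays below the limit itself, which is the headroom the 50-60% guidance is meant to leave.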
mds_cache_memory_limit should be in the "ceph config dump" output. 1GB seems to be the default value of mds_cache_memory_limit. Could you look at the audit logs (from the mons) and grep for "mds_cache_memory_limit"? I don't know why, but it seems that the mds_cache_memory_limit setting was removed. Thanks.
Mudit, looks like the doc text is filled already.
(In reply to Sébastien Han from comment #30)
> Mudit, looks like the doc text is filled already.

That was filled in by the Ceph folks when the initial issue was reported. They fixed it, so the doc text type was "Bug Fix". But this is now a Rook issue and we have decided not to fix it in 4.7, so we should provide the doc text as "Known issue". If the existing doc text is still relevant then that's fine, but it talks about the Ceph fix.
Doc text was updated, so removing my needinfo.
Merged: https://github.com/openshift/rook/pull/223
Mudit, I edited the doc_text.
Started with OCP 4.3 and OCS 4.2 as the initial deployment, then upgraded OCS and OCP one version at a time. Once on OCS 4.6.4, I upgraded OCP to 4.7 and then OCS directly to the 4.7.1-403.ci internal build.

$ oc get csv -n openshift-storage
NAME                            DISPLAY                       VERSION        REPLACES                        PHASE
lib-bucket-provisioner.v2.0.0   lib-bucket-provisioner        2.0.0          lib-bucket-provisioner.v1.0.0   Succeeded
ocs-operator.v4.7.1-403.ci      OpenShift Container Storage   4.7.1-403.ci   ocs-operator.v4.6.4             Succeeded

$ oc rsh -n openshift-storage rook-ceph-tools-784547f7c7-qxfz7
sh-4.4# ceph config dump|grep mds_cache_memory_limit
mds.ocs-storagecluster-cephfilesystem-a   basic   mds_cache_memory_limit   4294967296
mds.ocs-storagecluster-cephfilesystem-b   basic   mds_cache_memory_limit   4294967296

So this looks OK and I will mark it as verified.
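For anyone repeating this verification, the manual check could be scripted. The following is a minimal sketch, not part of any Rook or OCS tooling: parseCacheLimits is a hypothetical helper, and it assumes the value is the whitespace-separated field immediately after the option name, as in the output pasted in this comment.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const optionName = "mds_cache_memory_limit"

// parseCacheLimits scans "ceph config dump"-style lines and returns a map from
// daemon name (first field) to the mds_cache_memory_limit value in bytes.
// Assumption: the value is the field right after the option name.
func parseCacheLimits(dump string) (map[string]uint64, error) {
	limits := make(map[string]uint64)
	for _, line := range strings.Split(dump, "\n") {
		fields := strings.Fields(line)
		for i, f := range fields {
			// Skip fields that are not the option, or lines with no daemon name or value.
			if f != optionName || i == 0 || i+1 >= len(fields) {
				continue
			}
			v, err := strconv.ParseUint(fields[i+1], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad value for %s: %w", fields[0], err)
			}
			limits[fields[0]] = v
		}
	}
	return limits, nil
}

func main() {
	// Sample input copied from the output in this comment.
	dump := `mds.ocs-storagecluster-cephfilesystem-a basic mds_cache_memory_limit 4294967296
mds.ocs-storagecluster-cephfilesystem-b basic mds_cache_memory_limit 4294967296`

	const want = 4 << 30 // 50% of the 8 GiB container limit
	limits, err := parseCacheLimits(dump)
	if err != nil {
		panic(err)
	}
	for who, got := range limits {
		fmt.Printf("%s: %d bytes, matches expected: %t\n", who, got, got == want)
	}
}
```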
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.7.1 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2449