Bug 2165321

Summary: The ROOK_LOG_LEVEL variable is not updated according to the rook-ceph-operator configmap
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Oded <oviner>
Component: rook
Assignee: Travis Nielsen <tnielsen>
Status: CLOSED NOTABUG
QA Contact: Neha Berry <nberry>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.12
CC: madam, ocs-bugs, odf-bz-bot
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-01-29 21:42:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Oded 2023-01-29 11:32:57 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The ROOK_LOG_LEVEL variable in the rook-ceph-operator pod does not change to debug mode after configuring the rook-ceph-operator-config configmap.
In addition, I restarted the rook-ceph-operator pod; however, the ROOK_LOG_LEVEL var is still not updated (it remains INFO instead of DEBUG).
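
A quick way to see the mismatch (a minimal sketch; the pod name is the one from this cluster, adjust as needed) is to compare the value requested in the configmap with the env var baked into the pod spec:

$ # Value requested via the configmap
$ oc -n openshift-storage get configmap rook-ceph-operator-config -o jsonpath='{.data.ROOK_LOG_LEVEL}'

$ # Default env var in the operator pod spec (stays INFO even after the patch)
$ oc -n openshift-storage get pod rook-ceph-operator-5cfcc7b6c6-rtcql -o jsonpath='{.spec.containers[0].env[?(@.name=="ROOK_LOG_LEVEL")].value}'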

Version of all relevant components (if applicable):
OCP Version: 4.12.0-0.nightly-2023-01-10-062211
ODF Version: 4.12.0-164
Provider: VMware

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Add the ROOK_LOG_LEVEL key to the data section of the rook-ceph-operator-config configmap:
$ oc -n openshift-storage patch ConfigMap rook-ceph-operator-config -n openshift-storage -p '{"data": {"ROOK_LOG_LEVEL": "DEBUG"}}' --type merge

$ oc get configmaps rook-ceph-operator-config  -o yaml 
apiVersion: v1
data:
  ROOK_LOG_LEVEL: DEBUG
kind: ConfigMap
metadata:
  creationTimestamp: "2023-01-16T06:27:21Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "10586417"
  uid: bf41c3d2-e22c-4410-bc6c-89f8fb3032c3

2. Check the rook-ceph-operator pod env:
$ oc get pod rook-ceph-operator-5cfcc7b6c6-rtcql -o yaml | grep -i ROOK_LOG_LEVEL -B 10 -A 1
spec:
  containers:
  - args:
    - ceph
    - operator
    env:
    - name: ROOK_CURRENT_NAMESPACE_ONLY
      value: "true"
    - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
      value: "false"
    - name: ROOK_LOG_LEVEL
      value: INFO


3. Respin the rook-ceph-operator pod:
$ oc delete pod rook-ceph-operator-5cfcc7b6c6-rtcql 
pod "rook-ceph-operator-5cfcc7b6c6-rtcql" deleted

4. Check the env of the new rook-ceph-operator pod:
$ oc get pod rook-ceph-operator-5cfcc7b6c6-xbjfq -o yaml | grep -i ROOK_LOG_LEVEL -B 10 -A 1
spec:
  containers:
  - args:
    - ceph
    - operator
    env:
    - name: ROOK_CURRENT_NAMESPACE_ONLY
      value: "true"
    - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
      value: "false"
    - name: ROOK_LOG_LEVEL
      value: INFO

5. Delete the osd-1 pod:
$ oc -n openshift-storage delete Pod rook-ceph-osd-1-846f5b7dd-lqfkv

6. Check the logs on the rook-ceph-operator pod (debug "D |" entries now appear):
$  oc -n openshift-storage logs rook-ceph-operator-5cfcc7b6c6-xbjfq
$  oc -n openshift-storage logs rook-ceph-operator-5cfcc7b6c6-xbjfq | grep rook-ceph-osd-1-846f5b7dd-lqfkv -A 10
2023-01-29 10:48:51.086766 D | ceph-crashcollector-controller: "rook-ceph-osd-1-846f5b7dd-lqfkv" is a ceph pod!
2023-01-29 10:48:51.086848 D | ceph-crashcollector-controller: reconciling node: "compute-2"
2023-01-29 10:48:51.087543 D | ceph-spec: ceph version found "16.2.10-94"
2023-01-29 10:48:51.095940 D | ceph-crashcollector-controller: deployment successfully reconciled for node "compute-2". operation: "updated"
2023-01-29 10:48:51.097254 D | op-k8sutil: returning version v1.25.4 instead of v1.25.4+77bec7a
2023-01-29 10:48:51.097275 D | ceph-crashcollector-controller: deleting cronjob if it exists...
2023-01-29 10:48:51.100652 D | ceph-crashcollector-controller: cronJob resource not found. Ignoring since object must be deleted.
2023-01-29 10:48:54.506394 D | ceph-spec: "ceph-object-store-user-controller": CephCluster resource "ocs-storagecluster-cephcluster" found in namespace "openshift-storage"
2023-01-29 10:48:54.506480 I | ceph-spec: ceph-object-store-user-controller: CephCluster "ocs-storagecluster-cephcluster" found but skipping reconcile since ceph health is &{Health:HEALTH_ERR Details:map[OSD_DOWN:{Severity:HEALTH_WARN Message:1 osds down} OSD_HOST_DOWN:{Severity:HEALTH_WARN Message:1 host (1 osds) down} OSD_OUT_OF_ORDER_FULL:{Severity:HEALTH_ERR Message:full ratio(s) out of order} OSD_RACK_DOWN:{Severity:HEALTH_WARN Message:1 rack (1 osds) down}] LastChecked:2023-01-29T10:48:50Z LastChanged:2023-01-16T12:20:25Z PreviousHealth:HEALTH_WARN Capacity:{TotalBytes:1649267441664 UsedBytes:244812259328 AvailableBytes:1404455182336 LastUpdated:2023-01-29T10:48:50Z} Versions:0xc001f5ba00 FSID:dbd95002-a75d-4c08-897e-4917a120fc6a}
2023-01-29 10:48:54.506487 D | ceph-spec: "ceph-object-store-user-controller": CephCluster "openshift-storage" initial reconcile is not complete yet...
2023-01-29 10:48:54.506497 D | ceph-object-store-user-controller: successfully configured CephObjectStoreUser "openshift-storage/noobaa-ceph-objectstore-user"

7. Change the log level back to INFO:
$ oc -n openshift-storage patch ConfigMap rook-ceph-operator-config -n openshift-storage -p '{"data": {"ROOK_LOG_LEVEL": "INFO"}}' --type merge
$ oc get configmaps rook-ceph-operator-config  -o yaml
apiVersion: v1
data:
  ROOK_LOG_LEVEL: INFO
kind: ConfigMap
metadata:
  creationTimestamp: "2023-01-16T06:27:21Z"
  name: rook-ceph-operator-config
  namespace: openshift-storage
  resourceVersion: "10600161"
  uid: bf41c3d2-e22c-4410-bc6c-89f8fb3032c3

8. Check the rook-ceph-operator pod env:
$ oc get pod rook-ceph-operator-5cfcc7b6c6-xbjfq -o yaml | grep -i ROOK_LOG_LEVEL -B 10 -A 1
spec:
  containers:
  - args:
    - ceph
    - operator
    env:
    - name: ROOK_CURRENT_NAMESPACE_ONLY
      value: "true"
    - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
      value: "false"
    - name: ROOK_LOG_LEVEL
      value: INFO

9. Delete the osd-2 pod:
$ oc -n openshift-storage delete Pod rook-ceph-osd-2-6f998bd857-4cn4w

10. Check the rook-ceph-operator logs:
$  oc -n openshift-storage logs rook-ceph-operator-5cfcc7b6c6-xbjfq
2023-01-29 10:57:10.760484 I | clusterdisruption-controller: reconciling osd pdb reconciler as the allowed disruptions in default pdb is 0


Actual results:
The ROOK_LOG_LEVEL variable in the rook-ceph-operator pod is INFO although ROOK_LOG_LEVEL=DEBUG was configured in the rook-ceph-operator-config configmap.

Expected results:
The ROOK_LOG_LEVEL value in the rook-ceph-operator pod matches the value in the rook-ceph-operator-config configmap.

Additional info:
MG:
http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2165321.tar.gz

Comment 2 Oded 2023-01-29 21:42:21 UTC
The env vars are set by us as defaults, but anyone who wants to override them specifies the value in the configmap.
The way Rook handles these values, the value in the configmap takes precedence over the env var.
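
In practice this means the env var shown in the pod spec only reflects the packaged default; the configmap entry is what actually takes effect, which is why step 6 shows debug ("D |") entries even though the pod env still reads INFO. A minimal way to check the effective level (a sketch, assuming ROOK_LOG_LEVEL=DEBUG is set in the configmap) is to look at the operator log rather than the pod spec:

$ # Debug-level lines are prefixed with " D | "; their presence confirms DEBUG is in effect
$ oc -n openshift-storage logs deploy/rook-ceph-operator --tail=100 | grep ' D | '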