Bug 2245068

Summary: [GSS] ODF console topology page is throwing error "TypeError: a metadata.ownerReference is undefined"
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Sonal <sarora>
Component: management-console Assignee: Bipul Adhikari <badhikar>
Status: CLOSED ERRATA QA Contact: Daniel Osypenko <dosypenk>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.12 CC: badhikar, ebenahar, edonnell, martinsson.patrik, odf-bz-bot, skatiyar, smitra, tdesala
Target Milestone: ---   
Target Release: ODF 4.17.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 4.17.0-105 Doc Type: Bug Fix
Doc Text:
.Pods created in `openshift-storage` by end users no longer cause errors

Previously, when a pod was created in `openshift-storage` by an end user, it would cause the console topology page to break. This was because pods without any `ownerReferences` were not considered to be part of the design. With this fix, pods without owner references are filtered out, and only pods with correct `ownerReferences` are shown. This allows the topology page to work correctly even when pods are arbitrarily added to the `openshift-storage` namespace.
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-10-30 14:25:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2266313    
Bug Blocks: 2281703    

Description Sonal 2023-10-19 14:29:19 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

After deploying ODF from manifests, the Topology page in the ODF console (Storage -> Data Foundation -> Topology) throws the error below (see screenshot).

"TypeError: a metadata.ownerReference is undefined"

An ODF installation performed via the GUI is functional, with working statistics in the GUI.

Version of all relevant components (if applicable):
ODF 4.12.7
OCP 4.12.34

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
It doesn't look professional 

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes, in the customer's environment

Can this issue be reproduced from the UI?
No

If this is a regression, please provide more details to justify this:
Not sure

Steps to Reproduce:
1. Install ODF using manifests. 
2. Once ODF is installed, check the Topology page in the ODF GUI.



Actual results:
The Topology page is not functional

Expected results:
Topology page should show the topology view of ODF cluster.

Additional info:
In next private comment

Comment 4 Patrik Martinsson 2023-10-20 08:56:14 UTC
Hi, 

I am the customer who reported this issue as a support case, but I figured I would continue the discussion here. 

As described, the problem is that the "Topology view" wasn't working: it crashed and threw the following,

Component trace:
 TypeError : Cannot read properties of undefined (reading '0')
 at ta (https://xxx/api/plugins/odf-console/exposed-topology-chunk.js:1:66082)

Stack trace: 
 TypeError: Cannot read properties of undefined (reading '0')
 at me (https://xxx/api/plugins/odf-console/exposed-topology-chunk.js:1:35027)
 at https://xxx/api/plugins/odf-console/exposed-topology-chunk.js:1:66747

When I tried to debug this through the browser console, it took me to the following snippet in the JS code, 

 "return a ? a.kind === i.Y2.kind ? a : me(a.metadata.ownerReferences[0].uid, ...t) : null" 

This suggests that the error is thrown because "something" is missing an ownerReference. 
Unfortunately the code doesn't say which resource it is, which made this harder. 
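
For illustration, the minified expression above can be read as a recursive walk up the owner chain: when a resource has no `metadata.ownerReferences`, indexing `[0]` on `undefined` throws. The following is a minimal sketch of a guarded version of that walk, not the actual odf-console code or the fix that shipped. The names `K8sResource`, `OwnerReference`, `findByUid`, and `findOwnerOfKind` are hypothetical, and the resource shape is simplified.

```typescript
// Hypothetical, simplified resource shape; not the real odf-console types.
interface OwnerReference {
  uid: string;
  name?: string;
}

interface K8sResource {
  kind: string;
  metadata: {
    uid: string;
    name: string;
    ownerReferences?: OwnerReference[];
  };
}

// Look up a resource by UID in a flat list of watched resources.
function findByUid(uid: string, resources: K8sResource[]): K8sResource | undefined {
  for (const r of resources) {
    if (r.metadata.uid === uid) {
      return r;
    }
  }
  return undefined;
}

// Walk up the owner chain until a resource of `targetKind` is found.
// Returns null (instead of throwing) when the chain is broken because
// a resource has no ownerReferences or its owner is not in the list.
function findOwnerOfKind(
  uid: string,
  targetKind: string,
  resources: K8sResource[],
): K8sResource | null {
  const current = findByUid(uid, resources);
  if (!current) {
    return null;
  }
  if (current.kind === targetKind) {
    return current;
  }
  // Optional chaining guards against a missing or empty ownerReferences array.
  const ownerUid = current.metadata.ownerReferences?.[0]?.uid;
  return ownerUid ? findOwnerOfKind(ownerUid, targetKind, resources) : null;
}
```

With a guard like this, a resource without an owner would simply not resolve to a parent, so the page would not crash. Whether such a resource should then be hidden or shown is a separate design decision.
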
However, looking at lines 62-69 of the file 'odf-console/packages/odf/components/topology/Topology.tsx' gives a clue about which resources you seem to query. 

 nodeResource,
 odfDaemonSetResource,
 odfDeploymentsResource,
 odfPodsResource,
 odfReplicaSetResource,
 odfStatefulSetResource,
 storageClusterResource,

Starting from the top, and working my way down with a command something like, 

$ > for resource in $(oc get <RESOURCE-TYPE> | grep -v NAME | cut -d " " -f 1); do
      echo -e "${resource}\n"
      oc get <RESOURCE-TYPE> ${resource} -o json | jq '{name: .metadata.name, ownerReferences: (if .metadata.ownerReferences then .metadata.ownerReferences else "!!!!!!!!! NO REFERENCE !!!!!!!!!" end)}';
    done

I got to the deployment resource and found the following (output is shortened)

{
  "name": "csi-cephfsplugin",
  "ownerReferences": [
    {
      "name": "rook-ceph-operator",
      "uid": "xx"
    }
  ]
}
csi-cephfsplugin-holder-ocs-storagecluster-cephcluster

{
  "name": "csi-cephfsplugin-holder-ocs-storagecluster-cephcluster",
  "ownerReferences": "!!!!!!!!! NO REFERENCE !!!!!!!!! "
}
csi-rbdplugin

{
  "name": "csi-rbdplugin",
  "ownerReferences": [
    {
      "name": "rook-ceph-operator",
      "uid": "xx"
    }
  ]
}
csi-rbdplugin-holder-ocs-storagecluster-cephcluster

{
  "name": "csi-rbdplugin-holder-ocs-storagecluster-cephcluster",
  "ownerReferences": "!!!!!!!!! NO REFERENCE !!!!!!!!! "
}

Hm, so that is interesting: we have two deployments that don't have an ownerReference, namely

 csi-rbdplugin-holder-ocs-storagecluster-cephcluster
 csi-cephfsplugin-holder-ocs-storagecluster-cephcluster

These seem to come from, 
- https://github.com/rook/rook/blob/master/pkg/operator/ceph/csi/template/cephfs/csi-cephfsplugin-holder.yaml
- https://github.com/rook/rook/blob/master/pkg/operator/ceph/csi/template/rbd/csi-rbdplugin-holder.yaml

I'm not sure how they end up without an ownerReference, but manually adding one, such as the "rook-ceph-operator", makes the topology page work. 
Not sure if that will cause any side-effects, but I will go with that for now. 

You can of course decide what to do with this information, but I think two action points can be taken, 

1 ) Understand why the deployments "csi-rbdplugin-holder-ocs-storagecluster-cephcluster" and "csi-cephfsplugin-holder-ocs-storagecluster-cephcluster" are missing an ownerReference. 
2 ) Improve the logic in the JS to reveal which resource is actually the problem (i.e. it would have saved me quite some time if the JS code had told me that "the ownerReference is missing from resource xyz"). 
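
As a rough illustration of the second action point, the owner lookup could name the offending resource when the chain is broken. This is only a sketch under assumptions: `TopologyResource`, `OwnerRef`, `lookupByUid`, and `resolveOwner` are hypothetical names, the resource shape is simplified, and it is not how odf-console implements the lookup.

```typescript
// Hypothetical, simplified resource shape for illustration only.
interface OwnerRef {
  uid: string;
}

interface TopologyResource {
  kind: string;
  metadata: {
    uid: string;
    name: string;
    ownerReferences?: OwnerRef[];
  };
}

// Look up a resource by UID in a flat list of watched resources.
function lookupByUid(uid: string, resources: TopologyResource[]): TopologyResource | undefined {
  for (const r of resources) {
    if (r.metadata.uid === uid) {
      return r;
    }
  }
  return undefined;
}

// Same owner-chain walk as in the minified snippet, but a resource without
// ownerReferences produces an error that names the resource.
function resolveOwner(
  uid: string,
  targetKind: string,
  resources: TopologyResource[],
): TopologyResource | null {
  const current = lookupByUid(uid, resources);
  if (!current) {
    return null;
  }
  if (current.kind === targetKind) {
    return current;
  }
  const owner = current.metadata.ownerReferences?.[0];
  if (!owner) {
    throw new Error(
      `${current.kind} "${current.metadata.name}" has no ownerReferences; ` +
        `cannot resolve its ${targetKind} owner`,
    );
  }
  return resolveOwner(owner.uid, targetKind, resources);
}
```

With an error message of this form, the failing resource (for example a deployment named "csi-rbdplugin-holder-ocs-storagecluster-cephcluster") would show up directly in the stack trace instead of having to be found by hand.
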

Best regards,
Patrik
Sweden

Comment 6 Sunil Kumar Acharya 2024-05-15 09:21:12 UTC
Moving the non-blocker BZ out of ODF-4.16.0 due to the blocker-only phase. If this BZ should be considered a blocker, feel free to propose it back with a justification note.

Comment 14 Sunil Kumar Acharya 2024-09-18 12:06:54 UTC
Please update the RDT flag/text appropriately.

Comment 19 errata-xmlrpc 2024-10-30 14:25:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.17.0 Security, Enhancement, & Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:8676