Description of problem:
The must-gather tool cannot gather the endpoint data (metrics, healthz, version, ...) for the marketplace operator.

Version-Release number of selected component (if applicable):
cv:4.2.0-0.nightly-2019-06-30-221852

How reproducible:
100% of the time.

Steps to Reproduce:
1. Run the must-gather collection tool:
./openshift-must-gather inspect clusteroperator/marketplace --kubeconfig=/home

Actual results:
1) The collection logs the following errors:

./openshift-must-gather inspect clusteroperator/marketplace --kubeconfig=/home/jfan/work/env3/kubeconfig
E0701 16:44:11.092956 31034 portforward.go:331] an error occurred forwarding 37587 -> 60000: error forwarding port 60000 to pod d84bba70e1924bef41c63e10e1b42e44c4125596754ea46659f0b38f73f9b9e3, uid : exit status 1: 2019/07/01 08:44:10 socat[38586] E connect(5, AF=2 127.0.0.1:60000, 16): Connection refused
E0701 16:44:12.777536 31034 portforward.go:331] an error occurred forwarding 37587 -> 60000: error forwarding port 60000 to pod d84bba70e1924bef41c63e10e1b42e44c4125596754ea46659f0b38f73f9b9e3, uid : exit status 1: 2019/07/01 08:44:12 socat[38693] E connect(5, AF=2 127.0.0.1:60000, 16): Connection refused
E0701 16:44:14.425401 31034 portforward.go:331] an error occurred forwarding 37587 -> 60000: error forwarding port 60000 to pod d84bba70e1924bef41c63e10e1b42e44c4125596754ea46659f0b38f73f9b9e3, uid : exit status 1: 2019/07/01 08:44:14 socat[38711] E connect(5, AF=2 127.0.0.1:60000, 16): Connection refused
2019/07/01 16:44:14 Gathering data for pod "redhat-operators-7886454bd6-rfvbh"
2019/07/01 16:44:14 Unable to gather previous container logs: previous terminated container "redhat-operators" in pod "redhat-operators-7886454bd6-rfvbh" not found
Error: one or more errors ocurred while gathering pod-specific data for namespace: openshift-marketplace [one or more errors ocurred while gathering container data for pod certified-operators-575db8f595-vwflb: [unable to gather container /healthz: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /version: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /metrics: Get https://localhost:37587/metrics: tls: first record does not look like a TLS handshake], one or more errors ocurred while gathering container data for pod community-operators-cf58468c6-mbtjc: [unable to gather container /healthz: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /version: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /metrics: Get https://localhost:37587/metrics: tls: first record does not look like a TLS handshake], one or more errors ocurred while gathering container data for pod marketplace-operator-774bc7f648-ktghr: [unable to gather container /healthz: Get https://localhost:37587/: EOF, unable to gather container /version: Get https://localhost:37587/: EOF, unable to gather container /metrics: Get https://localhost:37587/metrics: EOF], one or more errors ocurred while gathering container data for pod redhat-operators-7886454bd6-rfvbh: [unable to gather container /healthz: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /version: Get https://localhost:37587/: tls: first record does not look like a TLS handshake, unable to gather container /metrics: Get https://localhost:37587/metrics: tls: first record does not look like a TLS handshake]]

2) There is no metrics, healthz, or version information in the pods directory:

├── pods
│   ├── certified-operators-575db8f595-vwflb
│   │   ├── certified-operators
│   │   │   └── certified-operators
│   │   │       ├── healthz
│   │   │       └── logs
│   │   │           ├── current.log
│   │   │           └── previous.log
│   │   └── certified-operators-575db8f595-vwflb.yaml

Expected results:
1) The collection should finish without errors:
"2019/07/02 15:52:22 Finished successfully with no errors."

2) The pods directory should contain metrics, healthz, and version information, as it does for the apiserver:

├── apiserver-9tt5f
│   ├── apiserver-9tt5f.yaml
│   ├── fix-audit-permissions
│   │   └── fix-audit-permissions
│   │       └── logs
│   │           ├── current.log
│   │           └── previous.log
│   └── openshift-apiserver
│       └── openshift-apiserver
│           ├── healthz
│           │   ├── index
│           │   ├── log
│           │   ├── ping
│           │   ├── poststarthook
│           │   │   ├── apiservice-openapi-controller
│           │   │   ├── authorization.openshift.io-bootstrapclusterroles
│           │   │   ├── authorization.openshift.io-ensureopenshift-infra
│           │   │   ├── clientCA-reload
│           │   │   ├── generic-apiserver-start-informers
│           │   │   ├── image.openshift.io-apiserver-caches
│           │   │   ├── openshift.io-restmapperupdater
│           │   │   ├── openshift.io-startinformers
│           │   │   ├── project.openshift.io-projectauthorizationcache
│           │   │   ├── project.openshift.io-projectcache
│           │   │   ├── quota.openshift.io-clusterquotamapping
│           │   │   ├── requestheader-reload
│           │   │   └── security.openshift.io-bootstrapscc
│           │   └── ready
│           ├── logs
│           │   ├── current.log
│           │   └── previous.log
│           ├── metrics.json
│           └── version.json

https://github.com/openshift/must-gather/blob/master/pkg/cmd/inspect/resource.go#L148-L164

Additional info:
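For triage, the aggregated "Error:" message in the actual results can be broken down per pod and per endpoint. The following is a minimal Python sketch and is not part of must-gather. The function name failures_by_pod, the SAMPLE constant, and both regular expressions are assumptions derived only from the message format shown in the log in this report; a different must-gather version may word the message differently.

```python
import re

# Assumed shape of one per-pod section, taken from the log in this report:
#   "... gathering container data for pod <name>: [<endpoint errors>]"
POD_RE = re.compile(r"gathering container data for pod (\S+): \[(.*?)\]")

# Assumed shape of one endpoint error inside a per-pod section:
#   "unable to gather container /<endpoint>: Get <url>: <reason>"
ENDPOINT_RE = re.compile(
    r"unable to gather container (/\w+): Get \S+ "
    r"(.*?)(?=, unable to gather container|$)"
)


def failures_by_pod(message):
    """Return {pod name: {endpoint: reason}} parsed from the aggregated error."""
    # Collapse line breaks so a message wrapped by a terminal still parses.
    flat = " ".join(message.split())
    result = {}
    for pod, body in POD_RE.findall(flat):
        result[pod] = dict(ENDPOINT_RE.findall(body))
    return result


# Shortened excerpt of the error reported above (two pods only).
SAMPLE = (
    "Error: one or more errors ocurred while gathering pod-specific data "
    "for namespace: openshift-marketplace [one or more errors ocurred while "
    "gathering container data for pod certified-operators-575db8f595-vwflb: "
    "[unable to gather container /healthz: Get https://localhost:37587/: "
    "tls: first record does not look like a TLS handshake, unable to gather "
    "container /metrics: Get https://localhost:37587/metrics: tls: first "
    "record does not look like a TLS handshake], one or more errors ocurred "
    "while gathering container data for pod "
    "marketplace-operator-774bc7f648-ktghr: [unable to gather container "
    "/healthz: Get https://localhost:37587/: EOF]]"
)

for pod, endpoints in failures_by_pod(SAMPLE).items():
    for endpoint, reason in endpoints.items():
        print(pod, endpoint, reason)
```

Grouping the failures this way makes it easier to see that the operand pods all fail with the TLS handshake message while the marketplace-operator pod fails with EOF.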
How does this relate to https://bugzilla.redhat.com/show_bug.cgi?id=1717638
(In reply to Eric Rich from comment #2)
> How does this relate to https://bugzilla.redhat.com/show_bug.cgi?id=1717638

That bug is against OLM, not the marketplace operator.
must-gather is not looking to capture the liveness/readiness probes per se, but rather the following endpoints, which the marketplace operator and operand pods do not implement:

/healthz -- see https://github.com/kubernetes/apiserver/tree/master/pkg/server/healthz
/version -- see https://github.com/kubernetes/apiserver/blob/master/pkg/server/routes/version.go
/metrics -- see https://github.com/kubernetes/apiserver/blob/master/pkg/server/routes/metrics.go

I've adjusted must-gather to better detect which pods do not support these endpoints.
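To illustrate the kind of detection involved, the sketch below maps the error strings seen in this report to a likely cause. It is a hypothetical Python helper, not the logic that must-gather actually uses: the function name classify_probe_error and the labels it returns are assumptions, and the interpretation of each message is a general reading of these errors rather than something confirmed for these pods.

```python
def classify_probe_error(reason: str) -> str:
    """Map a probe error message to a coarse, hypothetical label."""
    text = reason.strip().lower()
    if "does not look like a tls handshake" in text:
        # The port answered, but not with TLS (for example plain HTTP or
        # another protocol), so an HTTPS GET cannot succeed.
        return "not-tls"
    if "connection refused" in text:
        # Nothing is listening on the forwarded port inside the pod.
        return "not-listening"
    if text == "eof":
        # The connection was closed before any response arrived.
        return "closed-early"
    return "other"


def likely_unsupported(reasons) -> bool:
    """Assumption: a pod whose probes all fail with one of the known
    transport-level labels probably does not serve these endpoints."""
    labels = {classify_probe_error(r) for r in reasons}
    return bool(labels) and "other" not in labels
```

A collector could use such a check to skip a pod quietly instead of reporting an error, while still surfacing failures it does not recognize.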
(In reply to Luis Sanchez from comment #6)
> must-gather is not looking to capture the liveness/readiness probes
> per se, but the following endpoints that the marketplace operator and
> operand pods do not implement:
>
> /healthz -- see https://github.com/kubernetes/apiserver/tree/master/pkg/server/healthz
> /version -- see https://github.com/kubernetes/apiserver/blob/master/pkg/server/routes/version.go
> /metrics -- see https://github.com/kubernetes/apiserver/blob/master/pkg/server/routes/metrics.go
>
> I've adjusted must-gather to better detect which pods do not support these
> endpoints.

Would it not be better to open new bugs against the pods that don't support these endpoints, so that they add support for them?
It doesn't look like this is going to make 4.2; moving to 4.3.