Created attachment 1536117 [details]
cmo-kubechart.png

Currently, the CMO takes about 6-8 minutes to roll out all the monitoring components, but it does so serially rather than in parallel (see the attached kubechart). Is there a reason for this? If not, let's parallelize it: the CMO is deployed late in install/upgrade and is on the critical path to completing install/upgrade. The telemeter-client is currently the last pod to start in the cluster.
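For context, here is a minimal sketch, assuming a Go operator and the golang.org/x/sync/errgroup package, of how independent component rollouts could be driven in parallel rather than one after another. The task names and the runParallel helper are illustrative, not the actual CMO task runner:

// Minimal sketch: run independent rollout tasks concurrently and
// collect the first error. Not the actual CMO code.
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/sync/errgroup"
)

// task stands in for one component rollout (node-exporter, kube-state-metrics, ...).
type task struct {
	name string
	run  func(ctx context.Context) error
}

// runParallel starts every task in its own goroutine and waits for all of them.
// The shared context is canceled as soon as any task returns an error.
func runParallel(ctx context.Context, tasks []task) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, t := range tasks {
		t := t // capture the loop variable for the goroutine
		g.Go(func() error {
			if err := t.run(ctx); err != nil {
				return fmt.Errorf("task %q failed: %w", t.name, err)
			}
			return nil
		})
	}
	return g.Wait()
}

func main() {
	// Placeholder tasks that just sleep; real tasks would apply manifests
	// and wait for the corresponding workloads to become ready.
	sleepTask := func(ctx context.Context) error {
		select {
		case <-time.After(100 * time.Millisecond):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	tasks := []task{
		{name: "node-exporter", run: sleepTask},
		{name: "kube-state-metrics", run: sleepTask},
		{name: "telemeter-client", run: sleepTask},
	}
	if err := runParallel(context.Background(), tasks); err != nil {
		fmt.Println("rollout error:", err)
	}
}

With this shape the total wall-clock time is roughly the slowest single task rather than the sum of all of them, which is the difference the kubechart is highlighting.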
Created attachment 1538613 [details]
kubechart-2.png

I see the changes, but there now seems to be a lot of time where nothing is happening (see the new attachment). Basically:

t=0  - CMO starts
+1m - prometheus-operator starts (>1m image pull time)
+2m - prometheus-operator running
+3m - everything except prometheus and prometheus-adapter starts
+6m - prometheus and prometheus-adapter start

By my observation, things are starting more in parallel, but because components spend so long just sitting around doing nothing, the rollout still takes about the same amount of time :-/
As per Comment 6, change back to MODIFIED
From the chart, monitoring components are now deployed in parallel, but it still took about 12 minutes to roll out all the monitoring components. Other operators, such as openshift-kube-scheduler-operator and openshift-marketplace, also took about 12 minutes to roll out all of their components.

cluster-monitoring-operator-775cccc768-b7sj7  "2019-04-09T21:22:59.633855813-04:00", "2019-04-09T21:34:03.680993632-04:00"
node-exporter-qtj7g                           "2019-04-09T21:22:59.632973119-04:00", "2019-04-09T21:34:03.680989644-04:00"
node-exporter-7r4r4                           "2019-04-09T21:22:59.634100905-04:00", "2019-04-09T21:34:03.680997416-04:00"
node-exporter-fmgxk                           "2019-04-09T21:23:14.330596029-04:00", "2019-04-09T21:34:03.680992082-04:00"
node-exporter-r6xxk                           "2019-04-09T21:25:21.89682946-04:00",  "2019-04-09T21:34:03.680997923-04:00"
prometheus-operator-5ff75f95fc-k854z          "2019-04-09T21:25:22.795222761-04:00", "2019-04-09T21:34:03.680995725-04:00"
node-exporter-lvt8c                           "2019-04-09T21:25:32.744990996-04:00", "2019-04-09T21:34:03.680990378-04:00"
node-exporter-nnvpc                           "2019-04-09T21:26:09.737593613-04:00", "2019-04-09T21:34:03.680998403-04:00"
telemeter-client-8d885568b-9prt4              "2019-04-09T21:26:29.870418508-04:00", "2019-04-09T21:34:03.680993034-04:00"
kube-state-metrics-697cd6f695-wsvmf           "2019-04-09T21:26:43.857177907-04:00", "2019-04-09T21:34:03.680997163-04:00"
grafana-56879d5757-bbxvg                      "2019-04-09T21:27:18.222775024-04:00", "2019-04-09T21:34:03.680994595-04:00"
alertmanager-main-0                           "2019-04-09T21:27:25.66836269-04:00",  "2019-04-09T21:34:03.680994115-04:00"
alertmanager-main-1                           "2019-04-09T21:27:47.949750711-04:00", "2019-04-09T21:34:03.68099098-04:00"
alertmanager-main-2                           "2019-04-09T21:28:17.606583558-04:00", "2019-04-09T21:34:03.680991418-04:00"
prometheus-k8s-1                              "2019-04-09T21:28:51.757987889-04:00", "2019-04-09T21:34:03.681027516-04:00"
prometheus-k8s-0                              "2019-04-09T21:29:53.943373903-04:00", "2019-04-09T21:34:03.681027175-04:00"
prometheus-adapter-7cc8fbcbd-4ldtm            "2019-04-09T21:30:13.559570559-04:00", "2019-04-09T21:34:03.680995126-04:00"
prometheus-adapter-7cc8fbcbd-9ttsx            "2019-04-09T21:30:13.559570559-04:00", "2019-04-09T21:34:03.680995126-04:00"
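In case it helps with reproducing the numbers above, here is a hedged sketch, assuming client-go, of dumping pod names and creation timestamps from the openshift-monitoring namespace; this is an assumption about how such data could be gathered, not necessarily the tool that was actually used:

// Sketch: print pod name and creation timestamp for every pod in
// openshift-monitoring. Assumes a client-go version whose List call
// takes a context (v0.18+).
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig using the default rules (KUBECONFIG or ~/.kube/config).
	config, err := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
		clientcmd.NewDefaultClientConfigLoadingRules(),
		&clientcmd.ConfigOverrides{},
	).ClientConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := client.CoreV1().Pods("openshift-monitoring").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Printf("%s\t%s\n", p.Name, p.CreationTimestamp.Format(time.RFC3339Nano))
	}
}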
Created attachment 1554009 [details]
kubechart-3
payload: 4.0.0-0.nightly-2019-04-05-165550
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758