Bug 1829035

Summary: OLM installation with a stage build failed due to invalid URL (space in URL name)
Product: OpenShift Container Platform Reporter: Peter Ruan <pruan>
Component: Metering Operator Assignee: tflannag
Status: CLOSED ERRATA QA Contact: Peter Ruan <pruan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4 CC: bparees, btofel, mifiedle, pweil, scuppett, sd-operator-metering, soconcar, tkatarki, vlaad, wsun, wzheng, yapei
Target Milestone: --- Keywords: Reopened, TestBlocker
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1829072 (view as bug list) Environment:
Last Closed: 2020-05-18 13:35:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1829072    
Bug Blocks:    
Attachments:
Description Flags
ansible output log none
presto-coordinator-0.log none

Description Peter Ruan 2020-04-28 18:37:06 UTC
Description of problem:

OLM installation with a stage build failed due to an invalid URL (space in the URL name):
TASK [meteringconfig : Log Events for validating configurations] ***************
task path: /opt/ansible/roles/meteringconfig/tasks/validate.yml:91
Tuesday 28 April 2020  18:09:29 +0000 (0:00:00.365)       0:00:14.305 ********* 
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: MaxRetryError: HTTPConnectionPool(host='localhost', port=8888): Max retries exceeded with url: /api/v1/namespaces/openshift-metering/events/Validate Configuration (Caused by ProtocolError('Connection aborted.', InvalidURL("URL can't contain control characters. '/api/v1/namespaces/openshift-metering/events/Validate Configuration' (found at least ' ')",)))
fatal: [localhost]: FAILED! => {"changed": false, "msg": "HTTPConnectionPool(host='localhost', port=8888): Max retries exceeded with url: /api/v1/namespaces/openshift-metering/events/Validate Configuration (Caused by ProtocolError('Connection aborted.', InvalidURL(\"URL can't contain control characters. '/api/v1/namespaces/openshift-metering/events/Validate Configuration' (found at least ' ')\",)))"}


Version-Release number of selected component (if applicable):
metering operator version 4.4.0-202004261927

How reproducible:
always.

Steps to Reproduce:
1. Log in to the cluster and install Metering via OperatorHub.
2.
3.

Actual results:
In the ansible pod log:
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: MaxRetryError: HTTPConnectionPool(host='localhost', port=8888): Max retries exceeded with url: /api/v1/namespaces/openshift-metering/events/Validate Configuration (Caused by ProtocolError('Connection aborted.', InvalidURL("URL can't contain control characters. '/api/v1/namespaces/openshift-metering/events/Validate Configuration' (found at least ' ')",)))


Expected results:
The installation succeeds.

Additional info:
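For illustration only, here is a minimal Python sketch of why a path like the one in the log is rejected and how percent-encoding the event name avoids the error. The helper names `event_path` and `is_rejected` are hypothetical, and this is not necessarily how the operator was fixed. It assumes a Python 3 standard library recent enough to validate request paths in `http.client`, which is where the "URL can't contain control characters" message comes from.

```python
import http.client
from urllib.parse import quote


def event_path(namespace: str, name: str) -> str:
    """Build an events API path, percent-encoding each path segment.

    Hypothetical helper for illustration; the path layout is copied
    from the error message in this report.
    """
    return "/api/v1/namespaces/{}/events/{}".format(
        quote(namespace, safe=""), quote(name, safe="")
    )


def is_rejected(path: str) -> bool:
    """Return True if http.client refuses to build a request for path.

    putrequest() only buffers the request line, so no network
    connection is attempted here.
    """
    conn = http.client.HTTPConnection("localhost", 8888)
    try:
        conn.putrequest("GET", path)
    except http.client.InvalidURL:
        return True
    finally:
        conn.close()
    return False


raw = "/api/v1/namespaces/openshift-metering/events/Validate Configuration"
encoded = event_path("openshift-metering", "Validate Configuration")
print(is_rejected(raw), is_rejected(encoded), encoded)
```

Giving the event a name without spaces would sidestep the problem as well; encoding is shown here only because it keeps the original name intact.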

Comment 3 Peter Ruan 2020-04-29 06:52:57 UTC
This fix did not work.  Although the metering installation got past the initial failure, the installation seems to be stuck in a loop.

Comment 4 Peter Ruan 2020-04-29 06:54:39 UTC
Created attachment 1682729 [details]
ansible output log

Comment 5 Peter Ruan 2020-04-29 08:09:02 UTC
After waiting for a long while, it seems the installation got through, but it was looping in the metering-operator installation, up to the hive pods, for almost 40 minutes.

pruan@MacBook-Pro ~/workspace/BushSlicer (fix_ERB_file_ref●)$ oc get pods
NAME                                 READY   STATUS    RESTARTS   AGE
hive-metastore-0                     0/2     Pending   0          14m
hive-server-0                        3/3     Running   0          14m
metering-operator-6dd9dc5d9d-mkpcp   2/2     Running   0          53m

Comment 6 Wenjing Zheng 2020-04-29 10:20:36 UTC
$ oc get pods
NAME                                 READY   STATUS             RESTARTS   AGE
hive-metastore-0                     2/2     Running            0          127m
hive-server-0                        3/3     Running            0          127m
metering-operator-5d5b857d79-djgjr   2/2     Running            0          164m
presto-coordinator-0                 1/2     CrashLoopBackOff   9          25m
reporting-operator-796d6f667-jdk6v   1/2     Running            3          25m
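
A listing like the one above can be scanned for unhealthy pods with a short script. The following is a rough sketch that assumes the default column layout of `oc get pods` (NAME, READY, STATUS, ...); the function name `not_ready_pods` is hypothetical.

```python
def not_ready_pods(oc_output: str):
    """Return (name, ready, status) for pods that are not fully ready.

    Assumes the default `oc get pods` table: a header row followed by
    whitespace-separated NAME, READY and STATUS columns.
    """
    flagged = []
    for line in oc_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total = ready.partition("/")
        if ready_count != total or status != "Running":
            flagged.append((name, ready, status))
    return flagged


sample = """\
NAME                                 READY   STATUS             RESTARTS   AGE
hive-metastore-0                     2/2     Running            0          127m
hive-server-0                        3/3     Running            0          127m
metering-operator-5d5b857d79-djgjr   2/2     Running            0          164m
presto-coordinator-0                 1/2     CrashLoopBackOff   9          25m
reporting-operator-796d6f667-jdk6v   1/2     Running            3          25m
"""

for pod in not_ready_pods(sample):
    print(pod)
```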

Comment 7 Yadan Pei 2020-04-29 10:50:03 UTC
On the stage cluster, PrestoServer failed to start with the error: No factory for connector [prometheus]
# oc get pods -n openshift-metering
NAME                                 READY   STATUS             RESTARTS   AGE
hive-metastore-0                     2/2     Running            0          11m
hive-server-0                        3/3     Running            0          11m
metering-operator-5d5b857d79-djgjr   2/2     Running            0          3h3m
presto-coordinator-0                 1/2     CrashLoopBackOff   6          11m
reporting-operator-796d6f667-7hmfw   1/2     Running            1          11m

# oc logs -f presto-coordinator-0 presto -n openshift-metering
... 
2020-04-29T10:18:21.075Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Loading catalog etc/catalog/tpch.properties --
2020-04-29T10:18:21.082Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Added catalog tpch using connector tpch --
2020-04-29T10:18:21.083Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Loading catalog etc/catalog/tpcds.properties --
2020-04-29T10:18:21.086Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Added catalog tpcds using connector tpcds --
2020-04-29T10:18:21.086Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Loading catalog etc/catalog/prometheus.properties --
2020-04-29T10:18:21.087Z	ERROR	main	io.prestosql.server.PrestoServer	No factory for connector [prometheus].  Available factories: [memory, kudu, blackhole, redis, accumulo, raptor-legacy, jmx, postgresql, elasticsearch, redshift, sqlserver, localfile, tpch, mysql, mongodb, example-http, tpcds, phoenix, system, cassandra, kafka, atop, hive-hadoop2, presto-thrift]
java.lang.IllegalArgumentException: No factory for connector [prometheus].  Available factories: [memory, kudu, blackhole, redis, accumulo, raptor-legacy, jmx, postgresql, elasticsearch, redshift, sqlserver, localfile, tpch, mysql, mongodb, example-http, tpcds, phoenix, system, cassandra, kafka, atop, hive-hadoop2, presto-thrift]
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
	at io.prestosql.connector.ConnectorManager.createConnection(ConnectorManager.java:178)
	at io.prestosql.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:88)
	at io.prestosql.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:68)
	at io.prestosql.server.PrestoServer.run(PrestoServer.java:127)
	at io.prestosql.server.PrestoServer.main(PrestoServer.java:70)

# oc rsh presto-coordinator-0
Defaulting container name to presto.
Use 'oc describe pod/presto-coordinator-0 -n openshift-metering' to see all of the containers in this pod.
sh-4.2$ cat etc/catalog/prometheus.properties
connector.name=prometheus
prometheus-uri=https://prometheus-k8s.openshift-monitoring.svc:9091/
query-chunk-size-duration=1h
max-query-range-duration=1d
cache-duration=30s
bearer-token-file=/var/run/secrets/kubernetes.io/serviceaccount/token


But on a cluster where Presto is running successfully, I see it is using the same configuration as above.
# oc get pods -n openshift-metering
NAME                                  READY   STATUS    RESTARTS   AGE
hive-metastore-0                      2/2     Running   0          40m
hive-server-0                         3/3     Running   0          40m
metering-operator-5888c8fc5c-kl2r2    2/2     Running   0          43m
presto-coordinator-0                  2/2     Running   0          39m
reporting-operator-57ddfdf96d-p8ds8   2/2     Running   0          38m
[root@preserved-qe-ui-rhel-1 operator-metering]# oc logs -f presto-coordinator-0 presto -n openshift-metering | grep prometheus
2020-04-29T10:10:47.741Z	INFO	main	io.prestosql.server.PluginManager	-- Loading plugin /var/presto/data/plugin/prometheus --
2020-04-29T10:10:47.741Z	INFO	main	io.prestosql.server.PluginManager	Classpath for prometheus:
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/animal-sniffer-annotations-1.14.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/aopalliance-1.0.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/bootstrap-0.184.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/bval-jsr-2.0.0.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/cglib-nodep-3.2.5.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/checker-qual-2.8.2.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/commons-codec-1.11.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/configuration-0.184.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/error_prone_annotations-2.3.3.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/guava-26.0-jre.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/guice-4.2.2.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/httpclient-4.5.8.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/httpcore-4.4.11.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/j2objc-annotations-1.1.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-core-2.10.0.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-databind-2.10.0.jar
2020-04-29T10:10:47.742Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-datatype-guava-2.10.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-datatype-jdk8-2.10.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-datatype-joda-2.10.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-datatype-jsr310-2.10.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jackson-module-parameter-names-2.10.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/javax.annotation-api-1.3.1.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/javax.inject-1.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jcl-over-slf4j-1.7.25.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jmxutils-1.21.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/json-0.184.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/jsr305-3.0.2.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/log-0.184.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/log-manager-0.184.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/log4j-over-slf4j-1.7.25.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/logback-core-1.2.3.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/okhttp-3.9.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/okio-1.13.0.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/presto-prometheus-322.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/slf4j-api-1.7.25.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/slf4j-jdk14-1.7.25.jar
2020-04-29T10:10:47.743Z	INFO	main	io.prestosql.server.PluginManager	    /var/presto/data/plugin/prometheus/validation-api-2.0.1.Final.jar
2020-04-29T10:10:47.753Z	INFO	main	io.prestosql.server.PluginManager	Installing io.prestosql.plugin.prometheus.PrometheusPlugin
2020-04-29T10:10:47.763Z	INFO	main	io.prestosql.server.PluginManager	Registering connector prometheus
2020-04-29T10:10:47.765Z	INFO	main	io.prestosql.server.PluginManager	-- Finished loading plugin /var/presto/data/plugin/prometheus --
2020-04-29T10:10:50.855Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Loading catalog etc/catalog/prometheus.properties --
2020-04-29T10:10:51.481Z	INFO	main	Bootstrap	prometheus-uri             http://localhost:9090  https://prometheus-k8s.openshift-monitoring.svc:9091/  Where to find Prometheus coordinator host
2020-04-29T10:10:51.817Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Added catalog prometheus using connector prometheus --
[root@preserved-qe-ui-rhel-1 operator-metering]# oc rsh presto-coordinator-0
Defaulting container name to presto.
Use 'oc describe pod/presto-coordinator-0 -n openshift-metering' to see all of the containers in this pod.
sh-4.2$ cat etc/catalog/prometheus.properties 
connector.name=prometheus
prometheus-uri=https://prometheus-k8s.openshift-monitoring.svc:9091/
query-chunk-size-duration=1h
max-query-range-duration=1d
cache-duration=30s
bearer-token-file=/var/run/secrets/kubernetes.io/serviceaccount/token

Comment 8 Yadan Pei 2020-04-29 11:01:10 UTC
Here is the log for the pods on the stage cluster (which is not running correctly).

It looks like io.prestosql.server.PluginManager didn't load the prometheus plugin:

$ oc logs -f presto-coordinator-0  presto -n openshift-metering | grep prometheus
2020-04-29T10:57:54.322Z	INFO	main	io.prestosql.metadata.StaticCatalogStore	-- Loading catalog etc/catalog/prometheus.properties --
2020-04-29T10:57:54.323Z	ERROR	main	io.prestosql.server.PrestoServer	No factory for connector [prometheus].  Available factories: [memory, kudu, blackhole, redis, accumulo, raptor-legacy, jmx, postgresql, elasticsearch, redshift, sqlserver, localfile, tpch, mysql, mongodb, example-http, tpcds, phoenix, system, cassandra, kafka, atop, hive-hadoop2, presto-thrift]
java.lang.IllegalArgumentException: No factory for connector [prometheus].  Available factories: [memory, kudu, blackhole, redis, accumulo, raptor-legacy, jmx, postgresql, elasticsearch, redshift, sqlserver, localfile, tpch, mysql, mongodb, example-http, tpcds, phoenix, system, cassandra, kafka, atop, hive-hadoop2, presto-thrift]
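
As a hedged illustration of the mismatch, the sketch below compares the `connector.name` from a catalog properties file with the factories listed in the error line. The function names are hypothetical and the parsing assumes the exact log wording shown in this report.

```python
import re


def available_factories(log_line: str) -> set:
    """Extract connector factory names from a 'No factory for connector' line."""
    match = re.search(r"Available factories: \[([^\]]*)\]", log_line)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def connector_name(properties_text: str):
    """Return the connector.name value from catalog properties text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "connector.name":
            return value.strip()
    return None


error_line = (
    "No factory for connector [prometheus].  Available factories: "
    "[memory, kudu, blackhole, redis, accumulo, raptor-legacy, jmx, postgresql, "
    "elasticsearch, redshift, sqlserver, localfile, tpch, mysql, mongodb, "
    "example-http, tpcds, phoenix, system, cassandra, kafka, atop, hive-hadoop2, "
    "presto-thrift]"
)
catalog = "connector.name=prometheus\ncache-duration=30s\n"

wanted = connector_name(catalog)
factories = available_factories(error_line)
print(wanted, wanted in factories)
```

The catalog file asks for a connector that the running image never registered, which is consistent with the plugin not being loaded rather than with a configuration difference.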


The full log will be attached.

Comment 9 Yadan Pei 2020-04-29 11:01:54 UTC
Created attachment 1682841 [details]
presto-coordinator-0.log

Comment 16 Mike Fiedler 2020-04-29 19:13:16 UTC
Moving back to assigned for fix in z stream

Comment 18 Wei Sun 2020-04-30 05:26:47 UTC
Per comment 16, changing the target release to 4.4.z.

Comment 27 Peter Ruan 2020-05-11 17:59:37 UTC
Verified with build 4.4.0-0.nightly-2020-05-08-224132, which has version metering-operator.4.4.0-202005081957. Metering installation was successful and reports can be generated.

sh-4.2$ ansible --version
ansible 2.9.7
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/usr/share/ansible/openshift']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Sep 26 2019, 13:23:47) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
sh-4.2$ 

pruan@MacBook-Pro ~/workspace/v3-testfiles/metering/rollup_reports (master●)$ oc get installplan
NAME            CSV                                    APPROVAL    APPROVED
install-ntfxv   metering-operator.4.4.0-202005081957   Automatic   true

pruan@MacBook-Pro ~/workspace/v3-testfiles/metering/rollup_reports (master●)$ oc get reports
NAME                              QUERY                         SCHEDULE   RUNNING    FAILED   LAST REPORT TIME       AGE
namespace-memory-usage-now        namespace-memory-usage                   Finished            2020-12-30T23:59:59Z   29m
persistentvolumeclaim-usage-now   persistentvolumeclaim-usage              Finished            2020-12-30T23:59:59Z   10s

Will test again with an official stage build.

Comment 28 Peter Ruan 2020-05-15 06:18:37 UTC
Verified again with stage build metering-operator payload 4.4.0-202005121717.

Comment 30 errata-xmlrpc 2020-05-18 13:35:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2133