Bug 1763308

Summary: Metering tables are stored in textformat instead of ORC
Product: OpenShift Container Platform Reporter: Chance Zibolski <chancez>
Component: Metering Operator Assignee: Chance Zibolski <chancez>
Status: CLOSED ERRATA QA Contact: Peter Ruan <pruan>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.2.0 CC: pruan, sd-operator-metering, talessio
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1763306 Environment:
Last Closed: 2019-11-19 13:49:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:
Bug Depends On: 1763306    
Bug Blocks:    

Description Chance Zibolski 2019-10-18 17:56:46 UTC
+++ This bug was initially created as a clone of Bug #1763306 +++

Description of problem: Metering tables are stored in TEXTFILE format by default rather than ORC, which is more efficient to query and store.


Version-Release number of selected component (if applicable): 4.2.0


How reproducible: Very


Steps to Reproduce:
1. Exec into the presto pod: `kubectl exec presto-coordinator-0 -c presto -i -t --namespace openshift-metering -- /usr/local/bin/presto-cli --server https://presto:8080 --catalog hive --schema default --user reporting-operator --keystore-path /opt/presto/tls/keystore.pem`
2. Run the query `show create table hive.metering.datasource_openshift_metering_pod_usage_memory_bytes;`
3. See the output indicating the file format:

                                   Create Table
----------------------------------------------------------------------------------
 CREATE TABLE hive.metering.datasource_openshift_metering_pod_usage_memory_bytes (
    amount double,
    timestamp timestamp,
    timeprecision double,
    labels map(varchar, varchar),
    dt varchar
 )
 WITH (
    format = 'TEXTFILE',
    partitioned_by = ARRAY['dt']
 )
(1 row)

Actual results: Table files are stored in TEXTFILE format, not ORC.


Expected results: Table files are stored in ORC format.
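
The check in the steps above can also be scripted rather than read by eye. The following is a minimal sketch, not part of the original report: `table_format` is a hypothetical helper that reads `show create table` output on stdin and prints the value of the `format` property. It assumes the output contains a line of the form `format = '...'`, as in the output shown above.

```shell
# Hypothetical helper: print the storage format found in
# `show create table` output read from stdin (e.g. TEXTFILE or ORC).
table_format() {
  # Match the first line like:  format = 'ORC',  and print only the value.
  sed -n "s/.*format = '\([A-Za-z_]*\)'.*/\1/p" | head -n 1
}

# Example usage (assumption: the presto-cli in the pod accepts --execute
# for non-interactive queries; -t is dropped because output is piped):
#
#   kubectl exec presto-coordinator-0 -c presto -i --namespace openshift-metering -- \
#     /usr/local/bin/presto-cli --server https://presto:8080 --catalog hive \
#     --schema default --user reporting-operator \
#     --keystore-path /opt/presto/tls/keystore.pem \
#     --execute "show create table hive.metering.datasource_openshift_metering_pod_usage_memory_bytes;" \
#     | table_format
```

On an affected cluster this would be expected to print TEXTFILE, and ORC once the fix is in place.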


Additional info:

Comment 3 Peter Ruan 2019-11-06 05:36:32 UTC
Verified against the PR; will test again once it's merged.
pruan@MacBook-Pro ~/workspace/BushSlicer (master)$ kubectl exec presto-coordinator-0 -c presto -i -t --namespace openshift-metering -- /usr/local/bin/presto-cli --server https://presto:8080 --catalog hive --schema default --user reporting-operator --keystore-path /opt/presto/tls/keystore.pem
presto:default>
presto:default> show create table hive.metering.datasource_openshift_metering_pod_usage_memory_bytes;
                                   Create Table
-----------------------------------------------------------------------------------
 CREATE TABLE hive.metering.datasource_openshift_metering_pod_usage_memory_bytes (
    amount double,
    timestamp timestamp,
    timeprecision double,
    labels map(varchar, varchar),
    dt varchar
 )
 WITH (
    format = 'ORC',
    partitioned_by = ARRAY['dt']
 )
(1 row)

Query 20191106_053458_00306_m4n85, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Comment 5 errata-xmlrpc 2019-11-19 13:49:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3869