Bug 1763306 - Metering tables are stored in textformat instead of ORC
Summary: Metering tables are stored in textformat instead of ORC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Metering Operator
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.3.0
Assignee: Emily Moss
QA Contact: Peter Ruan
URL:
Whiteboard:
Depends On:
Blocks: 1763308
Reported: 2019-10-18 17:55 UTC by Chance Zibolski
Modified: 2020-01-23 11:08 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1763308
Environment:
Last Closed: 2020-01-23 11:08:20 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GitHub | operator-framework operator-metering pull 988 | 0 | None | closed | bug 1763306: Fix setting default fileformat for Hive tables | 2019-11-25 23:10:06 UTC
Red Hat Product Errata | RHBA-2020:0062 | 0 | None | None | None | 2020-01-23 11:08:40 UTC

Description Chance Zibolski 2019-10-18 17:55:41 UTC
Description of problem: Metering tables are stored in TEXTFILE format by default rather than ORC, which is more efficient to query and store.


Version-Release number of selected component (if applicable): 4.3.0


How reproducible: Very


Steps to Reproduce:
1. Exec into the presto pod: `kubectl exec presto-coordinator-0 -c presto -i -t --namespace openshift-metering -- /usr/local/bin/presto-cli --server https://presto:8080 --catalog hive --schema default --user reporting-operator --keystore-path /opt/presto/tls/keystore.pem`
2. Run the query `show create table hive.metering.datasource_openshift_metering_pod_usage_memory_bytes;`
3. See the output indicating the fileformat:

                                   Create Table
----------------------------------------------------------------------------------
 CREATE TABLE hive.metering.datasource_openshift_metering_pod_usage_memory_bytes (
    amount double,
    timestamp timestamp,
    timeprecision double,
    labels map(varchar, varchar),
    dt varchar
 )
 WITH (
    format = 'TEXTFILE',
    partitioned_by = ARRAY['dt']
 )
(1 row)

Actual results: Table files are stored in TEXTFILE format, not ORC.


Expected results: Table files are stored in ORC format.
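
The manual check in the steps above could be scripted. The following is a minimal sketch, not part of the original report: `table_format` is a hypothetical helper name, and it assumes the `SHOW CREATE TABLE` output is supplied on stdin in the same layout shown in step 3.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): print the value of the
# `format` table property found in SHOW CREATE TABLE output read from stdin.
# Only the first match is printed.
table_format() {
    sed -n "s/.*format[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" | head -n 1
}

# Example input: the WITH clause copied from the output in step 3.
printf "%s\n" \
    " WITH (" \
    "    format = 'TEXTFILE'," \
    "    partitioned_by = ARRAY['dt']" \
    " )" | table_format
```

On an affected cluster the helper would report TEXTFILE for this table; after the fix it should report ORC.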


Additional info:

Comment 1 Peter Ruan 2019-10-22 23:51:58 UTC
Verified with 4.3.0-0.nightly-2019-10-22-101148 and the `metering` master branch.

1. Delete the existing bucket, or empty its contents, to avoid duplicate data from previous installs.
2. Run `git checkout master` and sync to the latest commit.
3. Run `./hack/openshift-install.sh`.
4. presto:default> show create table hive.metering.datasource_openshift_metering_pod_usage_memory_bytes;
                                   Create Table
-----------------------------------------------------------------------------------
 CREATE TABLE hive.metering.datasource_openshift_metering_pod_usage_memory_bytes (
    amount double,
    timestamp timestamp,
    timeprecision double,
    labels map(varchar, varchar),
    dt varchar
 )
 WITH (
    format = 'ORC',
    partitioned_by = ARRAY['dt']
 )
(1 row)
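
For repeated verification, the same check could be turned into a pass/fail gate. This is a rough sketch under assumptions: `require_orc` is a hypothetical name, and it expects the `SHOW CREATE TABLE` output on stdin. How that output is captured (for example by running the Presto CLI non-interactively with the connection flags from the description) is left to the caller.

```shell
#!/bin/sh
# Hypothetical gate (not from the bug report): succeed only if the
# SHOW CREATE TABLE output on stdin declares format = 'ORC'.
require_orc() {
    if grep -Eq "format[[:space:]]*=[[:space:]]*'ORC'"; then
        echo "OK: table is stored as ORC"
    else
        echo "FAIL: table is not stored as ORC" >&2
        return 1
    fi
}

# Example input: the WITH clause copied from the verified output above.
printf "%s\n" \
    " WITH (" \
    "    format = 'ORC'," \
    "    partitioned_by = ARRAY['dt']" \
    " )" | require_orc
```

Because the function returns a non-zero status on any other format, it can be used directly in a script that should stop when a table is still stored as TEXTFILE.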

Comment 4 errata-xmlrpc 2020-01-23 11:08:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

