Created attachment 1026627 [details]
sosreport

Description of problem:

Here is part of the log:

~~~
# cat /var/log/ovirt-engine-reports/jasperserver.log
Caused by: org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.   (<------ this is the network error)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:134)
        at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:123)
        at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:28)
        at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:19)
        at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:29)
        at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:22)
        at org.postgresql.Driver.makeConnection(Driver.java:391)
        at org.postgresql.Driver.connect(Driver.java:265)
        at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:303)
        ... 15 more
Caused by: java.net.ConnectException: Connection refused
~~~
This means that PostgreSQL is either down, not installed, or blocked by a firewall. Is that the case?
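One way to tell these cases apart is to probe the database port directly. The following is only a minimal sketch, not something taken from this report: the helper name `probe_tcp` is hypothetical, and the interpretation is a rule of thumb. A result of "refused" usually means nothing is listening on that address and port (or a firewall actively rejects the connection), while "timeout" more often points to packets being silently dropped.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP endpoint as 'open', 'refused', 'timeout' or 'error: ...'.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: no listener, or an active reject rule.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, for example an unreachable network or a bad hostname.
        return f"error: {exc}"
```

For the setup in this report, `probe_tcp("127.0.0.1", 5432)` would be the obvious first check, since the `ps` output later in the thread shows the postmaster started with `-p 5432`.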
Hi Yaniv, Thanks for the reminder. I will check that with the customer.
Hi Yaniv, 1.Here is the output of "ps": ~~~ # less ps|grep post postfix 461 0.0 0.0 80936 3372 ? S 12:49 0:00 pickup -l -t fifo -u root 1713 0.0 0.0 80856 2548 ? Ss Mar26 0:22 /usr/libexec/postfix/master postfix 1721 0.0 0.0 81108 2672 ? S Mar26 0:04 qmgr -l -t fifo -u postgres 7311 0.1 0.5 269852 84976 ? Ss May21 10:08 postgres: engine engine 127.0.0.1(59064) idle postgres 12177 0.0 0.0 217340 5780 ? S May07 1:10 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data postgres 12179 0.0 0.0 178984 1132 ? Ss May07 1:54 postgres: logger process postgres 12181 0.0 0.2 217600 35272 ? Ss May07 19:59 postgres: writer process postgres 12182 0.0 0.0 217340 1400 ? Ss May07 2:54 postgres: wal writer process postgres 12183 0.0 0.0 217980 1812 ? Ss May07 1:52 postgres: autovacuum launcher process postgres 12184 0.0 0.0 179588 1512 ? Ss May07 8:08 postgres: stats collector process postgres 12188 0.0 0.0 221020 8212 ? Ss May07 1:54 postgres: engine engine ::1(37920) idle ovirt 16573 0.4 1.7 6781972 287884 ? Sl May07 108:20 ovirt-engine-dwhd -Dorg.ovirt.engine.dwh.settings=/tmp/tmpfYoZgi/settings.properties -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -classpath /usr/share/ovirt-engine-dwh/lib/*::/usr/share/java/dom4j-eap6/dom4j-1.6.1.redhat-6-sources.jar:/usr/share/java/dom4j-eap6/dom4j-1.6.1.redhat-6.jar:/usr/share/java/commons-collections.jar:/usr/share/java/postgresql-jdbc.jar ovirt_engine_dwh.historyetl_3_5.HistoryETL --context=Default postgres 16587 0.0 0.0 218828 11180 ? Ss May07 3:59 postgres: engine engine 127.0.0.1(43981) idle postgres 16588 0.0 0.3 233412 49328 ? Ss May07 5:00 postgres: engine engine 127.0.0.1(43982) idle postgres 16589 0.0 0.2 222816 41736 ? Ss May07 10:23 postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(43983) idle postgres 16590 0.0 0.2 219652 33144 ? Ss May07 0:03 postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(43984) idle in transaction postgres 16591 0.0 0.0 219632 10916 ? 
Ss May07 0:00 postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(43985) idle in transaction postgres 16592 0.0 0.2 220028 39316 ? Ss May07 2:17 postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(43986) idle in transaction postgres 16923 0.0 0.2 235828 33192 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44014) idle postgres 16929 0.0 0.1 226712 24744 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44016) idle postgres 16946 0.0 0.1 227328 24280 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44025) idle postgres 16947 0.0 0.1 226732 23068 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44026) idle postgres 16949 0.0 0.1 234120 31284 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44027) idle postgres 16950 0.0 0.1 227324 23424 ? Ss May07 0:12 postgres: ovirt_engine_reports ovirt_engine_reports 127.0.0.1(44028) idle postgres 21357 0.1 0.6 293920 108492 ? Ss May20 12:54 postgres: engine engine 127.0.0.1(58545) idle postgres 21477 0.1 0.6 292228 109840 ? Ss May20 12:49 postgres: engine engine 127.0.0.1(58547) idle postgres 21484 0.1 0.5 276516 93424 ? Ss May20 12:50 postgres: engine engine 127.0.0.1(58553) idle postgres 22608 0.1 0.5 276456 93596 ? Ss May20 12:37 postgres: engine engine 127.0.0.1(58596) idle postgres 29055 0.0 0.0 221844 11052 ? Ss May11 0:00 postgres: ovirt_engine_history ovirt_engine_history 127.0.0.1(46346) idle postgres 29801 0.1 0.5 269344 83576 ? 
Ss May20 11:24 postgres: engine engine 127.0.0.1(58867) idle
~~~

2. Here are the installed packages:

~~~
# less installed-rpms |grep sql
mysql-libs-5.1.73-3.el6_5.x86_64                    Fri Mar 20 08:24:42 2015
postgresql-8.4.20-1.el6_5.x86_64                    Wed Mar 25 23:45:01 2015
postgresql-jdbc-8.4.704-2.el6.noarch                Thu Apr  9 15:04:56 2015
postgresql-libs-8.4.20-1.el6_5.x86_64               Wed Mar 25 23:44:44 2015
postgresql-server-8.4.20-1.el6_5.x86_64             Wed Mar 25 23:45:02 2015
qt-sqlite-4.6.2-28.el6_5.x86_64                     Fri Mar 20 08:24:40 2015
sqlite-3.6.20-1.el6.x86_64                          Fri Mar 20 08:23:41 2015
~~~

3. There is no firewall on this machine.

4. Here is part of the log:

~~~
2015-05-07 16:10:18,255 WARN  [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (quartzScheduler_QuartzSchedulerThread) Destroying connection that is not valid, due to the following exception: org.postgresql.jdbc4.Jdbc4Connection@67fa4139: org.postgresql.util.PSQLException: FATAL: terminating connection due to administrator command
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2094) [postgresql.jar:]
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1827) [postgresql.jar:]
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255) [postgresql.jar:]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:508) [postgresql.jar:]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:370) [postgresql.jar:]
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:362) [postgresql.jar:]
        at org.jboss.jca.adapters.jdbc.CheckValidConnectionSQL.isValidConnection(CheckValidConnectionSQL.java:74)
        at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnectionFactory.isValidConnection(BaseWrapperManagedConnectionFactory.java:1100)
        at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.checkValid(BaseWrapperManagedConnection.java:513)
        at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.matchManagedConnections(LocalManagedConnectionFactory.java:513)
        at org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreArrayListManagedConnectionPool.getConnection(SemaphoreArrayListManagedConnectionPool.java:314)
        at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getSimpleConnection(AbstractPool.java:406)
        at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getConnection(AbstractPool.java:378)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:354)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:327)
        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:491)
        at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:143)
        at org.springframework.scheduling.quartz.LocalDataSourceJobStore$2.getConnection(LocalDataSourceJobStore.java:129) [spring-context-support-3.1.0.RELEASE.jar:3.1.0.RELEASE]
        at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108) [quartz-2.1.2.jar:]
        at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165) [quartz-2.1.2.jar:]
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3807) [quartz-2.1.2.jar:]
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2751) [quartz-2.1.2.jar:]
        at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:264) [quartz-2.1.2.jar:]
~~~

5. Please check the attached sosreport.

6. The customer has destroyed this environment, so it is no longer possible to collect any more data from it.
Hi, 1. Did you try restarting the postgresql service and the ovirt-engine-reportsd service? 2. Please attach the logs under /var/log/ovirt-engine/ and /var/log/ovirt-engine-reports/.
Hi, The attachment is a tarball of rhevm's logs. Thanks Victor
Hi, The attached sosreport does not contain the logs I need. Please provide the logs I referred to in comment #4. Also, please confirm whether you tried restarting the postgresql service and the ovirt-engine-reportsd service.
Hi,

1. I'm sorry, but the logs you asked for are contained in the attachment.

~~~
[root@victor newlog]# tar xf rhevm.log.gz     <------ this is the attachment
[root@victor newlog]# ls
rhevm-2015050514501430808611  rhevm.log.gz
[root@victor newlog]# cd rhevm-2015050514501430808611/var/log/ovirt-engine
ovirt-engine/         ovirt-engine-dwh/     ovirt-engine-reports/
[root@victor newlog]# cd rhevm-2015050514501430808611/var/log/ovirt-engine
[root@victor ovirt-engine]# ls
boot.log     engine.log           host-deploy  ovirt-iso-uploader   redhat-support-plugin-rhev.log  setup
console.log  engine.log-20150505  notifier     ovirt-log-collector  server.log
[root@victor ovirt-engine]# cd ../ovirt-engine-reports/
[root@victor ovirt-engine-reports]# ls
boot.log  console.log  jasperserver.log  reports.log  server.log
~~~

>>>please reply if you tried restarting the postgresql service? and ovirt-engine-reportsd service ?

The customer tried that.

Regards
Victor
>>>1. Is the jasper server on the same machine as the engine? The Jasper report is combined with RHEVM in the same virtual machine >>>2. Is the fqdn of the machine fully qualified? there is local DNS resolve in /etc/hosts. ~~~ # cat etc/hosts 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 10.2.22.165 rhevm.lab.com 10.2.22.119 rhevh.lab.com # cat etc/sysconfig/network NETWORKING=yes HOSTNAME=rhevm.lab.com NTPSERVERARGS=iburst ~~~ Thanks Victor
I've reproduced this issue successfully. Jasper Reports was newly installed on this machine (last night).

How to reproduce:

~~~
# yum install rhevm-dwh rhevm-reports
# engine-setup
Configure Data Warehouse on this host (Yes, No) [Yes]:Yes
Configure Reports on this host (Yes, No) [Yes]:Yes
Setup can automatically configure the firewall on this system.
Note: automatic configuration of the firewall may overwrite current settings.
Do you want Setup to configure the firewall? (Yes, No) [Yes]:No
...
(Just as the steps of the RHEV3.5 Installation Guide)
~~~

1. Firewall status

~~~
# /etc/init.d/iptables status
Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination
~~~

2. SELinux status

~~~
# getenforce
Disabled
~~~

3. FQDN status

~~~
[root@rhevm ovirt-engine]# hostname
rhevm.cchen.redhat.com
[root@rhevm ovirt-engine]# hostname -s
rhevm
[root@rhevm ovirt-engine]# hostname -f
rhevm.cchen.redhat.com
~~~

Thanks
Victor
Hi , BTW, I checked my database: Here is detailed steps: ~~~ -bash-4.1$ psql psql (8.4.20) Type "help" for help. postgres=# \l List of databases Name | Owner | Encoding | Collation | Ctype | Access privileges -------------------------------------+-------------------------------------+----------+-------------+-------------+----------------------- engine | engine | UTF8 | en_US.UTF-8 | en_US.UTF-8 | ovirt_engine_history | ovirt_engine_history | UTF8 | en_US.UTF-8 | en_US.UTF-8 | ovirt_engine_history_20150603102707 | ovirt_engine_history_20150603102707 | UTF8 | en_US.UTF-8 | en_US.UTF-8 | ovirt_engine_reports | ovirt_engine_reports | UTF8 | en_US.UTF-8 | en_US.UTF-8 | ovirt_engine_reports_20150603102714 | ovirt_engine_reports_20150603102714 | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | rhevm_history | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres : postgres=CTc/postgres template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres : postgres=CTc/postgres (9 rows) postgres=# \c ovirt_engine_history psql (8.4.20) You are now connected to database "ovirt_engine_history". 
ovirt_engine_history=# \d List of relations Schema | Name | Type | Owner --------+----------------------------------------------------------+----------+---------------------- public | calendar | table | ovirt_engine_history public | cluster_configuration | table | ovirt_engine_history public | configuration_seq | sequence | ovirt_engine_history public | datacenter_configuration | table | ovirt_engine_history public | datacenter_daily_history | table | ovirt_engine_history public | datacenter_history_seq1 | sequence | ovirt_engine_history public | datacenter_history_seq2 | sequence | ovirt_engine_history ovirt_engine_history=# \d+ datacenter_daily_history Table "public.datacenter_daily_history" Column | Type | Modifiers | Storage | Description ----------------------------------+--------------+---------------------------------------------------------------+---------+------------- history_id | bigint | not null default nextval('datacenter_history_seq3'::regclass) | plain | history_datetime | date | not null | plain | datacenter_id | uuid | not null | plain | datacenter_status | smallint | not null | plain | minutes_in_status | numeric(7,2) | not null default 1 | main | datacenter_configuration_version | integer | not null | plain | Indexes: "datacenter_daily_history_pkey" PRIMARY KEY, btree (history_id) "datacenter_daily_history_datacenter_id_idx" btree (datacenter_id) "idx_datacenter_configuration_version_daily" btree (datacenter_configuration_version) "idx_datacenter_history_datetime_daily" btree (history_datetime) Foreign-key constraints: "datacenter_daily_history_datacenter_configuration_version_fkey" FOREIGN KEY (datacenter_configuration_version) REFERENCES datacenter_configuration(history_id) Has OIDs: yes ovirt_engine_history=# SELECT * from datacenter_daily_history ovirt_engine_history-# ; history_id | history_datetime | datacenter_id | datacenter_status | minutes_in_status | datacenter_configuration_version 
------------+------------------+---------------+-------------------+-------------------+---------------------------------- (0 rows) ~~~
1. Are the databases on the same machine as the engine, dwh and reports? 2. You should connect to ovirt_engine_history_20150603102707, which is the database currently used on your setup, and run the SQL query again. 3. All hostnames must be fully resolvable through DNS. We do not support workarounds in /etc/hosts. 4. Please provide the dwh log after the service restart.
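To see which names on a machine are pinned in /etc/hosts rather than resolved through DNS, the file can be parsed directly. This is only an illustrative sketch: the function name `non_loopback_host_entries` is hypothetical, and the sample text is the hosts file posted earlier in this thread.

```python
import ipaddress


def non_loopback_host_entries(hosts_text: str) -> dict[str, list[str]]:
    """Map each non-loopback IP in hosts-file text to the names pinned to it."""
    pinned: dict[str, list[str]] = {}
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        try:
            if ipaddress.ip_address(ip).is_loopback:
                continue  # 127.0.0.1 and ::1 entries are expected
        except ValueError:
            continue  # first field is not an IP address; skip the malformed line
        if names:
            pinned.setdefault(ip, []).extend(names)
    return pinned


# Hosts file content as posted earlier in this thread.
SAMPLE_HOSTS = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.2.22.165 rhevm.lab.com
10.2.22.119 rhevh.lab.com
"""

pinned_names = non_loopback_host_entries(SAMPLE_HOSTS)
```

Any name that shows up in the result is one that should be checked against the real DNS server before relying on it.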
Sorry, please provide all the logs again, not only the dwh log.
Hi,

I can reproduce this every time I reinstall, and so can my colleagues and my customers. So I'm wondering whether this is a known bug:

~~~
couldn't get the status of "Uptime"/"Planned Downtime"/"Unplanned Downtime" of "Single Host Uptime(BR8)" in "Jasper-report".
~~~

It may also happen in "Single Host Resource Usage" in Jasper Reports. It would help if you could install and run Jasper Reports to check that.

Thanks
Victor
Hi Shirly,

Thank you for your work on this bug. I work with Victor and we are currently working on this hot issue together. It is a hot issue because our partner and SA are delivering a PoC to the customer, but some items in the BRXXX library fail to show data. This is causing considerable business impact, and I hope we can cooperate to figure it out.

Let's take BR8 as an example. If my understanding is correct, the UI page fills in the Uptime/Planned Downtime/Unplanned Downtime data using the following SQL query (please correct me if I'm wrong):

# vim /usr/share/ovirt-engine-reports/ovirt-reports/resources/Reports/Service_level/Hosts/single_host_uptime_br8_files/single_host_uptime_br8_jrxml.data

    SELECT
        SUM (
            CASE
                WHEN v3_5_statistics_hosts_resources_usage_$P!{table_name}.host_status = 2
                /* If "Period" equals to "Daily" then "table_name" parameter equals to "hourly" else "daily" */
                THEN COALESCE ( v3_5_statistics_hosts_resources_usage_$P!{table_name}.minutes_in_status, 0 )
                ELSE 0
            END
        ) AS planned_downtime_mins,
        SUM (
            CASE
                WHEN v3_5_statistics_hosts_resources_usage_$P!{table_name}.host_status = 3
                THEN COALESCE ( v3_5_statistics_hosts_resources_usage_$P!{table_name}.minutes_in_status, 0 )
                ELSE 0
            END
        ) AS unplanned_downtime_mins,
        SUM (
            CASE
                WHEN v3_5_statistics_hosts_resources_usage_$P!{table_name}.host_status = 1
                THEN COALESCE ( v3_5_statistics_hosts_resources_usage_$P!{table_name}.minutes_in_status, 0 )
                ELSE 0
            END
        ) AS uptime_mins,
        SUM ( v3_5_statistics_hosts_resources_usage_$P!{table_name}.minutes_in_status ) AS total
    FROM v3_5_statistics_hosts_resources_usage_$P!{table_name}

Two hours ago I did a fresh install of RHEV 3.5 and added a host to it.

engine=# SELECT host_name,vds_name from vds ;
   host_name   | vds_name
---------------+----------
 10.66.192.136 | h1
(1 row)

I had a look at the v3_5_statistics_hosts_resources_usage_XXXXXX tables, but there seems to be no data in them.

ovirt_engine_history=# SELECT count(*) from v3_5_statistics_hosts_resources_usage_hourly ;
 count
-------
     0
(1 row)

ovirt_engine_history=# SELECT count(*) from v3_5_statistics_hosts_resources_usage_daily ;
 count
-------
     0
(1 row)

I'm not sure whether this is why all three items show "Null".

As Victor mentioned, all of us (on-site SA, partner, customers, Victor and I) are encountering the same issue. Please help us confirm whether that is the cause (or whether we are misunderstanding the usage of BRXXX). We look forward to your feedback.

My test environment (if you need it):
10.66.219.175
root/redhat
admin/redhat

All the databases are deployed on the same box, and hostnames are resolved correctly through DNS.

Best Regards,
Chen
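The link between the empty tables and the "Null" values can be checked in isolation: in SQL, SUM over zero rows yields NULL rather than 0, so the BR8 query has nothing but NULLs to return while the usage tables are empty. The sketch below illustrates this with an in-memory SQLite database standing in for PostgreSQL. The table name `usage_hourly` and the helper `run_report` are hypothetical stand-ins, and the query is a simplified copy of the one quoted above without the `$P!{table_name}` substitution.

```python
import sqlite3

# Simplified form of the BR8 query; usage_hourly is a hypothetical stand-in
# for v3_5_statistics_hosts_resources_usage_hourly.
QUERY = """
SELECT
  SUM(CASE WHEN host_status = 2 THEN COALESCE(minutes_in_status, 0) ELSE 0 END) AS planned_downtime_mins,
  SUM(CASE WHEN host_status = 3 THEN COALESCE(minutes_in_status, 0) ELSE 0 END) AS unplanned_downtime_mins,
  SUM(CASE WHEN host_status = 1 THEN COALESCE(minutes_in_status, 0) ELSE 0 END) AS uptime_mins,
  SUM(minutes_in_status) AS total
FROM usage_hourly
"""


def run_report(rows):
    """Load (host_status, minutes_in_status) rows and run the query once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usage_hourly (host_status INTEGER, minutes_in_status REAL)"
    )
    conn.executemany("INSERT INTO usage_hourly VALUES (?, ?)", rows)
    result = conn.fetchone() if False else conn.execute(QUERY).fetchone()
    conn.close()
    return result


empty_result = run_report([])                          # no aggregated rows yet
filled_result = run_report([(1, 50.0), (2, 10.0)])     # 50 min up, 10 min planned downtime

print(empty_result)   # every column is None (NULL) when the table is empty
print(filled_result)
```

Under these assumptions the report only has real numbers to show once the aggregation has written at least one row into the hourly or daily table.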
Created attachment 1035274 [details] Some items are showing "null"
Sorry the correct screenshot is attachment 1035274 [details].
Hi, I now understand the issue. The aggregation is not working due to an issue with the date formatting. There is already a bug open for this issue: #1195395. It is fixed for master and I'm currently working on fixing it for 3.5. If you take a look at the history_configuration table in the history database you will see that lastHourAggr is incorrect.
Hi Shirly,

Thank you for your reply.

I tried the following steps in my test environment to work around this issue. The NPE is indeed gone, but I still cannot get the data (it still shows as NULL). If I did anything wrong, please let me know.

Here is what I have done:

1. Backed up /usr/share/pgsql/timezonesets/Default and changed CST as follows:

CST 28800

( China is in GMT+8:00 so 3600*8=28800
# date
Mon Jun 8 10:51:26 CST 2015
)

2. Restarted a set of services in the following sequence:

# service ovirt-engine stop
# service ovirt-engine-dwhd stop
# service postgresql restart
# service ovirt-engine start
# service ovirt-engine-dwhd start

3. The Java NPE call trace is gone, but I still cannot get the data.

Do these two bugs have the same cause? The v3_5_statistics_hosts_resources_usage_* tables are still empty.

>If you take a look at history_configuration table in the history database you will see that the lastHourAggr is incorrect.

Here is the current lastHourAggr:

# SELECT * from history_configuration where var_name='lastHourAggr' ;
   var_name   | var_value |      var_datetime
--------------+-----------+------------------------
 lastHourAggr |           | 2015-06-08 23:00:00+08
(1 row)

Could you please tell me what a correct lastHourAggr would look like?

>It is fixed for master and I'm currently working on fixing it for 3.5.

Any proposed patch would be welcome, as we need to get through the PoC first.

Thank you!

Best Regards,
Chen
(In reply to Chen from comment #21)
> Hi Shirly,
>
> Thank you for your reply.
>
> I tried the following steps in my test environment to workaround this issue
> and indeed the NPE has gone but I still can not get those data ( still shows
> as NULL ). ( If anything wrong please kindly let me know )
>
> Here is what I have done:
>
> 1. Backed up /usr/share/pgsql/timezonesets/Default and changed CST as the
> following:
>
> CST 28800
>
> ( China is in GMT+8:00 so 3600*8=28800
> # date
> Mon Jun 8 10:51:26 CST 2015
> )
>

To deal with this issue for now, you need to switch the server where the database is installed to the UTC timezone, not CST.
"Mon Jun 8 10:51:26 CST 2015" is equal to "Mon Jun 8 02:51:26 UTC 2015".

You can also set lastHourAggr once to the beginning of the month (01-06-2015), so that data already in the samples table will be aggregated after an hour. This is not mandatory: if you wait until the time that is already set there, the data will be aggregated as well.

> 2. Restarted a set of services in the following sequence:
>
> # service ovirt-engine stop
> # service ovirt-engine-dwhd stop
> # service postgresql restart
> # service ovirt-engine start
> # service ovirt-engine-dwhd start
>
> 3. The call trace of java NPE has gone but I still can not get the data.
>
> Are these two bugs the same cause ? The
> v3_5_statistics_hosts_resources_usage_* tables are still empty.
>
> >If you take a look at history_configuration table in the history database you will see that the lastHourAggr is incorrect.
>
> Here is what lastHourAggr:
>
> # SELECT * from history_configuration where var_name='lastHourAggr' ;
>    var_name   | var_value |      var_datetime
> --------------+-----------+------------------------
>  lastHourAggr |           | 2015-06-08 23:00:00+08
> (1 row)
>
> Would you please teach me what a correct lasHourAggr would look like ?
>

After hourly aggregation, lastHourAggr should be set to the hour before last. This means the samples data was aggregated up to that hour.

> >It is fixed for master and I'm currently working on fixing it for 3.5.
>
> If any proposal patch is welcome as we should come over PoC first.
>
> Thank you !
>
> Best Regards,
> Chen
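The abbreviation CST is ambiguous, which is part of why UTC is the safer setting. As a sketch only: the code below assumes China Standard Time is UTC+8 and, as a second assumption that is not confirmed anywhere in this thread, that a database left on its default abbreviation set reads CST as US Central Standard Time (UTC-6). Under those assumptions the same wall-clock reading maps to two UTC instants that are 14 hours apart.

```python
from datetime import datetime, timedelta, timezone

# Two meanings of the same abbreviation (the second one is an assumption
# about how an unmodified database would interpret "CST").
CHINA = timezone(timedelta(hours=8), "CST")        # China Standard Time, UTC+8
US_CENTRAL = timezone(timedelta(hours=-6), "CST")  # US Central Standard Time, UTC-6

# Wall-clock reading taken from the `date` output quoted in this thread.
wall_clock = datetime(2015, 6, 8, 10, 51, 26)

as_china = wall_clock.replace(tzinfo=CHINA).astimezone(timezone.utc)
as_us_central = wall_clock.replace(tzinfo=US_CENTRAL).astimezone(timezone.utc)

print(as_china.isoformat())       # 2015-06-08T02:51:26+00:00
print(as_us_central.isoformat())  # 2015-06-08T16:51:26+00:00
print(as_us_central - as_china)   # 14:00:00
```

Running the database host in UTC removes the abbreviation from the picture entirely, which matches the advice given above.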
Hi Shirly, I followed your advice. 1. Switch the database box timezone to UTC. # ln -sf /usr/share/zoneinfo/UTC /etc/localtime 2. I rebooted the database box. 3. Change the lastHourAggr to the beginning of this month. # UPDATE history_configuration SET var_datetime = '2015-06-01 20:00:00+00' where var_name='lastHourAggr'; After that, I found that lastHourAggr is changing back quickly by 7 hours. # SELECT * from history_configuration where var_name='lastHourAggr'; var_name | var_value | var_datetime --------------+-----------+------------------------ lastHourAggr | | 2015-04-09 18:00:00+00 (1 row) # SELECT * from history_configuration where var_name='lastHourAggr'; var_name | var_value | var_datetime --------------+-----------+------------------------ lastHourAggr | | 2015-04-09 11:00:00+00 (1 row) # SELECT * from history_configuration where var_name='lastHourAggr'; var_name | var_value | var_datetime --------------+-----------+------------------------ lastHourAggr | | 2015-04-09 04:00:00+00 (1 row) Except this, nothing has changed. The aggregation still didn't start and daily and hourly tables are still empty. In order to save time and make things more clear, you are welcome to login to my test box to have a look. 10.66.219.175 all the credentials are redhat. >But this is not mandatory and if you wait until the time that is already set there the data will be aggregated as well. BTW, what is the "time that is already set there" ? Can I check that time in the database ? Best Regards, Chen
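The three readings above can be compared mechanically to confirm the direction and size of each step. This is a small sketch; the helper name `steps` is hypothetical and the values are copied from the query results in this comment.

```python
from datetime import datetime, timedelta

# lastHourAggr values observed in three consecutive queries (all +00).
readings = [
    datetime(2015, 4, 9, 18, 0),
    datetime(2015, 4, 9, 11, 0),
    datetime(2015, 4, 9, 4, 0),
]


def steps(values):
    """Return the difference between each reading and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


deltas = steps(readings)
moving_backwards = all(delta < timedelta(0) for delta in deltas)

for delta in deltas:
    # total_seconds() avoids the confusing repr of negative timedeltas.
    print(delta.total_seconds() / 3600, "hours")
print("moving backwards:", moving_backwards)
```

A healthy marker would show non-negative steps here; a constant negative step is the symptom reported in this comment.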
(In reply to Chen from comment #24)
> Hi Shirly,
>
> I followed your advice.
>
> 1. Switch the database box timezone to UTC.
>
> # ln -sf /usr/share/zoneinfo/UTC /etc/localtime
>
> 2. I rebooted the database box.
>
> 3. Change the lastHourAggr to the beginning of this month.
>
> # UPDATE history_configuration SET var_datetime = '2015-06-01 20:00:00+00'
> where var_name='lastHourAggr';
>
> After that, I found that lastHourAggr is changing back quickly by 7 hours.
>
> # SELECT * from history_configuration where var_name='lastHourAggr';
>    var_name   | var_value |      var_datetime
> --------------+-----------+------------------------
>  lastHourAggr |           | 2015-04-09 18:00:00+00
> (1 row)
>
> # SELECT * from history_configuration where var_name='lastHourAggr';
>    var_name   | var_value |      var_datetime
> --------------+-----------+------------------------
>  lastHourAggr |           | 2015-04-09 11:00:00+00
> (1 row)
>
> # SELECT * from history_configuration where var_name='lastHourAggr';
>    var_name   | var_value |      var_datetime
> --------------+-----------+------------------------
>  lastHourAggr |           | 2015-04-09 04:00:00+00
> (1 row)
>
> Except this, nothing has changed. The aggregation still didn't start and
> daily and hourly tables are still empty.
>
> In order to save time and make things more clear, you are welcome to login
> to my test box to have a look.
>
> 10.66.219.175 all the credentials are redhat.
>
> >But this is not mandatory and if you wait until the time that is already set there the data will be aggregated as well.
>
> BTW, what is the "time that is already set there" ? Can I check that time in
> the database ?
>
> Best Regards,
> Chen

What is this environment? I see that when the hourly aggregation runs, lastHourAggr is updated backwards instead of forwards. That is very strange. Did you try to change anything else in this environment?
All I did is described in comment #24.

This morning, after some trial and error (I changed lastHourAggr many times and rebooted), lastHourAggr finally stopped changing and one row appeared in the hourly table. (I'm not sure which step exactly solved it.)

# SELECT count(*) from v3_5_statistics_hosts_resources_usage_hourly ;
 count
-------
     1
(1 row)

And the reports can show data. So we are sure about two things here:

1. Reports do read from the hourly table to fill in the web page.
2. A wrong lastHourAggr stops the aggregation, which leaves the hourly and daily tables empty.

I'm currently reinstalling an environment that uses the UTC timezone; hopefully this solves it.

Any other advice, Shirly?

Best Regards,
Chen
I created a new RHEV 3.5 environment (based on an existing template).

1. Change TZ to UTC
# ln -sf /usr/share/zoneinfo/UTC /etc/localtime

2. Reboot

3. Check lastHourAggr.

lastHourAggr started to move backwards by 7 hours. I wonder what could be moving lastHourAggr backwards.
Hi Shirly,

I've tried many times (with the TZ switched to UTC) and the results are almost the same: several minutes later, lastHourAggr quickly begins to move back by 7 hours.

1. When is the aggregation triggered? At the top of every hour?
( I think the logic should be:
a. At the top of every hour, aggregation is triggered and dwh aggregates the last hour's samples into the hourly table by comparing history_datetime and lastHourAggr.
b. After step a, lastHourAggr is advanced by 1 hour.
Could you please confirm that? )

2. Is there any way to trace the aggregation, such as enabling debug logging? I would be glad to enable debug logging to see when the aggregation runs and some of its details.

3. If lastHourAggr keeps moving backwards, newer data will never be aggregated...

Best Regards,
Chen
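The logic proposed in question 1 can be written down as a small sketch. To be clear, this models only the hypothesis stated in this comment, not the actual dwh ETL code, and every name in it (`aggregate_next_hour`, `HOUR`, the tuple layout of the samples) is hypothetical.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)


def aggregate_next_hour(samples, last_hour_aggr):
    """Aggregate the one-hour window that starts at last_hour_aggr.

    samples is a list of (history_datetime, minutes_in_status) tuples.
    Returns (hourly_row, new_last_hour_aggr); hourly_row is None when the
    window contains no samples.
    """
    window_end = last_hour_aggr + HOUR
    # Step a: pick the samples whose timestamp falls inside the window.
    in_window = [
        minutes
        for history_datetime, minutes in samples
        if last_hour_aggr <= history_datetime < window_end
    ]
    hourly_row = (last_hour_aggr, sum(in_window)) if in_window else None
    # Step b: the marker only ever moves forward, by exactly one hour.
    return hourly_row, window_end


samples = [
    (datetime(2015, 6, 8, 10, 5), 1),
    (datetime(2015, 6, 8, 10, 30), 1),
    (datetime(2015, 6, 8, 11, 10), 1),
]
row, marker = aggregate_next_hour(samples, datetime(2015, 6, 8, 10, 0))
```

In this model the marker can never decrease, so a marker that moves backwards, as observed, would keep the window from ever reaching the newest samples. That is consistent with point 3 of this comment.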
Hi Shirly, The latest update: I re-installed the whole box using the Etc/UTC TZ, and so far everything looks good. Best Regards, Chen
That is good news indeed. Please close this bug if it is fixed. We will release a fix for z-stream for this issue.
This issue was fixed in another bug, BZ#1195395. I suggest closing this one as a duplicate, since no patches were required for this bug. What do you think?
Done. *** This bug has been marked as a duplicate of bug 1195395 ***