Bug 877741 - [Scalability] rhevm machine with 250 fake hosts and 250 vms consumes ~99% IO due to database r/w
Summary: [Scalability] rhevm machine with 250 fake hosts and 250 vms consumes ~99% IO ...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.3.0
Assignee: Liran Zelkha
QA Contact: Rami Vaknin
URL:
Whiteboard: infra
Depends On:
Blocks: rhev_scalability
 
Reported: 2012-11-18 13:23 UTC by Rami Vaknin
Modified: 2016-02-10 19:04 UTC
CC List: 12 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-22 13:45:15 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
Threads dump + PostgreSQL queries (1.02 MB, application/x-tar)
2012-12-06 15:25 UTC, Rami Vaknin
sorted queries (4.32 KB, text/plain)
2012-12-13 15:02 UTC, Alissa
postgresql queries log (5.06 MB, application/x-tar)
2013-01-02 16:58 UTC, Rami Vaknin

Description Rami Vaknin 2012-11-18 13:23:15 UTC
Scenario:
1. Install rhevm on a bare metal machine, use local database in rhevm-config
2. Install 250 fake hosts and 250 vms, connect the fake hosts data center to one or two storage domains

Results:
The rhevm machine will consume a lot of IO (~99%-100% fairly constantly) and rhevm will slow down; several hosts that were in Up status will switch to Non-Operational/Connecting.

Local disk on rhevm machine:
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0005ae21


Note that I connected an iSCSI device from a fast storage server and moved the database directory onto that device; the IO utilization dropped from ~99% to ~2%-4%:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00    22.00    0.00   85.00     0.00   856.00    10.07     0.03    0.40   0.26   2.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          49.08    0.00    1.50    0.17    0.00   49.25

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00    52.00    0.00  126.00     0.00  1424.00    11.30     0.05    0.41   0.29   3.70

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          68.34    0.00    3.18    0.00    0.00   28.48

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00    52.00    0.00   98.00     0.00  1200.00    12.24     0.05    0.54   0.38   3.70

Comment 1 Itamar Heim 2012-11-21 21:30:48 UTC
The specific use case we saw was a dramatic increase in writes to the DB once the storage domain was activated.

Comment 2 mkublin 2012-11-27 09:38:05 UTC
This bug is not directly related to storage; the problem is one or a couple of queries that run in VdsManager.OnTimer().
In order to solve the bug, please provide a thread dump of JBoss and a dump of the queries that run during the scenario.
Meanwhile, the only connection to storage is that the DB is running on storage.
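
One simple way to capture such a query dump is PostgreSQL's built-in statement logging; a minimal sketch (the 100 ms threshold here is an assumption, tune as needed):

# postgresql.conf
log_min_duration_statement = 100   # log every statement that runs longer than 100 ms
logging_collector = on             # write the output to the server log files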

Comment 3 Rami Vaknin 2012-12-06 15:24:04 UTC
It seems that there are plenty of getvmsrunningonvds queries for each of the 250 fake hosts (I think they run every 2 seconds); each query takes ~200-280 ms, so 250 hosts * ~250 ms is roughly 1 minute for a full round of these queries to return. Note that these fake hosts do not run any VMs at all; once they do run VMs, the queries will take much longer.

The thread dump also shows a lot of threads running UpdateStatisticsData.
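
To get a feel for where the time goes in a single call, the function can be timed directly from psql; a minimal sketch using one of the host UUIDs from the log below (EXPLAIN ANALYZE reports only the outer Function Scan, so the function's internal plan still needs auto_explain, see comment 5):

-- psql sketch: time one call with a host UUID taken from the log below.
EXPLAIN ANALYZE
SELECT * FROM getvmsrunningonvds('53065bb2-3f6b-11e2-bbba-a3d50ab65223');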


LOG:  duration: 217.248 ms  execute S_4: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '53065bb2-3f6b-11e2-bbba-a3d50ab65223'
LOG:  duration: 228.764 ms  execute S_7: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '8edf4bd0-3f6b-11e2-9dcd-875525b2edc8'
LOG:  duration: 191.306 ms  execute S_5: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = 'c6f49520-3f6b-11e2-8543-b37eff26b321'
LOG:  duration: 253.054 ms  execute S_4: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '45b6965c-3f6b-11e2-a12b-afd940a6611f'
LOG:  duration: 236.694 ms  execute S_3: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '122f1606-3f6b-11e2-835d-77b0ef4b9a96'
LOG:  duration: 232.669 ms  execute S_4: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '38703700-3f6b-11e2-b7da-5bef0c610678'
LOG:  duration: 228.571 ms  execute S_12: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '5e2406ca-3f6b-11e2-b4c1-ef441df35c27'
LOG:  duration: 231.445 ms  execute S_5: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '2e510d80-3f6b-11e2-b4ad-b39e1d5b277d'
LOG:  duration: 270.552 ms  execute S_15: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '1f9d104a-3f6b-11e2-b36b-ef6e7aa5d32d'
LOG:  duration: 260.334 ms  execute S_12: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '4921f282-3f6b-11e2-8307-4302eb42c08a'
LOG:  duration: 268.628 ms  execute S_7: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = 'a4529b2a-3f6b-11e2-9612-77ddad49a208'
LOG:  duration: 244.372 ms  execute S_6: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '78c1736e-3f6b-11e2-a8fb-13a61ef95cc3'
LOG:  duration: 220.345 ms  execute S_7: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '91d2d348-3f6b-11e2-b7bc-376390f55ab9'
LOG:  duration: 238.756 ms  execute S_12: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '84bcf1de-3f6b-11e2-b399-df2ce17c216f'
LOG:  duration: 280.359 ms  execute S_5: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '8f97b1ca-3f6b-11e2-baef-a3b3ba5799f8'
LOG:  duration: 266.672 ms  execute S_13: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '3fda20dc-3f6b-11e2-8b49-1f0483613a86'
LOG:  duration: 283.233 ms  execute S_13: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = 'c9df429e-3f6b-11e2-89ed-d36157640b21'
LOG:  duration: 234.994 ms  execute S_12: select * from  getvmsrunningonvds($1)
DETAIL:  parameters: $1 = '6f36513e-3f6b-11e2-942c-734a2e51f256'

================================================================

"QuartzScheduler_Worker-18" prio=10 tid=0x00007fb24474b800 nid=0x6337 runnable [0x00007fb2774f3000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:150)
        at java.net.SocketInputStream.read(SocketInputStream.java:121)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:259)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1620)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
        - locked <0x00000000c6bdd300> (a org.postgresql.core.v3.QueryExecutorImpl)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:479)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:367)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:360)
        at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:297)
        at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:404)
        at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1066)
        at org.springframework.jdbc.core.JdbcTemplate$6.doInCallableStatement(JdbcTemplate.java:1)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:1014)
        at org.springframework.jdbc.core.JdbcTemplate.call(JdbcTemplate.java:1064)
        at org.springframework.jdbc.core.simple.AbstractJdbcCall.executeCallInternal(AbstractJdbcCall.java:388)
        at org.springframework.jdbc.core.simple.AbstractJdbcCall.doExecute(AbstractJdbcCall.java:351)
        at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181)
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:124)
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeModification(SimpleJdbcCallsHandler.java:37)
        at org.ovirt.engine.core.dao.VdsStatisticsDAODbFacadeImpl.update(VdsStatisticsDAODbFacadeImpl.java:102)
        at org.ovirt.engine.core.dao.VdsStatisticsDAODbFacadeImpl.update(VdsStatisticsDAODbFacadeImpl.java:20)
        at org.ovirt.engine.core.vdsbroker.VdsManager.UpdateStatisticsData(VdsManager.java:387)
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.SaveDataToDb(VdsUpdateRunTimeInfo.java:124)
        at org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo.Refresh(VdsUpdateRunTimeInfo.java:297)
        at org.ovirt.engine.core.vdsbroker.VdsManager$1.runInTransaction(VdsManager.java:234)
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:168)
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:107)
        at org.ovirt.engine.core.vdsbroker.VdsManager.OnTimer(VdsManager.java:215)
        - locked <0x00000000c5001ff0> (a java.lang.Object)
        at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:64)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
        - locked <0x00000000c1db2b40> (a org.quartz.simpl.SimpleThreadPool$WorkerThread)

Comment 4 Rami Vaknin 2012-12-06 15:25:18 UTC
Created attachment 658812 [details]
Threads dump + PostgreSQL queries

Comment 5 Alissa 2012-12-13 15:01:18 UTC
(In reply to comment #4)
> Created attachment 658812 [details]
> Threads dump + PostgreSQL queries

Investigating this with Allon...
The top queries that are run are the following, performed for each host (VDS):
15784  select * from  getvdsbyvdsid($1, $2, $3) 
12783  select * from  getvmsrunningonvds($1)

They are done every 2 seconds separately for each host - this is by design.
The only thing left to check is the execution plan of the queries.

Please reproduce and attach the queries' execution plans:

This is what needs to be added to postgresql.conf:


# postgresql.conf
shared_preload_libraries = 'auto_explain'
custom_variable_classes = 'auto_explain'
auto_explain.log_min_duration = '1s'
auto_explain.log_nested_statements='true'

Please restart postgres in order to apply the settings.
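
If restarting the server is not convenient, auto_explain can also be loaded for a single psql session; a sketch, assuming a superuser session (the nested-plan output still goes to the server log):

-- psql, superuser session; the UUID is one of the hosts from comment 3
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';
SET auto_explain.log_nested_statements = on;
SELECT * FROM getvmsrunningonvds('53065bb2-3f6b-11e2-bbba-a3d50ab65223');
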
thanks

Comment 6 Alissa 2012-12-13 15:02:41 UTC
Created attachment 663010 [details]
sorted queries

Comment 7 Rami Vaknin 2013-01-02 16:58:38 UTC
Created attachment 671557 [details]
postgresql queries log

Comment 13 Barak 2013-01-27 11:28:18 UTC
There are 2 separate issues described in this BZ:

- There are some inefficiencies in the number of queries that we perform.
- The throughput that we expect the DB to sustain (IO throughput on postgres as
  seen on the OS, per disk configuration).

Comment 18 Itamar Heim 2013-07-08 14:53:50 UTC
Batch stat updates should improve this as well.
(Liran - note the stats in comment 0 - they are all write stats.)
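
For illustration only, batching could collapse the per-host statistics writes into a single statement along these lines (a hypothetical sketch; the vds_statistics table and column names are assumptions, not the actual engine change):

-- Hypothetical sketch: one batched UPDATE covering many hosts instead of a
-- separate statistics update per host (table/column names assumed; the
-- UUIDs are hosts from the query log in comment 3).
UPDATE vds_statistics AS s
SET    usage_cpu_percent = v.cpu,
       usage_mem_percent = v.mem
FROM  (VALUES
        ('53065bb2-3f6b-11e2-bbba-a3d50ab65223'::uuid, 12, 34),
        ('8edf4bd0-3f6b-11e2-9dcd-875525b2edc8'::uuid,  7, 41)
      ) AS v(vds_id, cpu, mem)
WHERE  s.vds_id = v.vds_id;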

Comment 19 Andrew Cathrow 2013-07-22 13:45:15 UTC
3.2 and 3.3 included major rework here.
We believe this issue to be closed.
3.3 scale testing is in progress, so we can revisit if this is uncovered in testing.

