Bug 1381721 - Overcloud Controller Disk is exhausted with few instances and default settings
Summary: Overcloud Controller Disk is exhausted with few instances and default settings
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-swift
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: RHOS Maint
QA Contact: nlevinki
URL:
Whiteboard:
Duplicates: 1387129
Depends On:
Blocks:
 
Reported: 2016-10-04 19:57 UTC by Alex Krzos
Modified: 2016-12-14 16:08 UTC (History)
CC List: 25 users

Fixed In Version: puppet-swift-9.4.2-1.el7ost openstack-swift-2.10.0-4.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 16:08:07 UTC
Target Upstream Version:
Embargoed:


Attachments
Graphs showing disk, cpu, memory usage of an OSP10beta cloud (1.12 MB, application/x-gzip) - 2016-10-04 19:57 UTC, Alex Krzos
Controller0 - swift, gnocchi, ceilometer logs (7.38 MB, application/x-gzip) - 2016-10-05 14:10 UTC, Alex Krzos
Controller1 - Ceilometer, Gnocchi, Swift Logs (8.72 MB, application/x-gzip) - 2016-10-06 23:27 UTC, Alex Krzos
Controller2 - Ceilometer, Gnocchi, Swift Logs (8.56 MB, application/x-gzip) - 2016-10-06 23:28 UTC, Alex Krzos


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1631352 0 None None None 2016-10-07 12:37:29 UTC
Launchpad 1631359 0 None None None 2016-10-07 13:05:03 UTC
OpenStack gerrit 383707 0 None MERGED Throttle update_auditor_status calls 2020-10-29 11:05:34 UTC
OpenStack gerrit 383724 0 None MERGED Set concurrency to 1 for auditor/replicator/updater 2020-10-29 11:05:49 UTC
OpenStack gerrit 389351 0 None ABANDONED Add Swift best practices 2020-10-29 11:05:34 UTC
Red Hat Bugzilla 1383268 0 urgent CLOSED Deployed Swift rings use way too high partition power 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2016:2948 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 enhancement update 2016-12-14 19:55:27 UTC

Internal Links: 1281556

Description Alex Krzos 2016-10-04 19:57:33 UTC
Created attachment 1207337 [details]
Graphs showing disk,cpu,memory usage of an OSP10beta cloud

Description of problem:

Deployed a new overcloud with OSP10 Beta via Director and booted several instances. Disk I/O utilization on the controllers is at or near 100% and Swift is producing errors. The disks are nearly pegged, and deleting the instances does not "remove" the I/O load on the disks. The controllers are logging numerous errors from Swift.


On previous versions I was able to boot many more instances (>200 tiny idle instances) before disk I/O utilization became a problem, even on these old machines.


Version-Release number of selected component (if applicable):
OSP10 Beta 
Build 2016-09-20.2

How reproducible:
Always with this build

Steps to Reproduce:
1. Deploy Overcloud w/ 3 Controllers, 2 Computes
2. Boot 20 Instances
3. Delete 20 Instances
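
(A minimal CLI sketch of steps 2 and 3, assuming a sourced overcloudrc and placeholder image, flavor, and network names; any small image/flavor combination that can schedule 20 instances will do.)

  source ~/overcloudrc

  # Boot 20 small instances (image, flavor, and network name are placeholders).
  NET_ID=$(openstack network show private -f value -c id)
  for i in $(seq 1 20); do
      openstack server create --image cirros --flavor m1.tiny \
          --nic net-id="$NET_ID" "disk-io-test-${i}"
  done

  # ...watch controller disk utilization, then delete them again.
  for i in $(seq 1 20); do
      openstack server delete "disk-io-test-${i}"
  done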

Actual results:
Instances do boot; however, the disks are almost completely exhausted and Swift is generating a lot of error noise in its log file.

Expected results:
To be able to boot many more instances before disk I/O becomes a problem.

Additional info:
In previous builds, Gnocchi used a file storage backend and this was not a problem at this small scale.

Attached are graphs detailing CPU, memory, and disk utilization for all 3 controllers. The disk graphs are annotated to show when images were uploaded (where the spikes in disk usage begin), when instances were booted, and when the instances were deleted. The CPU graphs show that much of the time the CPU is in the wait state. The disk utilization graphs show that utilization does not go down after the instances are deleted.

Additional graphs are included to show when the instances were booted (via qemu-kvm processes on both computes; stacked, they show 20 qemu-kvm processes in total), as well as graphs displaying the rate of " ERROR " lines in the log files of the various services. This allows a quick glance to see whether one service is wedged or producing a lot of errors.
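
(For reference, a rough shell equivalent of the count behind the error-rate graphs, purely as an illustrative sketch; the graphs themselves come from the external collection tooling, and the log paths below are assumptions.)

  # Count " ERROR " lines per service log as a quick health check.
  for log in /var/log/swift/swift.log /var/log/gnocchi/*.log /var/log/ceilometer/*.log; do
      [ -f "$log" ] || continue
      printf '%6d  %s\n' "$(grep -c ' ERROR ' "$log")" "$log"
  done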

I filed this bug under openstack-tripleo as I suspect a misconfiguration by Director.

Comment 2 Alex Krzos 2016-10-05 14:10:53 UTC
Created attachment 1207614 [details]
Controller0 - swift, gnocchi, ceilometer logs

Comment 3 Julien Danjou 2016-10-05 16:04:30 UTC
Reading the logs, it seems that Swift's object auditor is doing a lot of work checking the object files regularly.

Can you try to disable it or raise its interval to something high (e.g. 3600s)? The default in the code seems to be 30s, and the log seems to indicate that this is what it is using.

Comment 4 Alex Krzos 2016-10-05 23:54:58 UTC
(In reply to Julien Danjou from comment #3)
> Reading the logs, it seems that Swift's object auditor is doing a lot of
> work checking the object files regularly.
> 
> Can you try to disable it or raise its interval to something high (e.g.
> 3600s)? The default in the code seems to be 30s, and the log seems to
> indicate that this is what it is using.

I adjusted per Julien's suggestion and noticed a large decrease in disk utilization across all three controllers after setting the swift-object auditor, updater, and replicator intervals to 3600s. There are still some spikes in disk utilization, and this will require further investigation. If there are any resident Swift experts I could be connected with, that would be excellent. Obviously, we want the best defaults out of the box.
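
(For later readers, a rough sketch of what that tuning looks like on a controller, assuming crudini is available and the stock /etc/swift/object-server.conf layout; option names can vary slightly between Swift releases.)

  # Raise the background daemon intervals (the auditor default is ~30s) to one hour.
  sudo crudini --set /etc/swift/object-server.conf object-auditor interval 3600
  sudo crudini --set /etc/swift/object-server.conf object-updater interval 3600
  # Newer releases accept 'interval' for the replicator; older ones call it 'run_pause'.
  sudo crudini --set /etc/swift/object-server.conf object-replicator interval 3600

  # Restart the daemons so they pick up the new values.
  sudo systemctl restart openstack-swift-object-auditor \
                         openstack-swift-object-updater \
                         openstack-swift-object-replicator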

Comment 5 Julien Danjou 2016-10-06 08:24:50 UTC
That's good news, Alex. I'm adding Christian Schwede, my favorite Swift expert, to this bug so he can weigh in.

Comment 9 Marian Krcmarik 2016-10-06 13:23:07 UTC
I've hit the problem too on a baremetal-based deployment with Swift set as the backend for Gnocchi: the controllers' disks (single SATA 7200 rpm) are utilized at 100% under high load.

I tried to adjust some config values as suggested in previous comments, but it did not have a significant effect on the utilization.

The utilization goes back to normal once I stop the Gnocchi and openstack-swift-object-auditor|replicator services.
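
(The isolation test described above amounts to roughly the following; the Gnocchi service name is an assumption and may differ per release.)

  # Stop the suspected writers and watch %util drop in iostat.
  sudo systemctl stop openstack-swift-object-auditor \
                      openstack-swift-object-replicator \
                      openstack-gnocchi-metricd

  iostat -x sda 5

  # Start them again afterwards.
  sudo systemctl start openstack-swift-object-auditor \
                       openstack-swift-object-replicator \
                       openstack-gnocchi-metricd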

Comment 11 Julien Danjou 2016-10-06 21:19:12 UTC
Gnocchi already batches everything that can be batched, and same goes for Ceilometer when sending data to Gnocchi.

3 writes per second is not really that high, but Gnocchi does create a lot of small files, so that might not help the auditor – which is what seems to be consuming most of the I/O after the "normal" operation.

Comment 12 Alex Krzos 2016-10-06 23:27:11 UTC
Created attachment 1208058 [details]
Controller1 - Ceilometer, Gnocchi, Swift Logs

Comment 13 Alex Krzos 2016-10-06 23:28:10 UTC
Created attachment 1208059 [details]
Controller2 - Ceilometer, Gnocchi, Swift Logs

Comment 14 Alex Krzos 2016-10-06 23:39:53 UTC
(In reply to Marian Krcmarik from comment #9)
> I've hit the problem too on a baremetal-based deployment with Swift set
> as the backend for Gnocchi: the controllers' disks (single SATA 7200 rpm)
> are utilized at 100% under high load.
> 
> I tried to adjust some config values as suggested in previous comments,
> but it did not have a significant effect on the utilization.
> 
> The utilization goes back to normal once I stop the Gnocchi and
> openstack-swift-object-auditor|replicator services.

Further investigation so far has shown that disk utilization only went down after I deleted the instances; as soon as I booted 20 instances again, the high disk utilization resumed. This pretty much makes it impossible for me to run even simple benchmarks (with Browbeat) using instances in my cloud, due to services timing out.

Comment 16 Pete Zaitcev 2016-10-07 05:13:22 UTC
Alex provided me with access, so I examined the system in question.
Overall it appears that nothing is wrong; the load average is just
very high. However, the system is not thrashing. Probing an object
server with curl returns in under 10 seconds, about 5 on average.
Here's the iostat:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00   104.40    0.00  183.20     0.00  4940.80    53.94     3.11   16.97    0.00   16.97   5.42  99.36

Interestingly, if I add a bit of concurrency to the storage servers
(especially the container server), the loadavg increases to 35,
but request processing times go down and the timeout errors for the
updaters almost disappear.
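
(A sketch of one way to add such concurrency, assuming "concurrency" here means the container server's worker count; the value is illustrative, and the per-daemon concurrency options would be tuned similarly.)

  # Bump the number of container-server worker processes (value is illustrative).
  sudo crudini --set /etc/swift/container-server.conf DEFAULT workers 4
  sudo systemctl restart openstack-swift-container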

I have a test cluster which does not have Gnocchi hammering it, but
which exhibits a similar load pattern (with default mount options):

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     2.00    0.00    0.40     0.00     9.60    48.00     0.00    6.00    0.00    6.00   6.00   0.24
sdb               0.00     0.20    0.00  153.80     0.00  6510.40    84.66     1.03    6.73    0.00    6.73   6.20  95.30
sdc               0.00     0.00   18.60  149.80  2351.20  4582.10    82.34     0.98    5.80    0.98    6.40   5.48  92.22
sdd               0.00     0.00    0.00   10.20     0.00    56.00    10.98     0.07    6.65    0.00    6.65   6.65   6.78

As you can see, there are quite a few writes. It appears that all of
the data is cached: there's only 3.8 GB of data in total on each of
the storage nodes, so all the reads are satisfied from the page cache,
and all we see are writes (from unknown sources).

I experimented with noatime a bit, on the target system and on other
test systems, and although it helps, the writes do not go away. They
may be legitimate writes.
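
(For reference, the noatime experiment is just a remount of the filesystem that holds the Swift data, followed by a before/after iostat comparison; on a stock overcloud controller the Swift directory /srv/node/d1 lives on the root filesystem, while a cluster with dedicated disks would remount those mount points instead.)

  # Remount with noatime and compare iostat output before and after.
  sudo mount -o remount,noatime /
  iostat -x sda 5 6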

One note about Gnocchi: it does store a bunch of objects in one
container, and the resulting container DB is about 20 MB:

-rw-------. 1 swift swift 21201920 Oct  6 17:11 73fb8d3bffe1e5e81dc11f5d19c04156.db

Not insignificant, but it would not bog down a dedicated Swift node either.

I think the insane load averages we're seeing occur because all the
other OpenStack services also do I/O and Swift crowds them out, so
that all of them end up in iowait. Or some such explanation.

It would be interesting, if it's feasible, to re-install the controller
nodes but convince Director to use only a fixed portion of the disk
(e.g. 20 GB) and use the rest as an XFS volume for Swift. Then we could
separate the I/O accounting (although obviously not the performance).
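
(Illustrative only: carving out a dedicated XFS volume for Swift might look like the following, using a placeholder device and the default /srv/node/d1 layout, so that Swift's I/O shows up on its own device in iostat.)

  # Create a dedicated XFS filesystem for Swift (device name is a placeholder).
  sudo mkfs.xfs -f /dev/sdb
  sudo mkdir -p /srv/node/d1
  echo '/dev/sdb /srv/node/d1 xfs noatime,nodiratime,logbufs=8 0 0' | sudo tee -a /etc/fstab
  sudo mount /srv/node/d1
  sudo chown -R swift:swift /srv/node
  sudo restorecon -R /srv/node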

Comment 28 Christian Schwede (cschwede) 2016-11-02 13:37:46 UTC
*** Bug 1387129 has been marked as a duplicate of this bug. ***

Comment 29 Elise Gafford 2016-11-02 14:29:39 UTC
Documentation efforts are ongoing in parallel with the code fixes built in openstack-swift-2.10.0-4.el7ost and puppet-swift-9.4.2-1.el7ost, which are ready for test.

Comment 30 Christian Schwede (cschwede) 2016-11-02 15:18:07 UTC
Follow up bug for the documentation part: https://bugzilla.redhat.com/show_bug.cgi?id=1391111

Comment 34 errata-xmlrpc 2016-12-14 16:08:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

