Description of problem:
Change the default autovacuum settings; by default it executes too often (on 50 tuple updates), which makes the postgresql process very busy. We should change the default for all newly created DBs in engine-setup, and probably update the documentation on how to do that for existing ones.

Version-Release number of selected component (if applicable):

Actual results:
Currently in postgresql.conf the defaults are:
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50

Expected results:
Should be changed to:
autovacuum_vacuum_threshold = 5000
autovacuum_analyze_threshold = 10000

Additional info:
We have to change this for all new automatically provisioned databases, but I'd also add this to the "db customization" documentation (if we have any) and also show a warning about it during each engine-setup execution.
Any issue doing that for existing dbs? I would expect large existing deployments actually suffer the most right now.
(In reply to Michal Skrivanek from comment #2)
> Any issue doing that for existing dbs? I would expect large existing
> deployments actually suffer the most right now.

Generally speaking, we do not change postgresql.conf on upgrade, only on provisioning. We try to behave as if postgresql (and its conf) is not owned by us, as if the "normal" behavior is using a remote db. So we confine configuration changes to provisioning time. Also, we currently have no code for showing a "warning", although that's easy to add. If we only do the minimum, we'll only change this during provisioning and emit an error during upgrade. See the change for bug 1331168 as an example. Of course, everything is possible...
I'd say that changing this for newly provisioned DBs, and displaying a warning (or error?) during engine-setup execution if it's not set properly, is enough (I'd assume that large customers have already changed postgresql.conf to fit their load by tuning some other parameters).
autovacuum is a very important process for the database, and raising the global threshold is a no-go:

1. We have variance in our table usage - some tables are update-heavy, the rest aren't. A large global value will starve those tables and they won't be cleaned.
2. There is no single value which is good. The system can grow. We should be more dynamic.
3. A co-hosted DWH contends for the vacuum workers.

What I suggest is per-table tuning for the update-heavy tables:
- vds_cpu_statistics (should be removed?)
- vds_interface_statistics
- vds_statistics
- vm_dynamic
- vm_statistics
- vm_interface_statistics

This will set the scale factor to 1:
```
ALTER TABLE name SET (autovacuum_vacuum_scale_factor=1);
```

The formula [1] to determine whether to run vacuum for a table is:
```
threshold = vacuum base threshold + vacuum scale factor * number of tuples
```

So using the default threshold and scale factor (50 and 0.2 respectively), if the vm table has 1000 records then the effective threshold is: 50 + (1000 * 0.2) = 250.

I took a scale factor of 1 because most of the VMs and hosts should be running, so most of the records would be updated regularly.

[1] https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html#AUTOVACUUM
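The trigger formula above can be sketched in a few lines; this is a minimal illustration (the function name is mine, not from the oVirt codebase), assuming the PostgreSQL defaults of 50 for the base threshold and 0.2 for the scale factor:

```python
# Sketch of the autovacuum trigger formula quoted above: a table is vacuumed
# once its dead-tuple count exceeds base_threshold + scale_factor * live_tuples.

def effective_vacuum_threshold(live_tuples, base_threshold=50, scale_factor=0.2):
    """Return the dead-tuple count that triggers autovacuum for a table."""
    return base_threshold + scale_factor * live_tuples

# PostgreSQL defaults with a vm table of 1000 records:
print(effective_vacuum_threshold(1000))                   # 250.0
# The proposed per-table scale factor of 1 for update-heavy tables:
print(effective_vacuum_threshold(1000, scale_factor=1))   # 1050
```

This also shows why a large global scale factor starves small tables: their effective threshold grows with row count even when the base threshold stays at 50.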
Moving back to infra. As discussed with Roy, we can create an upgrade script for the relevant tables and alter autovacuum_vacuum_scale_factor for them. Customers can adapt this setting later if needed.
A recent change we made on a production system, plus more research and tests, shows that the factor should actually be smaller. The less work the vacuum has to do, the better it behaves. The vacuum algorithm also has a cost limit; when it is reached, the worker stops and waits for the next run, with tuples still waiting to be collected. The vacuum scale factor is going to be set to 0.01, to be fairly aggressive while still supporting table growth. The threshold will stay at the default.
Moving to 4.1.1, as we most probably won't be able to finish this before the RC build.
Removing 71889 because it's not really required for this bug. It (or a variation) will only be needed if/when dwh wants to add its own extra config items.
engine-setup checks the postgres configuration on upgrade and correctly reports the changes that should be made:

Found the following problems in PostgreSQL configuration for the Engine database:
 autovacuum_vacuum_scale_factor required to be at most 0.01
 autovacuum_max_workers required to be at least 6
 maintenance_work_mem required to be at least 65536
Please set:
 autovacuum_vacuum_scale_factor = 0.01
 autovacuum_max_workers = 6
 maintenance_work_mem = 65536

But when I chose to fix automatically, it also changed a setting that was already correct:

#autovacuum_analyze_scale_factor = 0.1
...
autovacuum_analyze_scale_factor = 0.05

-->

#autovacuum_analyze_scale_factor = 0.1
...
# autovacuum_analyze_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.075

Tested in ovirt-engine-setup-4.1.1.2-0.1.el7.noarch
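The expected behavior can be sketched as a simple constraint check; this is a hypothetical illustration (function and parameter names are mine, not the actual engine-setup code), showing that a value should only be rewritten when it violates its "at least" / "at most" constraint:

```python
# Hypothetical sketch of the numeric check engine-setup performs on
# postgresql.conf values: rewrite only values that violate a constraint,
# and leave already-correct values untouched.

def needs_update(current, expected, check):
    """check is 'at_least' or 'at_most'; values are numeric strings."""
    current = float(current)
    expected = float(expected)
    if check == 'at_least':
        return current < expected
    return current > expected  # 'at_most'

print(needs_update('0.05', '0.01', 'at_most'))   # True  -> must be lowered
print(needs_update('0.005', '0.01', 'at_most'))  # False -> already fine, keep as-is
```

Under this logic, `autovacuum_analyze_scale_factor = 0.05` already satisfies "at most 0.1" and should not have been rewritten to 0.075.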
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
Roy, could you please take a look?
(In reply to Martin Perina from comment #16)
> Roy, could you please take a look?

ping?
Roy - what's the status here?
Now it doesn't rewrite already good values, but there is a problem with maintenance_work_mem when the value is set with units.

maintenance_work_mem = 16MB

engine-setup ends with an error:

[ ERROR ] Failed to execute stage 'Misc configuration': invalid literal for int() with base 10: '16MB'

2017-05-02 16:22:49 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/configuration.py", line 222, in _updatePGConf
    transaction=localtransaction,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/postgres.py", line 284, in _updatePostgresConf
    needUpdate, content = dbovirtutils.getUpdatedPGConf(content)
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/engine_common/database.py", line 1004, in getUpdatedPGConf
    expected=item['expected']
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/configuration.py", line 128, in <lambda>
    int(current) >= int(expected)
ValueError: invalid literal for int() with base 10: '16MB'

tested in ovirt-engine-setup-4.1.2-0.1.el7.noarch
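One way such values could be handled is to normalize the unit suffix before the `int()` comparison that raises the ValueError above. This is only a hedged sketch, not the actual engine-setup fix; it assumes the setting's base unit is kB, as PostgreSQL documents for memory parameters like maintenance_work_mem:

```python
# Sketch: normalize a postgresql.conf memory value ('65536', '16MB', ...)
# to kilobytes so it can be compared numerically against an expected minimum.

import re

_UNITS_KB = {'kB': 1, 'MB': 1024, 'GB': 1024 * 1024}

def memory_setting_to_kb(value):
    """Parse a memory setting string into an integer number of kB."""
    m = re.match(r'^\s*(\d+)\s*(kB|MB|GB)?\s*$', value)
    if m is None:
        raise ValueError('unrecognized memory setting: %r' % value)
    number, unit = m.groups()
    # A bare number is already in the parameter's base unit (kB here).
    return int(number) * _UNITS_KB.get(unit, 1)

print(memory_setting_to_kb('16MB'))   # 16384
print(memory_setting_to_kb('65536'))  # 65536
```

With this normalization, `16MB` (16384 kB) would be correctly recognized as failing the "at least 65536" check instead of crashing the setup.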
Roy, can you please take a look?
(In reply to Lucie Leistnerova from comment #19)
> Now it doesn't rewrite already good values, but there is a problem with
> maintenance_work_mem when value is set with units.
>
> maintenance_work_mem = 16MB

We don't support units, and our scripts never did - i.e. you are exposing a limitation that has been there from the beginning. I suggest we VERIFY this RFE and open a new bug for this specific issue, since this RFE didn't introduce it.
Alright. I created Bug 1447624.

Verified in ovirt-engine-setup-4.1.2-0.1.el7.noarch