Description of problem:

While deploying, customer added the following YAML contents to an overcloud update:

  parameter_defaults:
    NovaSchedulerAvailableFilters: RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
    NovaSchedulerDefaultFilters: RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

The update failed because it could not start nova-scheduler. nova.conf had been updated as follows:

  4676a4677,4685
  > available_filters=RetryFilter
  > available_filters=AvailabilityZoneFilter
  > available_filters=RamFilter
  > available_filters=DiskFilter
  > available_filters=ComputeFilter
  > available_filters=ComputeCapabilitiesFilter
  > available_filters=ImagePropertiesFilter
  > available_filters=ServerGroupAntiAffinityFilter
  > available_filters=ServerGroupAffinityFilter
  4703a4713
  > enabled_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

After removing the filter lines from the YAML, they re-ran the update and it still failed. nova.conf on the controllers still contained the changes. They manually removed the lines from nova.conf and restarted the scheduler successfully. They ran the update again (without the filter lines); it again added the available_filters and enabled_filters lines to nova.conf, and the deployment failed.

The only way to work around the issue was to force the default by adding a YAML with:

  parameter_defaults:
    nova::scheduler::filter::scheduler_available_filters: ['nova.scheduler.filters.all_filters']

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Successfully deploy an overcloud
2. Create a YAML with:

     parameter_defaults:
       NovaSchedulerAvailableFilters: RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
       NovaSchedulerDefaultFilters: RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

3. Re-deploy including the YAML to update the Overcloud
4. Deployment breaks trying to start nova-scheduler
5. Re-run deployment without the Filter YAML
6. Deployment still fails

Actual results:

Expected results:
Running the deployment without the Filters should force it back to the default, not persist the change.

Additional info:
After the update ran successfully with the override YAML:

  parameter_defaults:
    nova::scheduler::filter::scheduler_available_filters: ['nova.scheduler.filters.all_filters']

I removed that and re-ran the update. It appeared to revert to the broken state, adding the available_filters and enabled_filters lines again.
We were able to "force clear" the setting by running the update with the following:

  parameter_defaults:
    NovaSchedulerAvailableFilters: ''
    NovaSchedulerDefaultFilters: ''
    nova::scheduler::filter::scheduler_available_filters: ['nova.scheduler.filters.all_filters']
    ControllerExtraConfig:
      nova::scheduler::filter::scheduler_available_filters: ['nova.scheduler.filters.all_filters']

The deployment/update ran to completion. We were then able to remove those lines, and further updates ran cleanly after that.

So it looks like NovaSchedulerAvailableFilters and NovaSchedulerDefaultFilters only get updated when they are set explicitly; otherwise they keep whatever the previous setting was.
This is working as designed - we do PATCH updates to the overcloud stack, because (a long time ago) folks encountered issues where any environment file accidentally omitted on update could cause very bad misconfiguration.

So instead we update the stack where the provided environment overrides any previously applied configuration, which causes the behavior here but means there is less chance of, say, omitting ComputeCount and deleting all your compute nodes ;)

I suggest this be converted to a docs bug, if we can confirm the current docs don't clarify this point sufficiently, otherwise it should probably be closed as not a bug - the workaround in comment #3 is the correct and recommended approach to reset previously provided parameter_defaults.
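To make the difference between the two update styles concrete, here is a minimal sketch in Python. It is a toy model of the semantics described above, not Heat's actual implementation; the function names, the ComputeCount value of 3, and the shortened filter list are assumptions made up for illustration.

```python
# Toy model of the two update styles; this is NOT Heat's real code.

def patch_update(stored, provided):
    """PATCH-style: keep what is stored, overlay only the provided keys."""
    merged = dict(stored)
    merged.update(provided)
    return merged


def full_update(stored, provided):
    """Replacement-style: anything not provided is dropped from the environment."""
    return dict(provided)


# Shortened filter list, for illustration only.
FILTERS = "RetryFilter,AvailabilityZoneFilter,RamFilter"

# Hypothetical state after the initial, successful deployment.
initial = {"ComputeCount": 3}

# Update that includes the filter environment file.
with_filters = patch_update(initial, {"NovaSchedulerAvailableFilters": FILTERS})

# Update re-run without the filter file: the old value is still stored.
after_omit = patch_update(with_filters, {})

# Update that explicitly overrides the parameter: only this changes it.
after_reset = patch_update(after_omit, {"NovaSchedulerAvailableFilters": ""})

# With replacement semantics, a forgotten environment file loses ComputeCount too.
replaced = full_update(with_filters, {})

if __name__ == "__main__":
    for name, env in [("after_omit", after_omit),
                      ("after_reset", after_reset),
                      ("replaced", replaced)]:
        print(name, env)
```

In this model, after_omit still carries the filter list, which matches what the reporter saw, while replaced has lost ComputeCount, which is the failure mode the PATCH behavior is meant to avoid.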
(In reply to Steven Hardy from comment #4)
> This is working as designed - we do PATCH updates to the overcloud stack,
> because (a long time ago) folks encountered issues where any environment
> file accidentally omitted on update could cause very bad misconfiguration.
>
> So instead we update the stack where the provided environment overrides any
> previously applied configuration, which causes the behavior here but means
> there is less chance of, say, omitting ComputeCount and deleting all your
> compute nodes ;)
>
> I suggest this be converted to a docs bug, if we can confirm the current
> docs don't clarify this point sufficiently, otherwise it should probably be
> closed as not a bug - the workaround in comment #3 is the correct and
> recommended approach to reset previously provided parameter_defaults.

Steve, does setting those values to '' reset them to the defaults, or do we need to actually set them to the defaults commented in nova.conf, like this:

  NovaSchedulerDefaultFilters: ['RetryFilter', 'AvailabilityZoneFilter', 'RamFilter', 'DiskFilter', 'ComputeFilter', 'ComputeCapabilitiesFilter', 'ImagePropertiesFilter', 'ServerGroupAntiAffinityFilter', 'ServerGroupAffinityFilter']
Resetting to the default value (which is actually []) should set it to the nova defaults:

https://github.com/openstack/puppet-nova/blob/ab51c308e24ff5a94a46a1376f59c51a9d576c6a/manifests/scheduler/filter.pp#L129
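Since an omitted parameter keeps its previous value, one way to avoid stale settings is to work out which previously set parameters are missing from the next environment and set them back explicitly. The sketch below is a hypothetical helper, not part of TripleO or Heat; build_reset_env and its arguments are invented names, and the [] defaults for both parameters are an assumption taken from the comment above.

```python
def build_reset_env(previous, desired, template_defaults):
    """Return the parameter_defaults to pass on the next update.

    Any key that was set previously but is missing from `desired` is set
    explicitly to its template default, because simply omitting it would
    leave the old value in place.
    """
    env = dict(desired)
    for key in previous:
        if key not in desired:
            if key not in template_defaults:
                raise KeyError(f"no known default for {key}")
            env[key] = template_defaults[key]
    return env


# Assumed defaults ([] per the comment above); verify against your templates.
TEMPLATE_DEFAULTS = {
    "NovaSchedulerAvailableFilters": [],
    "NovaSchedulerDefaultFilters": [],
}

# Values applied by the earlier, broken update (list shortened for the example).
previous = {
    "NovaSchedulerAvailableFilters": "RetryFilter,AvailabilityZoneFilter",
    "NovaSchedulerDefaultFilters": "RetryFilter,AvailabilityZoneFilter",
}

# The next update no longer wants to set either parameter.
reset_env = build_reset_env(previous, {}, TEMPLATE_DEFAULTS)

if __name__ == "__main__":
    print({"parameter_defaults": reset_env})
```

The resulting mapping would be written out as the parameter_defaults of a one-off environment file; once that update has completed, the file can be dropped, as described in comment #3.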