Bug 1803150
Summary: | Hint for nova-scheduler seems to be ignored
---|---
Product: | Red Hat OpenStack
Component: | openstack-tempest
Version: | 13.0 (Queens)
Status: | CLOSED ERRATA
Severity: | low
Priority: | low
Reporter: | Filip Hubík <fhubik>
Assignee: | Chandan Kumar <chkumar>
QA Contact: | Martin Kopec <mkopec>
CC: | apevec, dasmith, eglynn, jhakimra, kchamart, lhh, lyarwood, mkopec, sbauza, sgordon, slinaber, udesale, vromanso, wznoinsk
Keywords: | Reopened, Triaged, ZStream
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | openstack-tempest-18.0.0-14.el7ost
Doc Type: | No Doc Update
Last Closed: | 2020-06-24 11:41:26 UTC
Type: | Bug
Created attachment 1663138: tempest.log
Created attachment 1663570: tempest.conf
Yes, this smells like a Tempest issue to me too. Just a few notes:

AFAIK we have not been adding custom scheduler filters to the Tempest configuration in CI in the past (see the attached tempest.conf).

From the Tempest code, tempest/common/compute.py:

    def is_scheduler_filter_enabled(filter_name):
        """Check the list of enabled compute scheduler filters from config.

        This function checks whether the given compute scheduler filter is
        available and configured in the config file. If the
        scheduler_available_filters option is set to 'all' (Default value. which
        means default filters are configured in nova) in tempest.conf then, this
        function returns True with assumption that requested filter 'filter_name'
        is one of available filter in nova
        ("nova.scheduler.filters.all_filters").
        """

and also tempest/api/compute/admin/test_servers_on_multinodes.py:

    @decorators.idempotent_id('26a9d5df-6890-45f2-abc4-a659290cb130')
    @testtools.skipUnless(
        compute.is_scheduler_filter_enabled("SameHostFilter"),
        'SameHostFilter is not available.')
    def test_create_servers_on_same_host(self):
        hints = {'same_host': self.server01}

From https://docs.openstack.org/tempest/latest/sampleconf.html I read that the nova configuration is taken by default; I assume that means all available filters in this case. Should we really change the Tempest config explicitly if it tries to gather this information from nova?

Also, nova.conf on the controllers has the default config, with these options commented out:

    #available_filters=nova.scheduler.filters.all_filters
    #enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

However, SameHostFilter does not seem to be part of the default "enabled_filters" and must be enabled explicitly. I added it at the end of "enabled_filters" (/var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf, then docker restart nova_scheduler on the controllers), and the test mentioned above passed.
This leads me to 2 conclusions:
1) Tempest shouldn't detect that "SameHostFilter" is enabled, since it is not part of the "filters enabled by default", which makes this a Tempest bug.
2) Nova should include it in the "filters enabled by default", since it may have been omitted from the default set unintentionally, which makes this a nova bug. Or did I misunderstand the Tempest/nova documentation?

(In reply to Filip Hubík from comment #5)

We've been through this before, although I can't find anything besides the patch I did upstream [1]. As you can read in that commit message, the problem was that the Nova and Tempest defaults for "what scheduler filters are enabled in this deployment" didn't match (and the tempest.conf option `scheduler_available_filters` is confusingly named: `available` doesn't mean anything, a filter is either enabled or it isn't). So I did [1] upstream to fix this, and IIRC something was done in InfraRed or THT or some other non-Tempest, non-Nova code to make Nova and Tempest agree on which filters are enabled in the deployment, but as I said, I can't find any written trace of that.

[1] could be backported to 13, or the deployment tooling could make sure that Nova's and Tempest's configurations of enabled filters match.
[1] https://review.opendev.org/#/c/570207/

OK, after discussion with the Tempest maintainer (mkopec), the best solution seems to be to ask for a backport. In its current state Tempest makes a wrong assumption: that with the default config all filters are enabled for use, when in fact nova enables only these 6:
https://github.com/openstack/nova/blob/master/nova/conf/scheduler.py#L320

Reopened and re-targeted against Tempest.

(In reply to Filip Hubík from comment #7)

Thanks Filip, apologies for missing your previous Trello ping about this. ACK to backporting the change, but remember this is for OSP 13, so use the following list:
https://github.com/openstack/nova/blob/stable/queens/nova/conf/scheduler.py#L268-L277

This can also change depending on the value of the NovaSchedulerDefaultFilters parameter in TripleO envs:
https://github.com/openstack/tripleo-heat-templates/blob/24fa8936738c9b45eb7dd7e96506c27a8abe5cd5/puppet/services/nova-scheduler.yaml#L37-L43

For example, within the undercloud we set the following:
https://github.com/openstack/tripleo-heat-templates/blob/24fa8936738c9b45eb7dd7e96506c27a8abe5cd5/environments/undercloud.yaml#L24

Created attachment 1694018: verification output
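The mismatch discussed in the comments above can be illustrated with a minimal sketch. This is not Tempest's actual code: `filter_enabled_for_tempest` is a hypothetical stand-in that follows the behavior described in the quoted docstring (a value of 'all' is treated as "every filter is enabled"), and `NOVA_QUEENS_DEFAULTS` copies the commented-out `enabled_filters` default from the nova.conf quoted in this report.

```python
# Simplified model of the check described in the quoted docstring.
# NOT the real Tempest implementation; the names below are hypothetical.

# Default enabled_filters as shown (commented out) in the reported nova.conf.
NOVA_QUEENS_DEFAULTS = [
    "RetryFilter", "AvailabilityZoneFilter", "ComputeFilter",
    "ComputeCapabilitiesFilter", "ImagePropertiesFilter",
    "ServerGroupAntiAffinityFilter", "ServerGroupAffinityFilter",
]


def filter_enabled_for_tempest(filter_name, configured_filters):
    """Return True if the test guarded by filter_name would NOT be skipped."""
    if not configured_filters:
        return False
    if "all" in configured_filters:
        # 'all' assumes every filter is enabled, whatever nova.conf says.
        return True
    return filter_name in configured_filters


# With the 'all' default, Tempest runs the test...
tempest_says = filter_enabled_for_tempest("SameHostFilter", ["all"])
# ...although the nova default list does not contain the filter.
nova_says = "SameHostFilter" in NOVA_QUEENS_DEFAULTS
print(tempest_says, nova_says)  # True False
```

Under these assumptions the test is executed against a scheduler that never consults SameHostFilter, which matches the failure in this report.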
The package listed in Fixed In Version contains the fix: since openstack-tempest-18.0.0-14, scheduler_enabled_filters is no longer set to 'all' by default, so the failing test is skipped by default. When the enabled_filters option in nova.conf on all nodes contains SameHostFilter and [compute_feature_enabled].scheduler_available_filters in tempest.conf contains SameHostFilter as well, the test passes.
The output from testing is attached.
The BZ is VERIFIED.
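The condition used in the verification (both configuration files naming the same filter) could be checked mechanically. The sketch below is only an illustration: the config fragments are made up, the helper names are hypothetical, and the section names `[filter_scheduler]` (nova.conf) and `[compute-feature-enabled]` (tempest.conf) are assumptions that should be confirmed against the deployed files.

```python
import configparser

# Hypothetical config fragments; section names are assumptions.
NOVA_CONF = """
[filter_scheduler]
enabled_filters = RetryFilter,AvailabilityZoneFilter,ComputeFilter,SameHostFilter
"""

TEMPEST_CONF = """
[compute-feature-enabled]
scheduler_available_filters = SameHostFilter
"""


def read_filter_list(text, section, option):
    """Parse a comma-separated option from INI text into a set of names."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get(section, option, fallback="")
    return {name.strip() for name in raw.split(",") if name.strip()}


def tempest_only_filters(nova_text, tempest_text):
    """Return filters that Tempest lists but nova does not enable."""
    nova = read_filter_list(nova_text, "filter_scheduler", "enabled_filters")
    tempest = read_filter_list(
        tempest_text, "compute-feature-enabled", "scheduler_available_filters")
    return tempest - nova


missing = tempest_only_filters(NOVA_CONF, TEMPEST_CONF)
print(sorted(missing))  # [] when the two files agree
```

Note that this sketch treats a Tempest value of 'all' as a mismatch, because 'all' names no concrete filter on the nova side.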
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2719
Created attachment 1663137: nova_logs_all

Description of problem:
The deployment uses 2 compute nodes, and Tempest tries to spawn 2 VMs on the same node. The second VM is always scheduled to the other node, though, even when we explicitly ask the scheduler using hints.

Running the Tempest test in an OSP13 OC environment:

    def test_create_servers_on_same_host(self):
        hints = {'same_host': self.server01}
        server02 = self.create_test_server(scheduler_hints=hints,
                                           wait_until='ACTIVE')['id']
        host02 = self._get_host(server02)
        self.assertEqual(self.host01, host02)

    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/tempest/api/compute/admin/test_servers_on_multinodes.py", line 80, in test_create_servers_on_same_host
        self.assertEqual(self.host01, host02)
      File "/usr/lib/python2.7/site-packages/testtools/testcase.py", line 350, in assertEqual
        self.assertThat(observed, matcher, message)
      File "/usr/lib/python2.7/site-packages/testtools/testcase.py", line 435, in assertThat
        raise mismatch_error
    testtools.matchers._impl.MismatchError: u'compute-1.redhat.local' != u'compute-0.redhat.local'

    Ran 1 test in 44.117s
    FAILED (failures=1)

The scheduler seems to be ignoring these hints provided by the Tempest test(s). I tried both "same_host" and "different_host" (see attached logs).

enabled_filters in nova.conf is kept at its default:

    ...
    # Deprecated group;name - DEFAULT;scheduler_default_filters
    #enabled_filters=RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

    python2-tempestconf-2.4.0-1.el7ost.noarch
    python2-tempest-18.0.0-12.el7ost.noarch
    openstack-tempest-18.0.0-12.el7ost.noarch
    openstack-nova-scheduler-17.0.12-1.el7ost.noarch (in nova_scheduler container)

Puddle: 2020-02-06.2

Attached: nova logs from 3 controllers and 2 compute nodes, isolated to just this test case, and tempest.log
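The observed placement is consistent with a scheduler that applies only its enabled filters, so a hint has no effect when the filter that reads it is not enabled. The following toy model in plain Python is not nova's implementation; `same_host_filter`, `pick_hosts`, and the host and instance data are hypothetical and exist only to show the effect.

```python
# Toy model: a hint only matters if the filter that reads it is enabled.
# Not nova code; all names and data here are made up for illustration.

def same_host_filter(host, hints, instances_on_host):
    """Pass a host only if it already runs the instance named in the hint."""
    wanted = hints.get("same_host")
    if wanted is None:
        return True  # no hint given: the filter restricts nothing
    return wanted in instances_on_host[host]


# Filters this toy model knows how to evaluate.
FILTERS = {"SameHostFilter": same_host_filter}


def pick_hosts(hosts, hints, instances_on_host, enabled_filters):
    """Return the hosts that pass every enabled filter known to this model."""
    candidates = list(hosts)
    for name in enabled_filters:
        check = FILTERS.get(name)
        if check is None:
            continue  # filter not modeled here; treated as pass-through
        candidates = [h for h in candidates
                      if check(h, hints, instances_on_host)]
    return candidates


hosts = ["compute-0", "compute-1"]
placement = {"compute-0": {"server01"}, "compute-1": set()}
hints = {"same_host": "server01"}

# Filter not enabled: both hosts remain candidates, the hint is ignored.
print(pick_hosts(hosts, hints, placement, ["ComputeFilter"]))
# Filter enabled: only the host already running server01 remains.
print(pick_hosts(hosts, hints, placement, ["ComputeFilter", "SameHostFilter"]))
```

In the first call both compute nodes stay eligible, so the second server may land on either of them, as seen in the failing test.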