| Summary: | <key_value/> uses JOIN instead of EXISTS, causing many duplicate rows to be returned | ||
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Bill Peck <bpeck> |
| Component: | scheduler | Assignee: | Dan Callaghan <dcallagh> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 0.6 | CC: | bpeck, dcallagh, mcsontos, rmancy, stl |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-09-08 02:42:53 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Bill Peck
2011-08-08 21:15:08 UTC
Bug 590723 has more details about the MySQL problem. I think the most recent scheduler failure is actually not related to bug 590723 and those optimizer_search_depth shenanigans. The recipe which caused it has this <hostRequires/>: <hostRequires> <and> <key_value key="MEMORY" value="1400" op=">"></key_value> <key_value key="DISK" value="7999" op=">"></key_value> <key_value key="MEMORY" value="1400" op=">"></key_value> <key_value key="DISK" value="7999" op=">"></key_value> </and> <and> <key_value key="MEMORY" value="1400" op=">"></key_value> <key_value key="DISK" value="7999" op=">"></key_value> <key_value key="MEMORY" value="1400" op=">"></key_value> <key_value key="DISK" value="7999" op=">"></key_value> </and> </hostRequires> the duplicates being caused by bug 706838. This produces a query like the following: 2011-08-09 15:42:32,060 sqlalchemy.engine.base.Engine.0x..10 INFO SELECT count(1) AS count_1 FROM system_arch_map AS system_arch_map_1, system LEFT OUTER JOIN key_value_int AS key_value0 ON key_value0.key_id = %s AND system.id = key_value0.system_id LEFT OUTER JOIN key_value_int AS key_value1 ON key_value1.key_id = %s AND system.id = key_value1.system_id LEFT OUTER JOIN key_value_int AS key_value2 ON key_value2.key_id = %s AND system.id = key_value2.system_id LEFT OUTER JOIN key_value_int AS key_value3 ON key_value3.key_id = %s AND system.id = key_value3.system_id LEFT OUTER JOIN key_value_int AS key_value4 ON key_value4.key_id = %s AND system.id = key_value4.system_id LEFT OUTER JOIN key_value_int AS key_value5 ON key_value5.key_id = %s AND system.id = key_value5.system_id LEFT OUTER JOIN key_value_int AS key_value6 ON key_value6.key_id = %s AND system.id = key_value6.system_id LEFT OUTER JOIN key_value_int AS key_value7 ON key_value7.key_id = %s AND system.id = key_value7.system_id LEFT OUTER JOIN system_group AS system_group_1 ON system.id = system_group_1.system_id LEFT OUTER JOIN tg_group AS tg_group_1 ON tg_group_1.group_id = system_group_1.group_id LEFT OUTER JOIN user_group AS user_group_1 ON tg_group_1.group_id = user_group_1.group_id LEFT OUTER JOIN tg_user AS tg_user_1 ON tg_user_1.user_id = user_group_1.user_id INNER JOIN lab_controller ON lab_controller.id = system.lab_controller_id WHERE key_value0.key_value > %s AND key_value1.key_value > %s AND key_value2.key_value > %s AND key_value3.key_value > %s AND key_value4.key_value > %s AND key_value5.key_value > %s AND key_value6.key_value > %s AND key_value7.key_value > %s AND %s = system.status_id AND (system.owner_id = %s OR system.loan_id = %s OR system.shared = %s AND NOT (EXISTS (SELECT 1 FROM system_group WHERE system.id = system_group.system_id)) OR system.shared = %s AND tg_user_1.user_id = %s) AND system.id = system_arch_map_1.system_id AND %s = system_arch_map_1.arch_id AND NOT (EXISTS (SELECT 1 FROM exclude_osmajor WHERE system.id = exclude_osmajor.system_id AND %s = exclude_osmajor.osmajor_id AND %s = exclude_osmajor.arch_id)) AND NOT (EXISTS (SELECT 1 FROM exclude_osversion WHERE system.id = exclude_osversion.system_id AND %s = exclude_osversion.osversion_id AND %s = exclude_osversion.arch_id)) 2011-08-09 15:42:32,060 sqlalchemy.engine.base.Engine.0x..10 INFO [12L, 9L, 12L, 9L, 12L, 9L, 12L, 9L, u'1400', u'7999', u'1400', u'7999', u'1400', u'7999', u'1400', u'7999', 1L, 1L, 1L, 1, 1, 1L, 1L, 1L, 1L, 1L, 1L] MySQL runs forever trying to answer that query, so I tried cutting down the number of joins from 8 to 4 and the count came back as 926127 rows(!). Worse still, after issuing the COUNT(1) statement beakerd would then fetch all those rows. The problem here is that we are using a JOIN instead of an EXISTS clause, giving us a huge number of duplicate rows in the result set. The number of duplicate rows will explode exponentially with every <key_value/> in the job. So this is a bad query we are generating, it's not a MySQL bug. The good news is, we can fix it. Here is a test case that shows the problem:
--- a/IntegrationTests/src/bkr/inttest/server/test_model.py
+++ b/IntegrationTests/src/bkr/inttest/server/test_model.py
@@ -568,6 +568,30 @@ def test_key_notequal(self):
self.assert_(with_cciss not in systems)
self.assert_(without_cciss in systems)
+ # https://bugzilla.redhat.com/show_bug.cgi?id=729156
+ def test_keyvalue_does_not_cause_duplicate_rows(self):
+ system = data_setup.create_system(arch=u'i386', shared=True,
+ lab_controller=self.lc)
+ disk_key = Key.by_name(u'DISK')
+ system.key_values_int.extend([
+ Key_Value_Int(disk_key, 30718),
+ Key_Value_Int(disk_key, 140011),
+ Key_Value_Int(disk_key, 1048570)])
+ session.flush()
+ query = self.distro.systems_filter(self.user, """
+ <hostRequires>
+ <and>
+ <hostname op="=" value="%s" />
+ <key_value key="DISK" op=">" value="9000" />
+ </and>
+ </hostRequires>
+ """ % system.fqdn)
+ self.assertEquals(len(query.all()), 1)
+ # with the bug this count comes out as 3 instead of 1,
+ # which doesn't sound so bad...
+ # but when it's 926127 instead of 278, that's bad
+ self.assertEquals(query.count(), 1)
+
# https://bugzilla.redhat.com/show_bug.cgi?id=714974
def test_hypervisor(self):
baremetal = data_setup.create_system(arch=u'i386', shared=True,
Beaker 0.7.1 has been released. |