Bug 1022279
Summary: | Errata queries constantly running | ||||||
---|---|---|---|---|---|---|---|
Product: | [Community] Spacewalk | Reporter: | JDavis4102 | ||||
Component: | Server | Assignee: | Michael Mráka <mmraka> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Red Hat Satellite QA List <satqe-list> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 2.1 | CC: | akrherz, JDavis4102, jfenal, ldelouw, mhuth, mmello, xdmoon | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | spacewalk-java-2.1.64-1 | Doc Type: | Bug Fix | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | |||||||
: | 1025636 (view as bug list) | Environment: | |||||
Last Closed: | 2014-03-04 13:06:43 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1025636, 1069560 | ||||||
Attachments: |
|
Description
JDavis4102
2013-10-22 22:15:54 UTC
Created attachment 817650 [details]
query and query plan
I've attached the offending query (took 1600 seconds) and am experimenting with adding various indices to remove those massive sequence scans! Well, even setting "set enable_seqscan='off';" did not help. Looking at the query, it seems the SNV subquery is materializing a ton of data! The outer query limit of SNV.channel_id = and SNV.errata_id = does not make it down into the subquery. For a test, I modified the sub query and added the explicit (and rhnChannelErrata.errata_id = 19411). The query returned in a second or so.... I am not sure if that is the right thing to do.... See also: similar / possibly related bug currently being investigated for Red Hat Satellite 5.6 (the highlighted queries are different): Bug 1023557 - Satellite 5.6 postgres takes a long time to satellite-sync This seems to be (again) issue with PostgreSQL 8.4 optimizer which can't "unfold" inner sub-queries so it starts execution with cartesian joins in these sub-queries and then later on filtering out majority of the rows... From explain plan problematic parts, according to number of rows, are a) -> Hash (cost=488538.14..488538.14 rows=31777694 width=60) -> Hash Join (cost=31433.37..488538.14 rows=31777694 width=60) Hash Cond: (csc.channel_id = ocnp.channel_id) which corresponds to the third "not exits" sub-query (-- rule out records where other channel provides the same package name with higher evr), b) -> Materialize (cost=0.00..284594.16 rows=11626006 width=26) -> Nested Loop (cost=0.00..168334.10 rows=11626006 width=26) -> Index Scan using rhn_sc_sid_cid_uq on rhnserverchannel csc (cost=0.00..237.98 rows=3229 width=18) -> Index Scan using rhn_cp_cp_uq on rhnchannelpackage cp (cost=0.00..32.42 rows=1571 width=15) Index Cond: (cp.channel_id = csc.channel_id) which corresponds to the first "not exits" sub-query (-- rule out records where channels with lower id exist == get min) and c) -> Hash (cost=130497.67..130497.67 rows=5737384 width=60) -> Hash Join (cost=754.39..130497.67 rows=5737384 width=60) Hash Cond: (sp.package_arch_id = puac.package_upgrade_arch_id) -> Hash Join (cost=750.91..37615.95 rows=946784 width=60) Hash Cond: (sp.evr_id = sevr.id) which corresponds to the second "not exits" sub-query (-- rule out records where the server has the same package name with higher evr). For a) and b) we can add condition to filter out just necessary channels, i.e. AND CP.CHANNEL_ID = :channel_id resp. AND CSC.CHANNEL_ID = :channel_id The changes from comment #6 actually make huge difference on my test system from original QUERY PLAN --------------------------------------------------------------- Unique (cost=2172993.26..3044634.48 rows=1 width=18) (actual time=673.244..673.244 rows=0 loops=1) to new QUERY PLAN -------------------------------------------- Unique (cost=711198.84..711198.85 rows=1 width=18) (actual time=22.454..22.454 rows=0 loops=1) which is 3 times lower cost / faster. The c) is more tricky because there's no channel_id nor errata_id we can use to downsize query results. So we actually have to add some more tables to filter out redundant packages. E.g. -- rule out records where the server has the same package name with higher evr select 1 from rhnServerPackage sp, rhnPackageEVR sevr, rhnPackageUpgradeArchCompat puac +, RHNERRATAPACKAGE JEP, RHNPACKAGE JP where rhnServerPackage.server_id = sp.server_id +AND JEP.ERRATA_ID = :errata_id +AND JEP.PACKAGE_ID = JP.ID +AND JP.NAME_ID = SP.NAME_ID and rhnServerPackage.name_id = sp.name_id and sp.evr_id = sevr.id and rhnPackageEVR.evr <= sevr.evr and rhnServerPackage.package_arch_id = puac.package_arch_id and puac.package_upgrade_arch_id = sp.package_arch_id After comment #8 change explain plan shows even QUERY PLAN --------------------------------------------------------------- Unique (cost=431174.58..431174.59 rows=1 width=18) (actual time=626.219..626.219 rows=0 loops=1) which is 5 times better than original one. Fixed in spacewalk master with commit ccd106a5ca81fb7a45ddcbe9c01c6e7dcb4bcb40 1022279 - modified query to work better with PostgreSQL 8.4 optimizer commit e431dc48db2d5197a1ef1888c6ef14fcfe869730 1022279 - added hints for PostgreSQL 8.4 optimizer Issues seems to not be fixed Still having issues with the following Errata query: SELECT DISTINCT + | | S.id AS server_id, S.org_id org_id + | | FROM rhnServer S + | | JOIN ( + | | -- + | | select rhnChannelErrata.errata_id, rhnChannelErrata.channel_id, rhnServerChannel.server_id, rhnErrataPackage.package_id+ | | from rhnChannelErrata, rhnErrataPackage, rhnChannelNewestPackage, rhnPackageEVR, + | | rhnServerChannel, rhnServerPackage, rhnPackageUpgradeArchCompat + | | where rhnChannelErrata.errata_id = rhnErrataPackage.errata_id + | | -- + | | and rhnChannelErrata.channel_id = rhnChannelNewestPackage.channel_id + | | and rhnErrataPackage.package_id = rhnChannelNewestPackage.package_id + | | -- + | | and rhnChannelErrata.channel_id = rhnServerChannel.channel_id + | | and rhnChannelNewestPackage.name_id = rhnServerPackage.name_id + | | and rhnServerChannel.server_id = rhnServerPackage.server_id + | | -- + | | and rhnChannelNewestPackage.evr_id = rhnPackageEVR.id + | | -- + | | and rhnServerPackage.package_arch_id = rhnPackageUpgradeArchCompat.package_arch_id + | | and rhnPackageUpgradeArchCompat.package_upgrade_arch_id = rhnCh Switching MODIFIED Spacewalk bugs to ON_QA before 2.1 release. This bug is not resolved and needs further research. As the issue is not fixed. This bug needs to be reverted back to open and not modified. Spacewalk 2.1 has been released. https://fedorahosted.org/spacewalk/wiki/ReleaseNotes21 Spacewalk 2.1 has been released. https://fedorahosted.org/spacewalk/wiki/ReleaseNotes21 |