Bug 1011251

Summary: Sat 5.6 on postgres sometimes is slow listing packages for the system
Product: Red Hat Satellite 5 Reporter: Stephen Herr <sherr>
Component: WebUIAssignee: Tomas Lestach <tlestach>
Status: CLOSED CURRENTRELEASE QA Contact: Ales Dujicek <adujicek>
Severity: medium Docs Contact:
Priority: high    
Version: 560CC: adujicek, cperry, mmraka
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-01-13 09:54:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1127641    

Description Stephen Herr 2013-09-23 21:35:40 UTC
Description of problem:
Sometimes the package-list page for the system details takes a long time to load, like 40 or 60 seconds. 

Version-Release number of selected component (if applicable):
5.6

How reproducible:
sometimes

Steps to Reproduce:
1. register a system
2. go to Systems -> [system] -> Software -> Packages -> List / Remove

Actual results:
Takes like a minute to load.

Expected results:
Takes like 2 seconds to load.

Additional info:
This is very strange. Some systems seem to exhibit this behavior and some don't. Furthermore I have seen some systems start out with this problem, and then magically fix themselves later. Executing just the query that that page is doing directly against the database yields very different times between servers that are experiencing this problem and those that aren't. Doing an explain plan on the two otherwise identical queries (except for system_id) yields very different explain plans. This may be related in a general sense to bug 999453, but only in a "postgres 8 query optimizer is not very smart" kind of way. Query and explain plans follow below. It's interesting that the estimated minimum cost of the fast plan is a lot higher than the estimated minimum cost of the slow plan. Other maybe-relevant information is that I ran a 'db-control gather-stats' on this database a couple days ago, and the profiles that I've seen exhibit the problem are newer.

Query in question:
  SELECT scp.package_id,                                                                 pn.id || '|' || pe.id || '|' || pa.id as id_combo,
         pn.name || '-' || pe.version || '-' || pe.release as nvre,
         pn.name || '-' || evr_t_as_vre_simple(pe.evr) || '.' || pa.label as nvrea,
         pn.name as name,
         pe.version as version,
         pe.release as release,
         pe.epoch as epoch,
         pn.id as name_id,
         pe.id as evr_id,
         pa.label as arch,
         sp.installTime
    FROM rhnServerPackage sp
         LEFT OUTER JOIN (select sc.server_id,
                                 cp.package_id,
                                 p.name_id,
                                 p.evr_id,
                                 p.package_arch_id
                            FROM rhnPackage p,
                                 rhnServerChannel sc
                          JOIN rhnChannelPackage cp
                              ON (sc.channel_id = cp.channel_id)
                           WHERE cp.package_id = p.id
                         ) scp on (scp.server_id = sp.server_id and
                                   sp.name_id = scp.name_id and
                                   sp.evr_id = scp.evr_id and
                                   sp.package_arch_id = scp.package_arch_id)
         LEFT OUTER JOIN rhnPackageName pn ON (pn.id = sp.name_id)
         LEFT OUTER JOIN rhnPackageArch pa ON (pa.id = sp.package_arch_id)
         LEFT OUTER JOIN rhnPackageEvr pe ON (pe.id = sp.evr_id)
   WHERE sp.server_id = :sid
   ORDER BY nvre;

Explain plan for fast servers:
 Sort  (cost=14254.97..14257.86 rows=1157 width=277)
   Sort Key: ((((((pn.name)::text || '-'::text) || (pe.version)::text) || '-'::text) || (pe.release)::text))
   ->  Hash Left Join  (cost=13636.45..14196.10 rows=1157 width=277)
         Hash Cond: (sp.name_id = pn.id)
         ->  Merge Left Join  (cost=12693.01..12885.31 rows=1157 width=260)
               Merge Cond: ((sp.name_id = p.name_id) AND (sp.evr_id = p.evr_id) AND (sp.package_arch_id = p.package_arch
_id))
               Join Filter: (sc.server_id = sp.server_id)
               ->  Sort  (cost=1130.91..1133.80 rows=1157 width=277)
                     Sort Key: sp.name_id, sp.evr_id, sp.package_arch_id
                     ->  Hash Left Join  (cost=739.75..1072.04 rows=1157 width=277)
                           Hash Cond: (sp.evr_id = pe.id)
                           ->  Hash Left Join  (cost=67.24..370.61 rows=1157 width=218)
                                 Hash Cond: (sp.package_arch_id = pa.id)
                                 ->  Bitmap Heap Scan on rhnserverpackage sp  (cost=65.23..352.69 rows=1157 width=40)
                                       Recheck Cond: (server_id = 1000010019::numeric)
                                       ->  Bitmap Index Scan on rhn_sp_snep_uq  (cost=0.00..64.94 rows=1157 width=0)
                                             Index Cond: (server_id = 1000010019::numeric)
                                 ->  Hash  (cost=1.45..1.45 rows=45 width=178)
                                       ->  Seq Scan on rhnpackagearch pa  (cost=0.00..1.45 rows=45 width=178)
                           ->  Hash  (cost=422.78..422.78 rows=19978 width=59)
                                 ->  Seq Scan on rhnpackageevr pe  (cost=0.00..422.78 rows=19978 width=59)
               ->  Sort  (cost=11562.10..11607.28 rows=18072 width=40)
                     Sort Key: p.name_id, p.evr_id, p.package_arch_id
                     ->  Hash Join  (cost=7146.63..10284.28 rows=18072 width=40)
                           Hash Cond: (cp.package_id = p.id)
                           ->  Merge Join  (cost=0.00..2075.03 rows=18072 width=19)
                                 Merge Cond: (sc.channel_id = cp.channel_id)
                                 ->  Index Scan using rhn_sc_cid_idx on rhnserverchannel sc  (cost=0.00..12.85 rows=2 width=18)
                                       Filter: (server_id = 1000010019::numeric)
                                 ->  Index Scan using rhn_cp_cp_uq on rhnchannelpackage cp  (cost=0.00..7108.73 rows=106
270 width=15)
                           ->  Hash  (cost=5891.17..5891.17 rows=64917 width=29)
                                 ->  Seq Scan on rhnpackage p  (cost=0.00..5891.17 rows=64917 width=29)
         ->  Hash  (cost=542.64..542.64 rows=32064 width=24)
               ->  Seq Scan on rhnpackagename pn  (cost=0.00..542.64 rows=32064 width=24)

Explain plan for slow servers:
 Sort  (cost=9203.00..9203.01 rows=1 width=277)
   Sort Key: ((((((pn.name)::text || '-'::text) || (pe.version)::text) || '-'::text) || (pe.release)::text))
   ->  Nested Loop Left Join  (cost=7278.58..9202.99 rows=1 width=277)
         ->  Nested Loop Left Join  (cost=7278.58..9194.41 rows=1 width=260)
               Join Filter: (pa.id = sp.package_arch_id)
               ->  Nested Loop Left Join  (cost=7278.58..9192.40 rows=1 width=89)
                     ->  Nested Loop Left Join  (cost=7278.58..9184.12 rows=1 width=37)
                           Join Filter: ((sc.server_id = sp.server_id) AND (sp.name_id = p.name_id) AND (sp.evr_id = p.e
vr_id) AND (sp.package_arch_id = p.package_arch_id))
                           ->  Index Scan using rhn_sp_snep_uq on rhnserverpackage sp  (cost=0.00..8.28 rows=1 width=40)
                                 Index Cond: (server_id = 1000010072::numeric)
                           ->  Hash Join  (cost=7278.58..8995.12 rows=9036 width=40)
                                 Hash Cond: (cp.package_id = p.id)
                                 ->  Nested Loop  (cost=131.95..1095.17 rows=9036 width=19)
                                       ->  Index Scan using rhn_sc_sid_cid_uq on rhnserverchannel sc  (cost=0.00..8.27 r
ows=1 width=18)                  rhnServerChannel sc
                                             Index Cond: (server_id = 1000010072::numeric)
                                       ->  Bitmap Heap Scan on rhnchannelpackage cp  (cost=131.95..1035.82 rows=4087 wid
th=15)ER BY nvre;r_id = 1000010072eEvr pe ON (pe.id = sp.evr_id)e_arch_id)d)
                                             Recheck Cond: (cp.channel_id = sc.channel_id)
                                             ->  Bitmap Index Scan on rhn_cp_cp_uq  (cost=0.00..130.93 rows=4087 width=0
)
                                                   Index Cond: (cp.channel_id = sc.channel_id)
                                 ->  Hash  (cost=5891.17..5891.17 rows=64917 width=29)
                                       ->  Seq Scan on rhnpackage p  (cost=0.00..5891.17 rows=64917 width=29)
                     ->  Index Scan using rhn_pe_id_pk on rhnpackageevr pe  (cost=0.00..8.27 rows=1 width=59)
                           Index Cond: (pe.id = sp.evr_id)
               ->  Seq Scan on rhnpackagearch pa  (cost=0.00..1.45 rows=45 width=178)
         ->  Index Scan using rhn_pn_id_pk on rhnpackagename pn  (cost=0.00..8.27 rows=1 width=24)
               Index Cond: (pn.id = sp.name_id)

Comment 1 Clifford Perry 2014-11-13 10:26:30 UTC
We think this is fixed with 5.7, moving to postgreSQL 9.2. Moving to ON_QA with no code changes. If it fails QA, we will re-evaluate and maybe not fix till post 5.7 GA.

Cliff

Comment 3 Clifford Perry 2015-01-13 09:54:41 UTC
With the release of Red Hat Satellite 5.7 on January 12th 2015 this bug is
being moved to a Closed Current Release state. 

The Satellite 5.7 GA Errata:
 - https://rhn.redhat.com/errata/RHSA-2015-0033.html 

Satellite 5.7 Release Notes:
 -
https://access.redhat.com/documentation/en-US/Red_Hat_Satellite/5.7/html-single/Release_Notes/index.html

Satellite Customer Portal Blog announcement for release:
 - https://access.redhat.com/blogs/1169563/posts/1315743 

Cliff