Bug 859547

Summary: Candlepin 500 Internal server error at login and RH subscription page
Product: Red Hat Satellite
Reporter: Aaron Weitekamp <aweiteka>
Component: Subscription Management
Assignee: Eric Helms <ehelms>
Status: CLOSED UPSTREAM
QA Contact: Corey Welton <cwelton>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.0.1
CC: bkearney, cpelland, inecas, jturner, mmccune, omaciel
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-19 18:11:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                              Flags
Candlepin 500 Internal server error                      none
katello-debug logs                                       none
catalina.out                                             none
katello debug from second repro on separate machine      none

Description Aaron Weitekamp 2012-09-21 19:41:53 UTC
Created attachment 615591
Candlepin 500 Internal server error

Description of problem:
A large Candlepin 500 Internal server error notice is displayed when logging in and when navigating to the Red Hat Subscriptions page. The dashboard cannot be viewed and the subscription menu cannot be accessed. It appears to be related to the RHN content "RHEL High Availability for RHEL Server 5.6 i386", which is not syncing.

Version-Release number of selected component (if applicable):
1.1
[root@qeblade41 log]# rpm -qa |grep katello
katello-cli-common-1.1.8-4.el6cf.noarch
katello-1.1.12-7.el6cf.noarch
katello-glue-pulp-1.1.12-7.el6cf.noarch
katello-all-1.1.12-7.el6cf.noarch
katello-qpid-broker-key-pair-1.0-1.noarch
katello-candlepin-cert-key-pair-1.0-1.noarch
katello-common-1.1.12-7.el6cf.noarch
katello-selinux-1.1.1-1.el6cf.noarch
katello-configure-1.1.9-3.el6cf.noarch
katello-certs-tools-1.1.8-1.el6cf.noarch
katello-glue-candlepin-1.1.12-7.el6cf.noarch
katello-qpid-client-key-pair-1.0-1.noarch
katello-cli-1.1.8-4.el6cf.noarch

How reproducible:
Investigating conditions

Comment 1 Aaron Weitekamp 2012-09-21 19:45:08 UTC
Created attachment 615592
katello-debug logs

Comment 3 Aaron Weitekamp 2012-09-21 19:58:07 UTC
All content spontaneously queued for re-sync.

Comment 4 James Laska 2012-09-24 19:53:39 UTC
Unclear why, but it appears that restarting the tomcat6 service clears the candlepin failure and allows katello web operations to proceed as normal.  This may only be temporary, as the error notice eventually returned later in the day.

Comment 5 Aaron Weitekamp 2012-10-10 13:14:35 UTC
Created attachment 624863
catalina.out

Reproduced. catalina.out attached.

All services running:
[root@qeblade40 katello]# katello-service status
tomcat6 (pid 16301) is running...[  OK  ]
httpd (pid  4470) is running...
mongod (pid 4395) is running...
httpd (pid  4470) is running...
qpidd (pid  7616) is running...
elasticsearch (pid  7483) is running...
katello (8300) is running.
katello (8310) is running.
katello (8316) is running.
katello (8350) is running.
katello (8378) is running.
katello (8406) is running.
katello (8442) is running.
katello (8470) is running.
katello (8498) is running.
katello (8534) is running.
katello (8562) is running.
katello (8566) is running.
katello (8602) is running.
katello (8654) is running.
katello (8690) is running.
katello (8718) is running.
katello (8771) is running.
delayed_job is running.
delayed_job_monitor is running.

Comment 6 James Bowes 2012-10-10 13:52:47 UTC
Is postgres running?

The logs seem to indicate that it failed at some point yesterday.

A lot of the exceptions in the log stem from that. The unitofwork one is a separate bug (probably manifesting because of this one), and has already been fixed in master.
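
One quick way to answer the "is postgres running?" question from a script is to test whether anything accepts TCP connections on the database port. The following is only a minimal sketch, not part of katello or candlepin: the function name `port_is_open` is hypothetical, and the host and port defaults are assumptions (5432 is the PostgreSQL default port). A successful connection shows only that something is listening, not that the database is healthy.

```python
import socket


def port_is_open(host="localhost", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted.

    Hypothetical helper for illustration; the defaults assume a local
    PostgreSQL instance listening on its default port.
    """
    try:
        # create_connection raises OSError (for example, connection
        # refused or timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "accepting" if port_is_open() else "NOT accepting"
    print("postgres port is " + state + " connections")
```

Running this periodically and logging the result could help pin down when the database became unreachable.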

Comment 7 Aaron Weitekamp 2012-10-10 16:52:29 UTC
postgres is running, but with a lot of idle connections:
[root@qeblade40 katello-configure]# ps -ef |grep postgre
postgres  4450     1  0 Oct09 ?        00:00:04 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres  4466  4450  0 Oct09 ?        00:00:05 postgres: writer process                          
postgres  4467  4450  0 Oct09 ?        00:00:04 postgres: wal writer process                      
postgres  4468  4450  0 Oct09 ?        00:00:03 postgres: autovacuum launcher process             
postgres  4469  4450  0 Oct09 ?        00:00:10 postgres: stats collector process                 
postgres  4637  4450  0 Oct09 ?        00:00:02 postgres: aeolus conductor ::1(53006) idle        
postgres  4684  4450  0 Oct09 ?        00:00:05 postgres: aeolus conductor ::1(53018) idle        
postgres  4688  4450  0 Oct09 ?        00:00:05 postgres: aeolus conductor ::1(53019) idle        
postgres  5117  4450  0 Oct09 ?        00:00:04 postgres: katellouser katelloschema ::1(53111) idle
postgres  5125  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53134) idle
postgres  5127  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53137) idle
postgres  5133  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53142) idle
postgres  5137  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53147) idle
postgres  5139  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53150) idle
postgres  5140  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53152) idle
postgres  5142  4450  0 Oct09 ?        00:00:00 postgres: katellouser katelloschema ::1(53159) idle
root      6357  2679  0 12:52 pts/3    00:00:00 grep postgre
postgres 22431  4450  0 05:03 ?        00:00:00 postgres: katellouser katelloschema ::1(40342) idle
postgres 27343  4450  0 08:41 ?        00:00:00 postgres: katellouser katelloschema ::1(45071) idle
postgres 27351  4450  0 08:42 ?        00:00:00 postgres: katellouser katelloschema ::1(45099) idle
postgres 27530  4450  0 08:43 ?        00:00:00 postgres: katellouser katelloschema ::1(45104) idle
postgres 27535  4450  0 08:44 ?        00:00:00 postgres: katellouser katelloschema ::1(45108) idle
postgres 27538  4450  0 08:44 ?        00:00:00 postgres: katellouser katelloschema ::1(45115) idle
postgres 28368  4450  0 08:45 ?        00:00:00 postgres: katellouser katelloschema ::1(45231) idle
postgres 28371  4450  0 08:45 ?        00:00:00 postgres: katellouser katelloschema ::1(45236) idle
postgres 28372  4450  0 08:45 ?        00:00:00 postgres: katellouser katelloschema ::1(45239) idle
postgres 28678  4450  0 08:55 ?        00:00:00 postgres: katellouser katelloschema ::1(45454) idle
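
To make a listing like the one above easier to compare between reproductions, the backend processes can be tallied by user, database, and state. This is a rough sketch under assumptions: the function name `count_postgres_backends` is hypothetical, and it relies on the backend process title format `postgres: <user> <database> <client>(<port>) <state>` seen in the `ps -ef` output.

```python
from collections import Counter


def count_postgres_backends(ps_output):
    """Tally PostgreSQL backend processes by (user, database, state).

    Hypothetical helper; expects text in the format printed by `ps -ef`.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        # Only lines with a "postgres: ..." process title are of interest;
        # this skips the postmaster itself and the grep process.
        _, sep, title = line.partition("postgres: ")
        if not sep:
            continue
        fields = title.split()
        # Client backends carry a "<client>(<port>)" field in third place;
        # helpers such as "writer process" do not, so they are skipped.
        if len(fields) < 4 or "(" not in fields[2]:
            continue
        user, database = fields[0], fields[1]
        state = " ".join(fields[3:])
        counts[(user, database, state)] += 1
    return counts


if __name__ == "__main__":
    import subprocess

    listing = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    for (user, database, state), n in sorted(count_postgres_backends(listing).items()):
        print(user, database, state, n)
```

A growing count of idle connections for the same user and database between runs would be one hint of connections not being released, although this listing alone does not prove that.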

Comment 8 Aaron Weitekamp 2012-10-11 12:36:53 UTC
Created attachment 625499
katello debug from second repro on separate machine

Comment 10 Mike McCune 2012-10-23 14:26:59 UTC
I've not been able to reproduce this issue lately. Does anyone have a recent build and system where it is occurring?

Comment 11 James Laska 2012-10-23 15:33:59 UTC
I'm unable to reproduce this issue and we are unclear what caused the problem.  It's clearly a bug, but given the difficulty in reproducing the problem, and that when the problem occurs, it does not block administration of the katello installation/content ... I'm recommending we move this out to 2.0.0? and continue trying to reproduce the problem.

Should we manage to pinpoint a consistent reproducer, and better understand the impact, we can pull this back into a release.

> 11:15:47   jlaska: mmccune: it's definitely something we've been seeing, and it seems to resurface .... seems we've been unable to pinpoint the cause.  I'd prefer moving it out to 2.0.0? as we continue to retest.  Should we manage to pinpoint the reproducer, and understand the cause better ... we can determine whether something needs to pull back into 1.1.  How's that?
> 11:19:59   mmccune: jlaska: sounds good
> 11:20:09   jlaska: mmccune: want me to toss that in the bz?
> 11:20:15   mmccune: jlaska: please!

Comment 12 James Laska 2012-10-24 18:49:07 UTC
Adjusting the severity based on comment#11

Comment 14 Eric Helms 2013-05-13 20:51:11 UTC
The codebase has undergone significant changes since this was first reported. I am requesting a re-examination to decide if this bug is valid or not.

Comment 15 Mike McCune 2013-08-16 17:52:55 UTC
Getting rid of the 6.0.0 version since it doesn't exist.

Comment 16 Mike McCune 2013-09-19 18:11:43 UTC
These bugs have been resolved in upstream projects for a period of months so I'm mass-closing them as CLOSED:UPSTREAM.  If this is a mistake feel free to re-open.