Bug 859547 - Candlepin 500 Internal server error at login and RH subscription page

Product: Red Hat Satellite
Component: Subscription Management
Version: 6.0.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Target Release: Unused
Status: CLOSED UPSTREAM
Reporter: Aaron Weitekamp <aweiteka>
Assignee: Eric Helms <ehelms>
QA Contact: Corey Welton <cwelton>
CC: bkearney, cpelland, inecas, jturner, mmccune, omaciel
Keywords: Triaged
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-09-19 18:11:43 UTC
Created attachment 615592 [details]
katello-debug logs
All content spontaneously queued for re-sync. It is unclear why, but restarting the tomcat6 service appears to clear the candlepin failure and allow katello web operations to proceed as normal. This may only be temporary, as the error notice eventually returned later in the day.

Created attachment 624863 [details]
catalina.out
Repro'd. catalina.out attached.
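The restart workaround described above can be sketched roughly as follows. This is an illustrative sketch, not a command sequence from the report: the DRY_RUN guard and the run() helper are additions so the sketch is safe to execute anywhere (with DRY_RUN=1, the default, it only prints the commands).

```shell
# Hypothetical sketch of the workaround noted above: restart tomcat6
# (which hosts candlepin), then re-check the katello stack.
# DRY_RUN=1 (the default here) only prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run service tomcat6 restart   # per the report, clears the candlepin 500s
run katello-service status    # confirm all services came back up
```

Setting DRY_RUN=0 would actually execute the restart, which requires root on the katello host.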
All services running:
[root@qeblade40 katello]# katello-service status
tomcat6 (pid 16301) is running...[ OK ]
httpd (pid 4470) is running...
mongod (pid 4395) is running...
httpd (pid 4470) is running...
qpidd (pid 7616) is running...
elasticsearch (pid 7483) is running...
katello (8300) is running.
katello (8310) is running.
katello (8316) is running.
katello (8350) is running.
katello (8378) is running.
katello (8406) is running.
katello (8442) is running.
katello (8470) is running.
katello (8498) is running.
katello (8534) is running.
katello (8562) is running.
katello (8566) is running.
katello (8602) is running.
katello (8654) is running.
katello (8690) is running.
katello (8718) is running.
katello (8771) is running.
delayed_job is running.
delayed_job_monitor is running.
Is postgres running? The logs seem to indicate that it failed at some point yesterday, and a lot of the exceptions in the log stem from that. The UnitOfWork one is a separate bug (that is probably manifesting from this), and has already been fixed in master.

postgres is going crazy:

[root@qeblade40 katello-configure]# ps -ef | grep postgre
postgres  4450     1  0 Oct09 ?      00:00:04 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres  4466  4450  0 Oct09 ?      00:00:05 postgres: writer process
postgres  4467  4450  0 Oct09 ?      00:00:04 postgres: wal writer process
postgres  4468  4450  0 Oct09 ?      00:00:03 postgres: autovacuum launcher process
postgres  4469  4450  0 Oct09 ?      00:00:10 postgres: stats collector process
postgres  4637  4450  0 Oct09 ?      00:00:02 postgres: aeolus conductor ::1(53006) idle
postgres  4684  4450  0 Oct09 ?      00:00:05 postgres: aeolus conductor ::1(53018) idle
postgres  4688  4450  0 Oct09 ?      00:00:05 postgres: aeolus conductor ::1(53019) idle
postgres  5117  4450  0 Oct09 ?      00:00:04 postgres: katellouser katelloschema ::1(53111) idle
postgres  5125  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53134) idle
postgres  5127  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53137) idle
postgres  5133  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53142) idle
postgres  5137  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53147) idle
postgres  5139  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53150) idle
postgres  5140  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53152) idle
postgres  5142  4450  0 Oct09 ?      00:00:00 postgres: katellouser katelloschema ::1(53159) idle
root      6357  2679  0 12:52 pts/3  00:00:00 grep postgre
postgres 22431  4450  0 05:03 ?      00:00:00 postgres: katellouser katelloschema ::1(40342) idle
postgres 27343  4450  0 08:41 ?      00:00:00 postgres: katellouser katelloschema ::1(45071) idle
postgres 27351  4450  0 08:42 ?      00:00:00 postgres: katellouser katelloschema ::1(45099) idle
postgres 27530  4450  0 08:43 ?      00:00:00 postgres: katellouser katelloschema ::1(45104) idle
postgres 27535  4450  0 08:44 ?      00:00:00 postgres: katellouser katelloschema ::1(45108) idle
postgres 27538  4450  0 08:44 ?      00:00:00 postgres: katellouser katelloschema ::1(45115) idle
postgres 28368  4450  0 08:45 ?      00:00:00 postgres: katellouser katelloschema ::1(45231) idle
postgres 28371  4450  0 08:45 ?      00:00:00 postgres: katellouser katelloschema ::1(45236) idle
postgres 28372  4450  0 08:45 ?      00:00:00 postgres: katellouser katelloschema ::1(45239) idle
postgres 28678  4450  0 08:55 ?      00:00:00 postgres: katellouser katelloschema ::1(45454) idle

Created attachment 625499 [details]
katello debug from second repro on separate machine
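The pile of idle katellouser backends in the ps output above can be counted with a one-liner like this. This is a diagnostic sketch added here, not a command from the report; the [p] bracket expression is a common trick that keeps the grep process itself out of the match.

```shell
# Count idle postgres backend processes, as seen in the ps output above.
# grep -c prints the match count; the [p] bracket stops grep from
# matching its own command line in the process list.
idle_count=$(ps -ef | grep -c '[p]ostgres:.*idle')
echo "idle postgres connections: $idle_count"
```

A steadily growing count here would suggest connections are being opened and never returned to the pool, which is consistent with the exceptions stemming from the postgres failure noted above.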
I've not been able to reproduce this issue lately. Does anyone have a recent build and system where it is occurring?

I'm unable to reproduce this issue and we are unclear what caused the problem. It's clearly a bug, but given the difficulty in reproducing it, and that when the problem occurs it does not block administration of the katello installation/content, I'm recommending we move this out to 2.0.0? and continue trying to reproduce the problem.
Should we manage to pinpoint a consistent reproducer, and better understand the impact, we can pull this back into a release.
> 11:15:47 jlaska: mmccune: it's definitely something we've been seeing, and it seems to resurface .... seems we've been unable to pinpoint the cause. I'd prefer moving it out to 2.0.0? as we continue to retest. Should we manage to pinpoint the reproducer, and understand the cause better ... we can determine whether something needs to pull back into 1.1. How's that?
> 11:19:59 mmccune: jlaska: sounds good
> 11:20:09 jlaska: mmccune: want me to toss that in the bz?
> 11:20:15 mmccune: jlaska: please!
Adjusting the severity based on comment #11.

The codebase has undergone significant changes since this was first reported. I am requesting a re-examination to decide whether this bug is still valid.

Getting rid of the 6.0.0 version, since that doesn't exist.

These bugs have been resolved in upstream projects for a period of months, so I'm mass-closing them as CLOSED:UPSTREAM. If this is a mistake, feel free to re-open.
Created attachment 615591 [details]
Candlepin 500 Internal server error

Description of problem:
A huge Candlepin 500 Internal server error is displayed when logging in and navigating to the Red Hat Subscriptions page. Cannot view the dashboard or access the subscription menu. It appears to be related to the RHN content "RHEL High Availability for RHEL Server 5.6 i386", which is not syncing.

Version-Release number of selected component (if applicable): 1.1

[root@qeblade41 log]# rpm -qa | grep katello
katello-cli-common-1.1.8-4.el6cf.noarch
katello-1.1.12-7.el6cf.noarch
katello-glue-pulp-1.1.12-7.el6cf.noarch
katello-all-1.1.12-7.el6cf.noarch
katello-qpid-broker-key-pair-1.0-1.noarch
katello-candlepin-cert-key-pair-1.0-1.noarch
katello-common-1.1.12-7.el6cf.noarch
katello-selinux-1.1.1-1.el6cf.noarch
katello-configure-1.1.9-3.el6cf.noarch
katello-certs-tools-1.1.8-1.el6cf.noarch
katello-glue-candlepin-1.1.12-7.el6cf.noarch
katello-qpid-client-key-pair-1.0-1.noarch
katello-cli-1.1.8-4.el6cf.noarch

How reproducible:
Investigating conditions