Bug 1890487
| Summary: | Candlepin services go down after upgrade | ||
|---|---|---|---|
| Product: | Red Hat Satellite | Reporter: | Devendra Singh <desingh> |
| Component: | Infrastructure | Assignee: | satellite6-bugs <satellite6-bugs> |
| Status: | CLOSED NOTABUG | QA Contact: | Lukas Pramuk <lpramuk> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.8.0 | CC: | bcourt, ehelms, inecas, smallamp, zhunting |
| Target Milestone: | 6.9.0 | Keywords: | Regression, Triaged, Upgrades |
| Target Release: | Unused | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Last Closed: | 2021-01-18 15:43:00 UTC | Type: | Bug |
Adding regression keyword. Snap 19 looked good.

Was this fixed in any Candlepin version?

The most common cause of this is the system hitting an 'Out of memory' exception; tomcat is the first service to be killed when that happens on a Satellite. Looking through the sosreport I see that condition:

    var/log/messages-20201115:Nov 14 05:01:27 qe-sat6-upgrade-rhel7 kernel: Out of memory: Kill process 30358 (java) score 84 or sacrifice child

Devendra, can you re-test this and, if you run into the same Candlepin connection failure, check to see if an OOM condition was encountered? If so, then I would opt to close this as not a bug.

(In reply to Eric Helms from comment #7)
> The most common cause of this is the system hitting an 'Out of memory'
> exception, tomcat is the first service to be killed when that happens on a
> Satellite. Looking through the sosreport I see that condition:
>
> var/log/messages-20201115:Nov 14 05:01:27 qe-sat6-upgrade-rhel7 kernel: Out
> of memory: Kill process 30358 (java) score 84 or sacrifice child
>
> Devendra, can you re-test this and if you run into the same Candlepin
> connection failure check to see if an OOM condition was encountered? If so,
> then I would opt to close this not a bug.

I re-tested but didn't see the OOM problem; the last time I saw it was in 6.8.0 Snap20.

If this issue has not been seen since then, is it still an issue, and can it still be reproduced?

(In reply to Zach Huntington-Meath from comment #9)
> If this issue is not seen since then is this still an issue or can it be
> reproduced still?

No, the issue is intermittent. I saw it in 6.8 Snap20 and 6.8.1 Snap3 (not seen in 6.8.1 Snap1, Snap2, and Snap4).

I tested the upgrade with a recent 6.8.1 snap (6.7.4 --> 6.8.1 Snap4 and 6.8.0 --> 6.8.1 Snap4) but didn't see this issue.
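
For anyone re-testing, the OOM check requested in the comments above can be scripted. The following is only a minimal sketch in Python: the function name `find_oom_kills` and the regular expression are illustrative assumptions, and the pattern matches only the kernel message format quoted in this report (other kernel versions may word the message differently).

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed pattern: matches the OOM-killer line exactly as quoted in this bug.
OOM_RE = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\) score (?P<score>\d+)"
)


class OomKill(NamedTuple):
    pid: int
    name: str
    score: int
    line: str


def find_oom_kills(lines: Iterable[str]) -> List[OomKill]:
    """Return one OomKill record per log line that reports an OOM kill."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append(
                OomKill(
                    pid=int(match["pid"]),
                    name=match["name"],
                    score=int(match["score"]),
                    line=line.rstrip("\n"),
                )
            )
    return kills


# The log line quoted from the sosreport in this bug.
SAMPLE = (
    "Nov 14 05:01:27 qe-sat6-upgrade-rhel7 kernel: Out of memory: "
    "Kill process 30358 (java) score 84 or sacrifice child"
)

if __name__ == "__main__":
    for kill in find_oom_kills([SAMPLE]):
        print(f"OOM kill: pid={kill.pid} process={kill.name} score={kill.score}")
```

To scan a real log, pass an open file object instead of the sample list, for example `find_oom_kills(open("/var/log/messages"))`. A `java` entry in the result would be consistent with tomcat having been killed by the OOM killer.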
Description of problem:
Candlepin services go down after upgrade.

Version-Release number of selected component (if applicable):
6.8 Snap20

How reproducible:
1/1

Steps to Reproduce:
1. Upgrade Satellite and Capsule from 6.7.4 to 6.8.0 Snap20.
2. Execute the test suite to validate the components.
3. During the execution of the test cases, the Candlepin services go down:

    candlepin:
        Status:          FAIL
        Server Response: Message: Failed to open TCP connection to localhost:8443 (Connection refused - connect(2) for "localhost" port 8443)
    candlepin_events:
        Status:          FAIL
        message:         Not running
        Server Response: Duration: 6ms
    candlepin_auth:
        Status:          FAIL
        Server Response: Message: A backend service [ Candlepin ] is unreachable

Actual results:
Candlepin services go down after upgrade.

Expected results:
Candlepin services should not go down after the upgrade.

Additional info:
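
The `candlepin` check in the description fails because nothing accepts connections on localhost:8443. A quick way to confirm that symptom independently of the test suite is a plain TCP connect. This is a minimal sketch in Python; the helper name `port_is_open` and the 3-second timeout are assumptions for illustration, not part of any Satellite tooling.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the error message in this report.
    state = "open" if port_is_open("localhost", 8443) else "closed"
    print(f"localhost:8443 is {state}")
```

A closed port only shows that the listener is gone. It does not say why, so it is worth pairing this with a look at the system log for an OOM kill of the `java` process.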