Bug 1751967

Summary: Candlepin crash after updating java to latest
Product: [Community] Candlepin
Reporter: Birkir Freyr <birkir.freyr.hjartarson>
Component: candlepin
Assignee: candlepin-bugs
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 2.6
CC: baptiste.agasse, bcourt, csnyder, ekohlvan, jkrajice, kkohli, nmoumoul, redakkan, skallesh
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: candlepin-2.6.8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-20 16:14:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
sosreport none

Description Birkir Freyr 2019-09-13 10:05:25 UTC
Description of problem:
After updating Java to the following versions:
java-1.8.0-openjdk-1:1.8.0.222.b10-1.el7_6.x86_64
java-1.8.0-openjdk-headless-1:1.8.0.222.b10-1.el7_6.x86_64

Candlepin's Tomcat crashes when registering a host.

Version-Release number of selected component (if applicable):
Candlepin 2.6.5
using Foreman 1.22.1 with Katello 3.12

How reproducible:
Very

Steps to Reproduce:
1. Update to java-1.8.0-openjdk-1:1.8.0.222.b10-1.el7_6.x86_64
2. Register host via subscription-manager
3. Crash

Actual results:
Candlepin tomcat crashes

Expected results:
Host registers without error.

Additional info:
Downgrading Java fixes the issue:
yum downgrade java-1.8.0-openjdk-1:1.8.0.222.b10-0.el7_6.x86_64 java-1.8.0-openjdk-headless-1:1.8.0.222.b10-0.el7_6.x86_64

There were also some SELinux-related lines in audit.log:

type=ANOM_ABEND msg=audit(1568229368.306:507): auid=4294967295 uid=53 gid=53 ses=4294967295 subj=system_u:system_r:tomcat_t:s0 pid=17703 comm="http-bio-8443-e" reason="memory violation" sig=6
type=SERVICE_STOP msg=audit(1568229368.348:508): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=tomcat comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

I think I managed to fix these, but it seemingly had no effect, so either I only managed to suppress the error or it's completely unrelated.
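
For anyone triaging similar audit.log entries, here is a minimal Python sketch that splits the ANOM_ABEND record quoted above into its key=value fields. The helper name parse_audit_line is hypothetical and not part of any audit tooling, and the regular expression only assumes the record layout shown in the excerpt above.

```python
import re
import signal


def parse_audit_line(line):
    """Return a dict of the key=value fields in one audit.log record.

    Hypothetical helper for illustration; assumes values are either
    double-quoted strings or bare tokens without spaces.
    """
    fields = {}
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line):
        fields[key] = value.strip('"')
    return fields


# The ANOM_ABEND record from the audit.log excerpt in this report.
sample = (
    'type=ANOM_ABEND msg=audit(1568229368.306:507): auid=4294967295 uid=53 '
    'gid=53 ses=4294967295 subj=system_u:system_r:tomcat_t:s0 pid=17703 '
    'comm="http-bio-8443-e" reason="memory violation" sig=6'
)

record = parse_audit_line(sample)
if record["type"] == "ANOM_ABEND":
    # Translate the numeric signal into its name for readability.
    sig_name = signal.Signals(int(record["sig"])).name
    print(record["comm"], record["reason"], sig_name)
```

On Linux, signal 6 is SIGABRT, so this record describes the process aborting rather than an SELinux denial as such.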

Original post on the Foreman community forum, where others report the same problem:
https://community.theforeman.org/t/candlepin-crash-after-upgrade-to-1-22-1/15386

Comment 1 Chris Snyder 2019-09-16 18:13:05 UTC
Are there any logs you can provide so that we can get a bit more detail on what candlepin was doing leading up to the crash?

The following would be helpful! Thank you!

/var/log/tomcat/catalina*.log

/var/log/candlepin/candlepin.log

Comment 2 Ewoud Kohl van Wijngaarden 2019-09-18 21:28:33 UTC
Created attachment 1616448 [details]
sosreport

It looks like we can reproduce this in our CI. I'm attaching the sosreport from the run at https://ci.centos.org/job/foreman-katello-nightly-test/568/

The last line from candlepin.log is:

2019-09-18 16:48:06,734 [thread=http-bio-8443-exec-2] [req=4a7ae4e4-097c-4528-941e-77cbcf7933fe, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=POST, uri=/candlepin/owners/Default_Organization/uebercert

In the Foreman installer we can also see this is where it dies. During database seeding it disconnects:

[ WARN 2019-09-18T16:48:08 verbose]  /Stage[main]/Foreman::Database/Foreman::Rake[db:seed]/Exec[foreman-rake-db:seed]/returns: ForemanTasks::TaskError: Task 027ebb30-7f35-49f0-9cc5-4cbfaeb8326a: RestClient::ServerBrokeConnection: Katello::Resources::Candlepin::Owner: Server broke connection  (POST /candlepin/owners/Default_Organization/uebercert)

In /var/log/messages:

Sep 18 16:47:59 pipeline-katello-server-nightly-centos7 server: INFO: Server startup in 37227 ms
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # A fatal error has been detected by the Java Runtime Environment:
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #  SIGSEGV (0xb) at pc=0x00007f20f98cda8f, pid=2334, tid=0x00007f2115633700
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # JRE version: OpenJDK Runtime Environment (8.0_222-b10) (build 1.8.0_222-b10)
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # Java VM: OpenJDK 64-Bit Server VM (25.222-b10 mixed mode linux-amd64 compressed oops)
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # Problematic frame:
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # C  [libplds4.so+0x1a8f]  PL_HashTableLookupConst+0xf
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # An error report file with more information is saved as:
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # /usr/share/tomcat/hs_err_pid2334.log
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # If you would like to submit a bug report, please visit:
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #   http://bugreport.java.com/bugreport/crash.jsp
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # The crash happened outside the Java Virtual Machine in native code.
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: # See problematic frame for where to report the bug.
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: #
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 systemd: tomcat.service: main process exited, code=killed, status=6/ABRT
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 systemd: Unit tomcat.service entered failed state.
Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 systemd: tomcat.service failed.
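
To compare crashes across hosts, the "Problematic frame" line can be pulled out of the syslog output. The following is a rough Python sketch, assuming the JVM banner layout shown above; the function name problematic_frame and the regular expression are illustrative only and have not been checked against other JVM versions.

```python
import re

# Matches a frame line such as:
#   "# C  [libplds4.so+0x1a8f]  PL_HashTableLookupConst+0xf"
FRAME_RE = re.compile(
    r'#\s+(?P<kind>[A-Za-z])\s+'
    r'\[(?P<lib>[^\]+]+)\+(?P<offset>0x[0-9a-f]+)\]\s+'
    r'(?P<symbol>\S+)'
)


def problematic_frame(lines):
    """Return the frame that follows '# Problematic frame:', or None."""
    seen_marker = False
    for line in lines:
        if "# Problematic frame:" in line:
            seen_marker = True
            continue
        if seen_marker:
            # The frame is expected on the line right after the marker.
            match = FRAME_RE.search(line)
            return match.groupdict() if match else None
    return None


# Two lines taken from the /var/log/messages excerpt in this comment.
messages = [
    "Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: "
    "# Problematic frame:",
    "Sep 18 16:48:08 pipeline-katello-server-nightly-centos7 server: "
    "# C  [libplds4.so+0x1a8f]  PL_HashTableLookupConst+0xf",
]

frame = problematic_frame(messages)
print(frame)
```

The leading "C" marks a native frame, which matches the banner's statement that the crash happened outside the Java Virtual Machine in native code.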

It looks like we don't capture the hs_err_pid2334.log file because it's in /usr/share/tomcat.
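
One way a CI job could pick up these files is to glob for them explicitly. This is a minimal Python sketch under stated assumptions: find_hs_err_logs is a hypothetical helper, and the default directory is simply the path printed in the log above, which may differ on other installations.

```python
import glob
import os


def find_hs_err_logs(directory="/usr/share/tomcat"):
    """Return the JVM fatal error logs (hs_err_pid*.log) in a directory.

    Hypothetical helper; the default directory is taken from the crash
    banner in this report and is an assumption for other setups.
    """
    pattern = os.path.join(directory, "hs_err_pid*.log")
    return sorted(glob.glob(pattern))


for path in find_hs_err_logs():
    print(path)
```

The returned paths could then be copied into whatever artifact directory the CI run already archives.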

Comment 3 Nikos Moumoulidis 2019-09-19 07:43:08 UTC
This looks very similar to https://bugzilla.redhat.com/show_bug.cgi?id=1724115, which was already fixed in candlepin-2.6.8.