Created attachment 1453756
Excerpt from server.log showing failed RHQScheduler_Worker-1 thread

Description of problem:
Baseline calculation is failing, resulting in the transaction being rolled back. It is not clear what causes this or at what point it happens. The attached log excerpt includes the relevant debug output for the baseline calc job thread.

The stack trace:

18:01:59,829 WARN  [com.arjuna.ats.arjuna] (RHQScheduler_Worker-1) ARJUNA012125: TwoPhaseCoordinator.beforeCompletion - failed for SynchronizationImple< 0:ffff0a7bca12:-3211a2d0:1c3d1a7c:835f1, org.hibernate.engine.transaction.synchronization.internal.RegisteredSynchronization@b5146ac2 >: java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing: org.rhq.core.domain.measurement.MeasurementSchedule.baseline -> org.rhq.core.domain.measurement.MeasurementBaseline
	at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1384) [hibernate-entitymanager-4.2.27.Final-redhat-1.jar:4.2.27.Final-redhat-1]
	at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310) [hibernate-entitymanager-4.2.27.Final-redhat-1.jar:4.2.27.Final-redhat-1]
	...
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerLocal$$$view11.saveNewBaselines(Unknown Source) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerBean.saveNewBaselinesChunk(MeasurementBaselineManagerBean.java:311) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerBean.saveNewBaselines(MeasurementBaselineManagerBean.java:301) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerBean.calculateBaselines(MeasurementBaselineManagerBean.java:280) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	...
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerLocal$$$view11.calculateBaselines(Unknown Source) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerBean.calculateAutoBaselines(MeasurementBaselineManagerBean.java:206) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	...
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerLocal$$$view11.calculateAutoBaselines(Unknown Source) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerBean.calculateAutoBaselines(MeasurementBaselineManagerBean.java:134) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	...
	at org.rhq.enterprise.server.measurement.MeasurementBaselineManagerLocal$$$view11.calculateAutoBaselines(Unknown Source) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.scheduler.jobs.DataCalcJob.calculateAutoBaselines(DataCalcJob.java:116) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.scheduler.jobs.DataCalcJob.executeJobCode(DataCalcJob.java:84) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.rhq.enterprise.server.scheduler.jobs.AbstractStatefulJob.execute(AbstractStatefulJob.java:48) [rhq-server.jar:4.12.0.JON330GA-redhat-9]
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202) [quartz-1.6.5.jar:1.6.5]
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525) [quartz-1.6.5.jar:1.6.5]
Caused by: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing: org.rhq.core.domain.measurement.MeasurementSchedule.baseline -> org.rhq.core.domain.measurement.MeasurementBaseline

Version-Release number of selected component (if applicable):
3.3.10
Fixed in master (assuming I guessed the correct cause from this quite limited information):

commit 75211fae4e10f846023158091023b3e87677ea4a (HEAD -> master, upstream/master)
Author: Michael Burman <miburman>
Date:   Mon Jun 25 13:28:25 2018 +0300

    [BZ 1594305] Do not change schedule instance ownership if the baseline includes NaNs. Move the NaN detection before the set methods (instead of only avoiding persist)
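
For anyone reading along, the reordering described in the commit message can be sketched roughly as below. This is not the RHQ code: Baseline, Schedule, BaselineLinkSketch and the persisted list are hypothetical stand-ins for MeasurementBaseline, MeasurementSchedule, the manager bean and the persistence context, and the sketch assumes the failure comes from the schedule being pointed at a baseline whose persist was skipped because of a NaN.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MeasurementBaseline (the real class is a JPA entity).
class Baseline {
    final double min;
    final double max;
    final double mean;
    Schedule schedule;

    Baseline(double min, double max, double mean) {
        this.min = min;
        this.max = max;
        this.mean = mean;
    }

    boolean hasNaN() {
        return Double.isNaN(min) || Double.isNaN(max) || Double.isNaN(mean);
    }
}

// Hypothetical stand-in for MeasurementSchedule.
class Schedule {
    Baseline baseline;
}

final class BaselineLinkSketch {
    // Assumed old ordering: link both sides first, then only skip the persist on NaN.
    static boolean linkThenCheck(Schedule s, Baseline b, List<Baseline> persisted) {
        s.baseline = b;
        b.schedule = s;
        if (b.hasNaN()) {
            return false; // persist skipped, but s.baseline still points at b
        }
        persisted.add(b);
        return true;
    }

    // Ordering described in the commit message: detect NaN before any setter runs.
    static boolean checkThenLink(Schedule s, Baseline b, List<Baseline> persisted) {
        if (b.hasNaN()) {
            return false; // schedule is left untouched
        }
        s.baseline = b;
        b.schedule = s;
        persisted.add(b);
        return true;
    }

    // True if the schedule references a baseline that was never "persisted".
    // ArrayList.contains falls back to identity here because Baseline has no equals().
    static boolean referencesUnsaved(Schedule s, List<Baseline> persisted) {
        return s.baseline != null && !persisted.contains(s.baseline);
    }

    public static void main(String[] args) {
        List<Baseline> persisted = new ArrayList<>();

        Schedule first = new Schedule();
        linkThenCheck(first, new Baseline(Double.NaN, 1.0, 0.5), persisted);
        System.out.println("link-then-check leaves unsaved reference: "
                + referencesUnsaved(first, persisted));

        Schedule second = new Schedule();
        checkThenLink(second, new Baseline(Double.NaN, 1.0, 0.5), persisted);
        System.out.println("check-then-link leaves unsaved reference: "
                + referencesUnsaved(second, persisted));
    }
}
```

With the first ordering the schedule still holds a reference to an object that was never saved, which is the state Hibernate reports at flush time as TransientPropertyValueException. With the second ordering a NaN baseline never touches the schedule.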
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2930