Bug 736439 - async uninventory job can fail due to timeout of overlord session when uninventorying lots of Resources
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ---
Assignee: Ian Springer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon3
Reported: 2011-09-07 17:15 UTC by Ian Springer
Modified: 2013-08-06 00:40 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Ian Springer 2011-09-07 17:15:29 UTC
I uninventoried a platform with around 1500 descendant Resources and then observed a slew of "PermissionException: The session ID for user [admin] is invalid!" warnings in the Server log (full stack trace below).

The issue appears to be that AsyncResourceDeleteJob looks up the overlord once during its instantiation and then tries to use that same overlord instance for the entire job execution. The overlord instance is only usable for 2 minutes, at which point its session expires (see SessionManager.OVERLORD_TIMEOUT), and uninventorying a large number of Resources can easily take longer than that.

We should change the code to look up a fresh overlord instance from the session manager for each Resource being uninventoried.

---

12:53:42,468 WARN  [AsyncResourceDeleteJob] Failed to get jobs for a resource being deleted [11073]; will not attempt to unschedule anything
org.rhq.enterprise.server.authz.PermissionException: The session ID for user [admin] is invalid!: invocation: method=public java.util.List<org.rhq.core.domain.operation.bean.ResourceOperationSchedule> org.rhq.enterprise.server.operation.OperationManagerBean.findScheduledResourceOperations(org.rhq.core.domain.auth.Subject,int) throws java.lang.Exception,context-data={}
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.buildPermissionException(RequiredPermissionsInterceptor.java:164)
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.buildPermissionException(RequiredPermissionsInterceptor.java:160)
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:100)
	at sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.jboss.ejb3.interceptor.InvocationContextImpl.proceed(InvocationContextImpl.java:118)
	at org.jboss.ejb3.interceptor.EJB3InterceptorsInterceptor.invoke(EJB3InterceptorsInterceptor.java:63)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.entity.TransactionScopedEntityManagerInterceptor.invoke(TransactionScopedEntityManagerInterceptor.java:54)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.AllowedOperationsInterceptor.invoke(AllowedOperationsInterceptor.java:47)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.tx.TxPolicy.invokeInOurTx(TxPolicy.java:79)
	at org.jboss.aspects.tx.TxInterceptor$Required.invoke(TxInterceptor.java:191)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.tx.TxPropagationInterceptor.invoke(TxPropagationInterceptor.java:95)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.stateless.StatelessInstanceInterceptor.invoke(StatelessInstanceInterceptor.java:62)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.security.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:77)
	at org.jboss.ejb3.security.Ejb3AuthenticationInterceptor.invoke(Ejb3AuthenticationInterceptor.java:110)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.ENCPropagationInterceptor.invoke(ENCPropagationInterceptor.java:46)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.asynchronous.AsynchronousInterceptor.invoke(AsynchronousInterceptor.java:106)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:240)
	at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:210)
	at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:84)
	at $Proxy431.findScheduledResourceOperations(Unknown Source)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.unscheduleJobs(AsyncResourceDeleteJob.java:110)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.uninventoryResource(AsyncResourceDeleteJob.java:95)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.executeJobCode(AsyncResourceDeleteJob.java:68)
	at org.rhq.enterprise.server.scheduler.jobs.AbstractStatefulJob.execute(AbstractStatefulJob.java:48)
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525)

Comment 1 Ian Springer 2011-09-07 17:24:51 UTC
Note that besides passing the potentially expired overlord to unscheduleJobs(), AsyncResourceDeleteJob also passes it to the SLSB call below:

  resourceManager.uninventoryResourceAsyncWork(overlord, doomedResourceId);

which would cause the actual uninventorying to fail.
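
To illustrate the failure mode and the proposed change, here is a minimal, self-contained sketch. It is not the RHQ code: ToySessionManager, its fake clock, the integer session IDs, and both run* methods are hypothetical stand-ins, and the only details taken from this report are the 2 minute overlord timeout and the idea of looking up the overlord once per job versus once per Resource.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OverlordSessionSketch {

    /** Hypothetical stand-in for the session manager; only the timeout is modelled. */
    public static class ToySessionManager {
        // 2 minutes, as stated in the report; the constant name here is an assumption.
        public static final long OVERLORD_TIMEOUT_MS = 2 * 60 * 1000L;

        private final Map<Integer, Long> expiryBySessionId = new HashMap<>();
        private int nextSessionId = 1;
        public long nowMs = 0; // fake clock so the example runs instantly

        /** Creates a new overlord session and returns its (toy) session ID. */
        public int getOverlord() {
            int id = nextSessionId++;
            expiryBySessionId.put(id, nowMs + OVERLORD_TIMEOUT_MS);
            return id;
        }

        public boolean isValid(int sessionId) {
            Long expiry = expiryBySessionId.get(sessionId);
            return expiry != null && nowMs < expiry;
        }
    }

    /** Problem pattern: one overlord looked up before the loop and reused throughout. */
    public static List<Integer> runWithCachedOverlord(ToySessionManager sm,
            List<Integer> resourceIds, long costPerResourceMs) {
        int overlord = sm.getOverlord();
        List<Integer> failed = new ArrayList<>();
        for (int resourceId : resourceIds) {
            if (!sm.isValid(overlord)) {
                failed.add(resourceId); // the real job would hit a PermissionException here
            }
            sm.nowMs += costPerResourceMs; // simulated time spent on this Resource
        }
        return failed;
    }

    /** Proposed pattern: a fresh overlord is looked up for each Resource. */
    public static List<Integer> runWithFreshOverlord(ToySessionManager sm,
            List<Integer> resourceIds, long costPerResourceMs) {
        List<Integer> failed = new ArrayList<>();
        for (int resourceId : resourceIds) {
            int overlord = sm.getOverlord();
            if (!sm.isValid(overlord)) {
                failed.add(resourceId);
            }
            sm.nowMs += costPerResourceMs;
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1500; i++) {
            ids.add(i);
        }
        // 1500 Resources at an assumed 100 ms each is 150 s, longer than the timeout.
        System.out.println("cached overlord failures: "
                + runWithCachedOverlord(new ToySessionManager(), ids, 100).size());
        System.out.println("fresh overlord failures:  "
                + runWithFreshOverlord(new ToySessionManager(), ids, 100).size());
    }
}
```

In this toy model the cached variant starts failing once the simulated clock passes the timeout, while the per-Resource lookup stays valid as long as no single Resource takes longer than the timeout on its own.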

Comment 2 Ian Springer 2011-09-08 15:10:20 UTC
[master be4d0cd] fixes this as described above.

Comment 3 Mike Foley 2011-09-09 15:15:02 UTC
Verified by performing the basic use case: inventory/uninventory/inventory.

Comment 4 Mike Foley 2012-02-07 19:31:39 UTC
marking VERIFIED BZs to CLOSED/CURRENTRELEASE

Comment 5 Mike Foley 2012-02-07 19:31:40 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE

