Bug 736439

Summary: async uninventory job can fail due to timeout of overlord session when uninventorying lots of Resources
Product: [Other] RHQ Project
Reporter: Ian Springer <ian.springer>
Component: Core Server
Assignee: Ian Springer <ian.springer>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: urgent
Priority: medium
Version: 4.1
CC: ccrouch, hrupp
Doc Type: Bug Fix
Bug Blocks: 678340    

Description Ian Springer 2011-09-07 17:15:29 UTC
I uninventoried a platform with around 1500 descendant Resources. I then observed a slew of "PermissionException: The session ID for user [admin] is invalid!" warnings in the Server log (full stack trace below). It looks like the issue is that AsyncResourceDeleteJob looks up the overlord once during its instantiation and then tries to use that same overlord instance for the entire job execution. The overlord instance is only usable for 2 minutes, at which point its session expires (see SessionManager.OVERLORD_TIMEOUT). And uninventorying a large number of Resources can easily take more than 2 minutes. We should change the code to look up a fresh overlord instance from the session manager for each Resource being uninventoried.
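
The failure mode and the proposed change can be illustrated with a small stand-alone simulation. This is only a sketch, not the RHQ code: FakeSessionManager, its simulated clock, the integer session IDs, and the 200 ms per-Resource cost are all hypothetical stand-ins chosen for illustration. The only values taken from this report are the 2 minute overlord timeout and the count of roughly 1500 Resources.

```java
import java.util.HashMap;
import java.util.Map;

public class OverlordSessionSketch {

    /** Hypothetical stand-in for the server's session manager, driven by a simulated clock. */
    static class FakeSessionManager {
        // 2 minutes, matching the overlord timeout mentioned in the description.
        static final long OVERLORD_TIMEOUT_MS = 2 * 60 * 1000;

        private final Map<Integer, Long> expiryBySessionId = new HashMap<>();
        private int nextSessionId = 1;
        long nowMs = 0; // simulated time, advanced by the caller

        /** Creates a new overlord session that expires OVERLORD_TIMEOUT_MS from "now". */
        int getOverlord() {
            int sessionId = nextSessionId++;
            expiryBySessionId.put(sessionId, nowMs + OVERLORD_TIMEOUT_MS);
            return sessionId;
        }

        boolean isValid(int sessionId) {
            Long expiry = expiryBySessionId.get(sessionId);
            return expiry != null && nowMs < expiry;
        }
    }

    /**
     * Simulates uninventorying resourceCount Resources and returns how many
     * attempts were made with an expired overlord session.
     */
    static int countFailures(FakeSessionManager sessions, int resourceCount,
                             long costPerResourceMs, boolean freshPerResource) {
        int cachedOverlord = sessions.getOverlord(); // looked up once, up front
        int failures = 0;
        for (int i = 0; i < resourceCount; i++) {
            // The proposed change: ask for a fresh overlord for each Resource.
            int overlord = freshPerResource ? sessions.getOverlord() : cachedOverlord;
            if (!sessions.isValid(overlord)) {
                failures++; // the real server would reject the call here
            }
            sessions.nowMs += costPerResourceMs; // time spent on this Resource
        }
        return failures;
    }

    public static void main(String[] args) {
        // Assumed workload: 1500 Resources at 200 ms each (5 minutes in total).
        System.out.println("cached overlord failures: "
            + countFailures(new FakeSessionManager(), 1500, 200, false));
        System.out.println("fresh overlord failures:  "
            + countFailures(new FakeSessionManager(), 1500, 200, true));
    }
}
```

Under these assumed numbers, the cached overlord stops being valid after the first 600 Resources (600 x 200 ms = 2 minutes), so the remaining 900 attempts fail, while the per-Resource lookup never uses an expired session.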

---

12:53:42,468 WARN  [AsyncResourceDeleteJob] Failed to get jobs for a resource being deleted [11073]; will not attempt to unschedule anything
org.rhq.enterprise.server.authz.PermissionException: The session ID for user [admin] is invalid!: invocation: method=public java.util.List<org.rhq.core.domain.operation.bean.ResourceOperationSchedule> org.rhq.enterprise.server.operation.OperationManagerBean.findScheduledResourceOperations(org.rhq.core.domain.auth.Subject,int) throws java.lang.Exception,context-data={}
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.buildPermissionException(RequiredPermissionsInterceptor.java:164)
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.buildPermissionException(RequiredPermissionsInterceptor.java:160)
	at org.rhq.enterprise.server.authz.RequiredPermissionsInterceptor.checkRequiredPermissions(RequiredPermissionsInterceptor.java:100)
	at sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.jboss.ejb3.interceptor.InvocationContextImpl.proceed(InvocationContextImpl.java:118)
	at org.jboss.ejb3.interceptor.EJB3InterceptorsInterceptor.invoke(EJB3InterceptorsInterceptor.java:63)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.entity.TransactionScopedEntityManagerInterceptor.invoke(TransactionScopedEntityManagerInterceptor.java:54)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.AllowedOperationsInterceptor.invoke(AllowedOperationsInterceptor.java:47)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.tx.TxPolicy.invokeInOurTx(TxPolicy.java:79)
	at org.jboss.aspects.tx.TxInterceptor$Required.invoke(TxInterceptor.java:191)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.tx.TxPropagationInterceptor.invoke(TxPropagationInterceptor.java:95)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.stateless.StatelessInstanceInterceptor.invoke(StatelessInstanceInterceptor.java:62)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.aspects.security.AuthenticationInterceptor.invoke(AuthenticationInterceptor.java:77)
	at org.jboss.ejb3.security.Ejb3AuthenticationInterceptor.invoke(Ejb3AuthenticationInterceptor.java:110)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.ENCPropagationInterceptor.invoke(ENCPropagationInterceptor.java:46)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.asynchronous.AsynchronousInterceptor.invoke(AsynchronousInterceptor.java:106)
	at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:101)
	at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:240)
	at org.jboss.ejb3.stateless.StatelessContainer.localInvoke(StatelessContainer.java:210)
	at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:84)
	at $Proxy431.findScheduledResourceOperations(Unknown Source)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.unscheduleJobs(AsyncResourceDeleteJob.java:110)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.uninventoryResource(AsyncResourceDeleteJob.java:95)
	at org.rhq.enterprise.server.scheduler.jobs.AsyncResourceDeleteJob.executeJobCode(AsyncResourceDeleteJob.java:68)
	at org.rhq.enterprise.server.scheduler.jobs.AbstractStatefulJob.execute(AbstractStatefulJob.java:48)
	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
	at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525)

Comment 1 Ian Springer 2011-09-07 17:24:51 UTC
Note that besides passing the potentially expired overlord to unscheduleJobs(), AsyncResourceDeleteJob also passes it to the following SLSB call:

  resourceManager.uninventoryResourceAsyncWork(overlord, doomedResourceId);

which would cause the actual uninventorying to fail.

Comment 2 Ian Springer 2011-09-08 15:10:20 UTC
[master be4d0cd] fixes this as described above.

Comment 3 Mike Foley 2011-09-09 15:15:02 UTC
Verified by performing the basic inventory/uninventory/inventory use case.

Comment 4 Mike Foley 2012-02-07 19:31:39 UTC
Marking VERIFIED BZs as CLOSED/CURRENTRELEASE.

Comment 5 Mike Foley 2012-02-07 19:31:40 UTC
Changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE.