Bug 591501

Summary:	NPE's when shutting down EWP 5.0.1 CR2
Product:	[Other] RHQ Project	Reporter:	Ondřej Žižka <ozizka>
Component:	Plugin Container	Assignee:	RHQ Project Maintainer <rhq-maint>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Mike Foley <mfoley>
Severity:	medium	Docs Contact:
Priority:	unspecified
Version:	3.0.0	CC:	ccrouch, jshaughn, loleary
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	All
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-05-29 18:03:02 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	725852, 577052

Description Ondřej Žižka 2010-05-12 12:27:39 UTC

14:21:58,822 INFO  [ServerImpl] Runtime shutdown hook called, forceHalt: true
14:21:59,285 INFO  [Http11Protocol] Pausing Coyote HTTP/1.1 on http-0.0.0.0-8080
14:21:59,287 INFO  [Http11Protocol] Stopping Coyote HTTP/1.1 on http-0.0.0.0-8080
14:22:00,079 INFO  [AjpProtocol] Pausing Coyote AJP/1.3 on ajp-0.0.0.0-8009
14:22:00,084 INFO  [AjpProtocol] Stopping Coyote AJP/1.3 on ajp-0.0.0.0-8009
14:22:00,538 INFO  [TomcatDeployment] undeploy, ctxPath=/jbossws
14:22:00,604 INFO  [TomcatDeployment] undeploy, ctxPath=/admin-console
14:22:00,659 INFO  [BootstrapAction] Shutting down Administration Console...
14:22:00,766 WARN  [InventoryManager] Failure during discovery for [Service Binding Manager] Resources - failed after 525 ms.
java.lang.NullPointerException
        at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ResourceDiscoveryComponentInvocationHandler.invokeInNewThread(DiscoveryComponentProxyFactory.java:204)
        at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ResourceDiscoveryComponentInvocationHandler.invoke(DiscoveryComponentProxyFactory.java:193)
        at $Proxy142.discoverResources(Unknown Source)
        at org.rhq.core.pc.inventory.InventoryManager.invokeDiscoveryComponent(InventoryManager.java:279)
        at org.rhq.core.pc.inventory.InventoryManager.executeComponentDiscovery(InventoryManager.java:1855)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.discoverForResource(RuntimeDiscoveryExecutor.java:239)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.runtimeDiscover(RuntimeDiscoveryExecutor.java:141)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:98)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:55)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
14:22:00,882 WARN  [PluginClassLoader] Failed to purge embedded jars directory. Cause: java.io.IOException: Failed to delete file or directory: /home/brq/ozizka/work/JON/jboss-ewp-5.0/server/default/tmp/embjopr/jopr-jboss-as-5-plugin-2.3.0.EmbJopr.1.3.0-4.jar3341543028101815950.classloader
14:22:00,997 WARN  [PluginClassLoader] Failed to purge embedded jars directory. Cause: java.io.IOException: Failed to delete file or directory: /home/brq/ozizka/work/JON/jboss-ewp-5.0/server/default/tmp/embjopr/rhq-jmx-plugin-1.3.0.EmbJopr.1.3.0-4.jar4976930695679864498.classloader
14:22:01,003 WARN  [PluginClassLoader] Failed to purge embedded jars directory. Cause: java.io.IOException: Failed to delete file or directory: /home/brq/ozizka/work/JON/jboss-ewp-5.0/server/default/tmp/embjopr/jopr-jboss-as-5-plugin-2.3.0.EmbJopr.1.3.0-4.jar3341543028101815950.classloader/.nfs0000000000c553d8000011d8
14:22:01,040 WARN  [PluginContainer] Failed to purge contents of temporary directory - cause: java.io.IOException: Failed to delete file or directory: /home/brq/ozizka/work/JON/jboss-ewp-5.0/server/default/tmp/embjopr/rhq-jmx-plugin-1.3.0.EmbJopr.1.3.0-4.jar4976930695679864498.classloader/.nfs0000000000c553d5000011df
14:22:01,045 ERROR [RuntimeDiscoveryExecutor] Error in runtime discovery
java.lang.NullPointerException
        at org.rhq.core.pc.inventory.InventoryManager.removeResourceAndIndicateIfScanIsNeeded(InventoryManager.java:952)
        at org.rhq.core.pc.inventory.InventoryManager.removeResourceAndIndicateIfScanIsNeeded(InventoryManager.java:944)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.removeStaleResources(RuntimeDiscoveryExecutor.java:276)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.discoverForResource(RuntimeDiscoveryExecutor.java:257)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.runtimeDiscover(RuntimeDiscoveryExecutor.java:141)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:98)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:55)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
14:22:01,064 ERROR [RuntimeDiscoveryExecutor] Error in runtime discovery
java.lang.NullPointerException
        at org.rhq.core.pc.plugin.PluginComponentFactory.getDiscoveryComponent(PluginComponentFactory.java:81)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.discoverForResource(RuntimeDiscoveryExecutor.java:222)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.runtimeDiscover(RuntimeDiscoveryExecutor.java:141)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:98)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:55)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
14:22:01,065 INFO  [TomcatDeployment] undeploy, ctxPath=/
14:22:01,065 ERROR [RuntimeDiscoveryExecutor] Error in runtime discovery
java.lang.NullPointerException
        at org.rhq.core.pc.plugin.PluginComponentFactory.getDiscoveryComponent(PluginComponentFactory.java:81)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.discoverForResource(RuntimeDiscoveryExecutor.java:222)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.runtimeDiscover(RuntimeDiscoveryExecutor.java:141)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:98)
        at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:55)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
14:22:01,116 INFO  [TomcatDeployment] undeploy, ctxPath=/jmx-console
14:22:01,249 INFO  [TomcatDeployment] undeploy, ctxPath=/EarWithWar-Counter-war
14:22:01,252 INFO  [TomcatDeployment] undeploy, ctxPath=/test
14:22:01,252 WARN  [TomcatDeployment] Failed to map vhost: snert.home.pilhuhn.de
14:22:01,252 WARN  [TomcatDeployment] Failed to map vhost: snert.home.bsd.de
14:22:01,257 INFO  [StandardService] Stopping service jboss.web
14:22:01,609 INFO  [MailService] Mail service 'java:/Mail' removed from JNDI
14:22:01,649 INFO  [ConnectionFactoryBindingService] Unbound ConnectionManager 'jboss.jca:service=DataSourceBinding,name=DefaultDS' from JNDI name 'java:DefaultDS'
14:22:02,148 INFO  [TransactionManagerService] Stopping transaction recovery manager
14:22:02,149 INFO  [TransactionManagerService] Destroying TransactionManagerService
14:22:02,482 INFO  [HypersonicDatabase] Database standalone closed clean
14:22:05,276 INFO  [ServerImpl] Shutdown complete
Shutdown complete
Halting VM

Comment 1 Charles Crouch 2010-05-14 15:48:11 UTC

So this is related to the Embedded Console, not JON correct?

Comment 2 Charles Crouch 2010-05-14 15:59:25 UTC

Ozizka, how repeatable are these errors? Do they also occur in EAP5?

There appear to be several separate issues:

14:22:00,766 WARN  [InventoryManager] Failure during discovery for [Service
Binding Manager] Resources - failed after 525 ms.
java.lang.NullPointerException        at
org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ResourceDiscoveryComponentInvocationHandler.invokeInNewThread(DiscoveryComponentProxyFactory.java:204)
 

14:22:01,045 ERROR [RuntimeDiscoveryExecutor] Error in runtime discovery
java.lang.NullPointerException        at
org.rhq.core.pc.inventory.InventoryManager.removeResourceAndIndicateIfScanIsNeeded(InventoryManager.java:952)

then multiple instances of 

14:22:01,064 ERROR [RuntimeDiscoveryExecutor] Error in runtime discovery
java.lang.NullPointerException        at
org.rhq.core.pc.plugin.PluginComponentFactory.getDiscoveryComponent(PluginComponentFactory.java:81)

It may not be feasible to fix these because they appear due to the discovery process being interrupted by a shutdown of the app server, but we can investigate.

Comment 3 Ian Springer 2010-07-15 12:56:03 UTC

Charles, correct, these errors have only been seen in the admin console (https://jira.jboss.org/browse/EMBJOPR-301), not RHQ enterprise.

Comment 4 Ian Springer 2010-07-19 16:49:50 UTC

I have not been able to reproduce this with an EAP 5.1 snapshot. But from looking at Ondra's log output, here is what I suspect is happening:

1) user hits Ctrl-C to initiate shutdown of the app server
2) app server invokes BootstrapAction.destroy() to shutdown admin console
3) admin console calls PluginContainer.shutdown() to shutdown PC
4) 
  a) before the PC has had a chance to call InventoryManager.shutdown(), the discovery executor kicks off a runtime discovery scan 
  -or-
  b) a runtime scan that was started prior to 3) could still be in progress
5) InventoryManager.shutdown() shuts down, nulling out all its ResourceContainers and other fields
6) the runtime scan encounters NPEs when it tries to access the now-nulled-out fields from the InventoryManager

To greatly minimize the chances of such NPEs, we could do the following:

1) as the very first thing it does, PluginContainer.shutdown() could flip an internal stopping flag to true
2) it could expose this flag via a public isStopping() method
3) PC jobs such as avail scans and discovery scans could check whether the PC is stopping, and if so, gracefully abort; they could also check if their thread has been interrupted, and in that case also gracefully abort
4) various PC managers could also make sure not to service any requests from the Server if the PC is in the stopping state

Comment 5 Corey Welton 2010-09-24 12:28:28 UTC

 ips - how important is this to fix, or can this be closed?

Comment 6 Ian Springer 2010-09-24 23:00:33 UTC

It's not critical, but I'd like to see it eventually get done, so I'd prefer to keep it open.

Comment 7 Ondřej Žižka 2011-07-19 19:23:51 UTC

Any progress?