Bug 1149622
| Summary: | Description of the discovered resources missing in some situations | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [JBoss] JBoss Operations Network | Reporter: | bkramer <bkramer> | ||||
| Component: | Plugin Container | Assignee: | Michael Burman <miburman> | ||||
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Mike Foley <mfoley> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | JON 3.2.3 | CC: | bkramer, fbrychta, jshaughn, loleary, myarboro, theute | ||||
| Target Milestone: | --- | ||||||
| Target Release: | JON 3.3.6 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-05-05 14:31:58 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1149625 | ||||||
| Attachments: |
|
||||||
Cannot reproduce on JON 3.3.0 with agent on RHEL 5.8 2.6.18-308.el5.

I checked sigar and it has not been updated since 3.2.0. I also checked the code and was not able to find any connection between the exceptions in agent.log (which indicate that your disk may have been very busy and could not calculate free space within 30s) and the empty descriptions of discovered resources.

Could you please try to reproduce on 3.3.0?

Moving into CR01 target milestone as missed ER01 cutoff.

@bkramer were you able to reproduce this using JBoss ON 3.3?

@Libor, do you have any thoughts on why the default description is not getting applied to a newly discovered platform and its child RHQ Agent resource? Even if we are not able to reproduce this issue, it should be possible to review the code to see in what circumstances the description could go missing.

To be clear, the issue here is that the resource description is not getting set to the default specified by the resource type definition. It may be possible that other attribute values also go missing, but the focus here is the observed bug where descriptions sometimes go missing.

The severity of the issue is probably on the low or medium side but was increased to high due to the other errors surrounding the circumstances in which the issue occurred, primarily the discovery thread interruptions and the blacklisting of valid resources.

(In reply to Libor Zoubek from comment #4)
> Cannot reproduce on JON 3.3.0 with agent on RHEL 5.8 2.6.18-308.el5.
>
> I checked sigar .. and it has not been updated since 3.2.0, I also checked
> the code and I was not able to find any coincidence between exceptions in
> agent.log - which denote your disk may have been very busy and was not able
> to calculate free space within 30s and empty descriptions of discovered
> resources.
>
> Could you please try to reproduce on 3.3.0?

Sorry for the late reply! I saw the above error only twice while working on 3.2.0. So far, I have not noticed the same in 3.3.0.
Does anyone know whether, when the description is missing, it is actually set in the database on the NEW resource? Wondering if this is maybe some sort of display issue only.

From the original case, we do not know.

@bkramer, do you by chance still have this reproduced somewhere so that you can check? The question is whether this is a UI display issue or whether the actual row in the RHQ_RESOURCE table is also missing the description.

(In reply to Larry O'Leary from comment #17)
> From the original case, we do not know.
>
> @bkramer, do you by chance still have this reproduced somewhere and can
> check. The question is whether this is a UI display issue or if the actual
> row in the RHQ_RESOURCE table is too missing the description.

Larry, I don't have that environment any more. I can try to reproduce it again, but since I logged this Bugzilla I have never noticed the same issue. |
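One way to answer the database-versus-display question raised in this thread would be to query the inventory table directly. The following is only a sketch: the table name RHQ_RESOURCE comes from the comments above, but the column names (ID, NAME, DESCRIPTION, INVENTORY_STATUS), the 'NEW' status literal, and the class name `MissingDescriptionCheck` are assumptions that would need to be checked against the actual schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch only. The table name RHQ_RESOURCE comes from the thread; the column
// names and the 'NEW' status literal are assumptions to verify against the schema.
class MissingDescriptionCheck {

    static final String QUERY =
            "SELECT ID, NAME FROM RHQ_RESOURCE "
          + "WHERE INVENTORY_STATUS = ? "
          + "AND (DESCRIPTION IS NULL OR DESCRIPTION = '')";

    // Returns "id: name" for each resource in the given status with no stored description.
    static List<String> findMissing(Connection connection, String inventoryStatus) throws SQLException {
        List<String> result = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(QUERY)) {
            statement.setString(1, inventoryStatus);
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    result.add(rows.getInt("ID") + ": " + rows.getString("NAME"));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            // Without connection details, just show the query that would be run.
            System.out.println("usage: MissingDescriptionCheck <jdbc-url> <user> <password>");
            System.out.println(QUERY);
            return;
        }
        try (Connection connection = DriverManager.getConnection(args[0], args[1], args[2])) {
            findMissing(connection, "NEW").forEach(System.out::println);
        }
    }
}
```

If such a query returned the affected platform or RHQ Agent rows, the description would be missing in the database itself; an empty result while the UI still shows no description would point towards a display issue. Running it requires the JDBC driver for the server's database on the classpath and connection details that are not part of this report.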
Created attachment 944192 [details]
screen shot showing "missing description" issue

Description of problem:
In some cases, the description field of a newly discovered resource is missing.

Version-Release number of selected component (if applicable):
JBoss ON 3.2.3

How reproducible:
I have managed to reproduce it twice and another user saw the same behaviour quite a few times.

Steps to Reproduce:
1. My box is running on Linux 2.6.35.14-106.fc14.x86_64 #1 SMP Wed Nov 23 13:07:52 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux and the df -T command gives:
********************************
Filesystem                     Type  1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_bkramer-lv_root ext4   51606140  27049268  21935432  56% /
tmpfs                          tmpfs   3993744       612   3993132   1% /dev/shm
/dev/sda1                      ext4     495844     49585    420659  11% /boot
/dev/mapper/vg_bkramer-lv_home ext4  422717952 269655532 131589552  68% /home
*******************************
The other boxes where the same behaviour has been seen are:
*******************************
1. Linux 2.6.18-274.el5 #1 SMP Fri Jul 8 17:36:59 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
2. Linux 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
*******************************
2. Installed JBoss ON 3.2.3 Server and Agent;
3. Navigate to JBoss ON UI and Discovery Queue;

Actual results:
Platform resource and RHQ Agent resource have been discovered and are in the Discovery Queue, but the Description is missing for both or for only one of the resources.
At the time when above happened, the following has been logged in the agent.log file:
********************************
2014-09-24 14:44:43,944 WARN [InventoryManager.discovery-1] (rhq.core.pc.util.DiscoveryComponentProxyFactory)- The discovery component for resource type [ResourceType[id=0, name=File System, plugin=Platforms, category=Service]] has been blacklisted
2014-09-24 14:44:43,946 WARN [InventoryManager.discovery-1] (rhq.core.pc.inventory.InventoryManager)- Discovery for Resources of [ResourceType[id=0, name=File System, plugin=Platforms, category=Service]] has been running for more than 300000 milliseconds. This may be a plugin bug.
org.rhq.core.pc.inventory.TimeoutException: Call to [org.rhq.plugins.platform.FileSystemDiscoveryComponent.discoverResources()] with args [[org.rhq.core.pluginapi.inventory.ResourceDiscoveryContext@712dd44f]] timed out. Invocation thread will be interrupted.
    at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ResourceDiscoveryComponentInvocationHandler.invokeInNewThread(DiscoveryComponentProxyFactory.java:256)
    at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ResourceDiscoveryComponentInvocationHandler.invoke(DiscoveryComponentProxyFactory.java:218)
    at com.sun.proxy.$Proxy39.discoverResources(Unknown Source)
    at org.rhq.core.pc.inventory.InventoryManager.invokeDiscoveryComponent(InventoryManager.java:348)
    at org.rhq.core.pc.inventory.InventoryManager.executeComponentDiscovery(InventoryManager.java:2682)
    at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.discoverForResource(RuntimeDiscoveryExecutor.java:280)
    at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.runtimeDiscover(RuntimeDiscoveryExecutor.java:142)
    at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:105)
    at org.rhq.core.pc.inventory.RuntimeDiscoveryExecutor.call(RuntimeDiscoveryExecutor.java:61)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.Exception: Thread[ResourceDiscoveryComponent.invoker.daemon-1,5,main] with id [22] is hung. This exception contains its stack trace.
    at org.hyperic.sigar.FileSystemUsage.gather(Native Method)
    at org.hyperic.sigar.FileSystemUsage.fetch(FileSystemUsage.java:30)
    at org.hyperic.sigar.Sigar.getMountedFileSystemUsage(Sigar.java:712)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.rhq.core.system.SigarAccessHandler.invoke(SigarAccessHandler.java:128)
    at com.sun.proxy.$Proxy38.getMountedFileSystemUsage(Unknown Source)
    at org.rhq.core.system.FileSystemInfo.refresh(FileSystemInfo.java:60)
    at org.rhq.core.system.FileSystemInfo.<init>(FileSystemInfo.java:43)
    at org.rhq.core.system.NativeSystemInfo.getFileSystems(NativeSystemInfo.java:324)
    at org.rhq.plugins.platform.FileSystemDiscoveryComponent.discoverResources(FileSystemDiscoveryComponent.java:62)
    at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ComponentInvocationThread.call(DiscoveryComponentProxyFactory.java:305)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    ... 3 more
2014-09-24 14:44:43,951 INFO [InventoryManager.discovery-1] (rhq.core.pc.inventory.InventoryManager)- Discovery for [File System] resources took [300008] ms
********************************
On the second installation that showed the same problem the following has been logged:
********************************
2014-09-24 13:22:07,600 DEBUG [InventoryManager.availability-1] (rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler)- Call to [org.rhq.plugins.platform.FileSystemComponent.getAvailability()] with args [] timed out after 5000 milliseconds - invocation thread will be interrupted.
2014-09-24 13:22:07,600 WARN [InventoryManager.availability-1] (rhq.core.pc.inventory.ResourceContainer$ComponentInvocation)- Invocation has been marked interrupted for method [public abstract org.rhq.core.domain.measurement.AvailabilityType org.rhq.core.pluginapi.availability.AvailabilityFacet.getAvailability()] on resource [Resource[id=10017, uuid=0f6381e0-4e4c-4b69-8000-551048c35c51, type={Platforms}File System, key=/home, name=/home, parent=bkramerlt.usersys.redhat.com]]
2014-09-24 13:22:07,600 ERROR [ResourceContainer.invoker.daemon-4] (org.rhq.core.system.FileSystemInfo)- An error occurred while refreshing the usage data for file system mounted at [/home].
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy38.getMountedFileSystemUsage(Unknown Source)
    at org.rhq.core.system.FileSystemInfo.refresh(FileSystemInfo.java:60)
    at org.rhq.core.system.FileSystemInfo.<init>(FileSystemInfo.java:43)
    at org.rhq.core.system.NativeSystemInfo.getFileSystem(NativeSystemInfo.java:341)
    at org.rhq.plugins.platform.FileSystemComponent.getFileSystemInfo(FileSystemComponent.java:95)
    at org.rhq.plugins.platform.FileSystemComponent.getAvailability(FileSystemComponent.java:61)
    at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocation.call(ResourceContainer.java:654)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1245)
    at java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:445)
    at org.rhq.core.system.SigarAccessHandler.invoke(SigarAccessHandler.java:119)
    ... 15 more
2014-09-24 13:22:07,600 DEBUG [InventoryManager.availability-1] (rhq.core.pc.inventory.ResourceContainer$ResourceComponentInvocationHandler)- Facet lock status for [Resource[id=10017, uuid=0f6381e0-4e4c-4b69-8000-551048c35c51, type={Platforms}File System, key=/home, name=/home, parent=bkramerlt.usersys.redhat.com]], is-write-locked=[false], is-write-locked-by-current-thread=[false], read-locks=[0], waiting-for-lock-queue-size=[0]
2014-09-24 13:22:07,603 WARN [InventoryManager.availability-1] (rhq.core.pc.inventory.AvailabilityExecutor)- Availability collection timed out on Resource[id=10017, uuid=0f6381e0-4e4c-4b69-8000-551048c35c51, type={Platforms}File System, key=/home, name=/home, parent=bkramerlt.usersys.redhat.com], availability will be reported as DOWN
********************************
See also attached screen shot.

Expected results:
No exception is thrown and everything is properly discovered.

Additional info:
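
For readers unfamiliar with the exception chain in the second log excerpt: it reports an UndeclaredThrowableException from the $Proxy38.getMountedFileSystemUsage call, caused by an InterruptedException raised in ReentrantLock.tryLock under SigarAccessHandler.invoke, right after the availability call timed out and its invocation thread was interrupted. The sketch below reproduces that general Java mechanism in isolation. All type and method names in it (`UsageReader`, `InterruptedProxyDemo`, `runTimedCall`) are hypothetical stand-ins, not RHQ or SIGAR code, and the lock holder is simulated by the calling thread.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for illustration only; not an RHQ or SIGAR type.
interface UsageReader {
    long usedKb(String mountPoint); // declares no checked exceptions
}

class InterruptedProxyDemo {

    // Runs a lock-guarded proxy call with a timeout and returns what the worker thread saw.
    static Throwable runTimedCall(long timeoutMillis) throws Exception {
        ReentrantLock lock = new ReentrantLock();
        UsageReader real = mountPoint -> 42L;

        // Serializes access to the real reader, waiting a bounded time for the lock.
        InvocationHandler handler = (proxy, method, args) -> {
            if (!lock.tryLock(60, TimeUnit.SECONDS)) { // throws InterruptedException if interrupted
                throw new IllegalStateException("lock not acquired");
            }
            try {
                return real.usedKb((String) args[0]);
            } finally {
                lock.unlock();
            }
        };
        UsageReader guarded = (UsageReader) Proxy.newProxyInstance(
                UsageReader.class.getClassLoader(), new Class<?>[] {UsageReader.class}, handler);

        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();

        lock.lock(); // simulate another caller that is stuck while holding the lock
        try {
            Future<?> future = executor.submit(() -> {
                try {
                    guarded.usedKb("/home");
                } catch (Throwable t) {
                    seen.set(t); // record the failure observed by the worker
                } finally {
                    done.countDown();
                }
            });
            try {
                future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupts the worker blocked in tryLock
            }
            done.await(5, TimeUnit.SECONDS);
        } finally {
            lock.unlock();
            executor.shutdownNow();
        }
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        Throwable t = runTimedCall(200);
        System.out.println(t.getClass().getName() + " caused by " + t.getCause().getClass().getName());
    }
}
```

Because `usedKb` declares no checked exceptions, the proxy has to wrap the InterruptedException from the handler in an UndeclaredThrowableException, which matches the shape of the second trace. An interrupt is only a request, however: a thread blocked inside a native method, such as the FileSystemUsage.gather frame in the first trace, may not react to it at all.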