Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 895689

Summary: cache maintenance thread in wallabyoperations sometimes hangs in calls to wallaby client or QMF api preventing thread shutdown
Product: Red Hat Enterprise MRG Reporter: Trevor McKay <tmckay>
Component: cumin Assignee: grid-maint-list <grid-maint-list>
Status: CLOSED WONTFIX QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 2.2 CC: matt, sgraf
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-05-26 20:23:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Trevor McKay 2013-01-15 19:17:51 UTC
Description of problem:

The cache maintenance thread sometimes does not seem to return from calls to the wallaby API, or returns very slowly.  If cumin is shutting down during one of these calls, the shutdown thread can time out.

Two cases that have been seen:

A call to get data takes a long time (10-15 seconds or more), which is longer than the 5-second allowed shutdown time.

The call to delBroker to clean up the broker object for wallaby operations does not return and does not raise an exception (or maybe it just takes a really, really long time to return).
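
The interaction between a blocking call and the shutdown timeout can be illustrated with a small stand-alone sketch. This is not cumin's code: `slow_call`, `CacheMaintenanceThread`, and `shutdown` are hypothetical names, the blocking wallaby/QMF call is simulated with `time.sleep`, and the sketch assumes the shutdown path joins the worker with a fixed timeout (written for Python 3).

```python
import threading
import time


def slow_call(duration):
    """Hypothetical stand-in for a wallaby/QMF call that blocks for `duration` seconds."""
    time.sleep(duration)


class CacheMaintenanceThread(threading.Thread):
    """Hypothetical worker that repeatedly refreshes a cache via a blocking call."""

    def __init__(self, call_duration):
        # daemon=True so a stuck worker cannot keep the interpreter alive
        super().__init__(daemon=True)
        self.call_duration = call_duration
        self.stop_requested = threading.Event()

    def run(self):
        # The stop flag is only checked between calls, so a call that is
        # already in progress delays the exit by its remaining duration.
        while not self.stop_requested.is_set():
            slow_call(self.call_duration)


def shutdown(worker, allowed):
    """Ask the worker to stop and wait at most `allowed` seconds.

    Returns True if the worker exited in time, False if the wait timed out.
    """
    worker.stop_requested.set()
    worker.join(allowed)
    return not worker.is_alive()
```

With a call duration of 10-15 seconds and an allowed shutdown time of 5 seconds, `shutdown` in this sketch returns False whenever the stop request arrives early in a call, which matches the first case above. A call that never returns, as described for delBroker, makes it return False every time.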

Version-Release number of selected component (if applicable):

anything 2.2+ I believe

How reproducible:

25-50% ?

Steps to Reproduce:
1. Set up cumin and wallaby on a broker
2. Populate wallaby with tags and features
3. Start Cumin
4. Stop Cumin
5. Look for messages in web.log that look like this

5004 2013-01-15 17:45:48,845 DEBUG WallabyOperations: waiting for cache maintenance thread to exit

<there may be other stuff in here>

5004 2013-01-15 17:45:53,845 INFO Shutdown thread timed out, exiting

Trying to time the shutdown may help reproduce.  If 'service cumin stop' is issued immediately after a log message that says "WallabyOperations: refreshing XXX", it might increase reproducibility.  tail -f /var/log/cumin-web | grep works nicely.
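
Checking a log for the symptom can also be scripted. The following is a minimal sketch, assuming every log line has the layout of the two sample lines above (a leading number, assumed here to be a process id, then a timestamp, a level, and the message). `find_shutdown_timeouts` is a hypothetical helper, not part of cumin.

```python
import re

# Layout inferred from the sample lines: "<number> <date> <time>,<ms> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^(?P<proc>\d+) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<msg>.*)$"
)

WAITING = "WallabyOperations: waiting for cache maintenance thread to exit"
TIMED_OUT = "Shutdown thread timed out, exiting"


def find_shutdown_timeouts(lines):
    """Return (proc, waiting_ts, timeout_ts) for each wait that ended in a timeout."""
    waiting_since = {}  # leading number -> timestamp of the "waiting" message
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        proc, ts, msg = match.group("proc"), match.group("ts"), match.group("msg")
        if msg == WAITING:
            waiting_since[proc] = ts
        elif msg == TIMED_OUT and proc in waiting_since:
            hits.append((proc, waiting_since.pop(proc), ts))
    return hits
```

Passing an open log file to `find_shutdown_timeouts` yields one tuple per shutdown in which the "waiting" message was followed by the timeout message for the same leading number. Unrelated lines in between are ignored.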
  
Actual results:

The shutdown thread sometimes times out

Expected results:

Ideally, the shutdown thread should not time out.

Additional info:

Comment 1 Trevor McKay 2013-01-18 20:51:39 UTC
Note, this has now been seen on both el5 and el6

Comment 3 Anne-Louise Tangring 2016-05-26 20:23:00 UTC
MRG-Grid is in maintenance and only customer escalations will be considered. This issue can be reopened if a customer escalation associated with it occurs.