Bug 752550 - agent not cleaning up old downloaded bundle version files
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Agent
Version: 3.0.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: John Mazzitelli
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: jon30-sprint10, rhq43-sprint10
Reported: 2011-11-09 20:15 UTC by wertnick
Modified: 2012-02-07 19:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-07 19:29:46 UTC
Embargoed:



Description wertnick 2011-11-09 20:15:03 UTC
Description of problem:

We have a job that uses JON to deploy a bundle every 20 minutes. After a day or so, the bundle deployment eventually fails with the following error. I replaced the changing filename with <somefile>. The error goes away after resetting the agent configuration (rhq-agent.sh -l).

java.io.FileNotFoundException: data/tmp/bundle-versions/12322/<somefile> (No such file or directory)
         at java.io.FileOutputStream.open(Native Method)
         at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
         at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
         at org.rhq.core.pc.bundle.BundleManager.downloadBundleFiles(BundleManager.java:267)
         at org.rhq.core.pc.bundle.BundleManager.access$000(BundleManager.java:75)
         at org.rhq.core.pc.bundle.BundleManager$1.run(BundleManager.java:162)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)

Version-Release number of selected component (if applicable):


How reproducible:

This seems to occur pretty consistently, but it takes quite a few deploys before it happens.

Steps to Reproduce:
1. Start deploying a bundle to an instance repeatedly for a day or two.
2. Wait for a failed deploy.
3. Check the reason: FileNotFoundException.
  
Actual results:

A failed deploy.

Expected results:

A working deploy, like the other 50 or so times I deployed.

Additional info:

None.

Comment 1 wertnick 2011-11-10 16:06:44 UTC
I think I've found the real root cause of the problem. Your agents store downloaded bundle files in data/tmp, but data/tmp is never cleared, so the mount it is stored on eventually fills up. Please advise. I can see two options. Maybe you have a third.

1: Periodically restart the agent with --purgedata.
2: Some sort of JON call from the server that will purge the data (I don't see this command anywhere).
3: Something else?

Comment 2 John Mazzitelli 2011-12-05 15:16:27 UTC
yes, we'll have to implement some type of cleanup. the bundle files in data/tmp (IIRC) are not used once the bundle is fully deployed, so I think they can be purged.

Comment 3 John Mazzitelli 2011-12-06 19:25:03 UTC
master commit: 16823b9934ea91e81e0b7102bd464a57875fe033

the agent will now purge all old downloaded files from the tmp directory

to test:

1) start an agent
2) upload a bundle to the server, and prepare to deploy the bundle to the agent machine
3) before you deploy the bundle, create a temporary directory/file in the agent's data/tmp/bundle-versions directory:
   $ mkdir -p $RHQ_AGENT_HOME/data/tmp/bundle-versions/blah
   $ echo abc > $RHQ_AGENT_HOME/data/tmp/bundle-versions/blah/hello.txt
4) deploy the bundle

after the bundle is done deploying, you should see the "blah" directory you just created get deleted.  You will see a new directory created under bundle-versions (with a name like "10001"). This is fine - this is the latest bundle that was downloaded. We leave this be mainly for debugging purposes (in case something goes wrong, you can go in here and see what files actually got pulled down). You should only ever have one directory under bundle-versions, no matter how many bundles you've deployed in the past.
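
For readers who want a picture of the behavior described above, here is a minimal sketch of a "purge everything except the latest download" step. It is not the code from the master commit; the class name BundleTmpPurger, the method purgeAllExcept, and the idea of passing in the name of the directory to keep are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of RHQ: illustrates purging stale entries
// from a directory such as data/tmp/bundle-versions.
public class BundleTmpPurger {

    /**
     * Deletes every entry directly under bundleVersionsDir except the one
     * named keepName, and returns how many top-level entries were removed.
     */
    public static int purgeAllExcept(Path bundleVersionsDir, String keepName) throws IOException {
        if (!Files.isDirectory(bundleVersionsDir)) {
            return 0; // nothing has been downloaded yet
        }
        // Collect first, then delete, so the directory stream is closed
        // before anything is removed.
        List<Path> stale = new ArrayList<>();
        try (Stream<Path> children = Files.list(bundleVersionsDir)) {
            children.filter(p -> !p.getFileName().toString().equals(keepName))
                    .forEach(stale::add);
        }
        for (Path entry : stale) {
            deleteRecursively(entry);
        }
        return stale.size();
    }

    // Removes files first and each directory after its contents are gone.
    private static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With the test setup above, calling purgeAllExcept on the bundle-versions directory with "10001" as the name to keep would remove "blah" and leave "10001" in place.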

Comment 4 Sunil Kondkar 2011-12-08 12:04:06 UTC
Verified on master build#823 (Version: 4.3.0-SNAPSHOT Build Number: d7ef96e)

Followed the steps and verified that the "blah" directory in $RHQ_AGENT_HOME/data/tmp/bundle-versions gets deleted after bundle deployment and that a new directory named "10001", containing the deployed files, is created under the 'bundle-versions' directory.

Deployed bundles multiple times and verified that the 'bundle-versions' directory contains only one directory, holding the latest bundle files, and that the old directories with previously deployed files get deleted.
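
The "only one directory" check can also be scripted rather than inspected by hand. The following is a rough sketch under the assumption that the check runs on the agent machine; the class BundleVersionsCheck and its method listSubdirectories are hypothetical names, not part of RHQ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical verification helper, not part of RHQ.
public class BundleVersionsCheck {

    /**
     * Returns the sorted names of the directories directly under
     * bundleVersionsDir, or an empty list if that directory does not exist.
     */
    public static List<String> listSubdirectories(Path bundleVersionsDir) throws IOException {
        if (!Files.isDirectory(bundleVersionsDir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> children = Files.list(bundleVersionsDir)) {
            return children
                    .filter(p -> Files.isDirectory(p))
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

After any number of deployments, the list returned for $RHQ_AGENT_HOME/data/tmp/bundle-versions should contain exactly one name if the purge works as described.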

Marking as verified.

Comment 5 Mike Foley 2012-02-07 19:29:46 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE

Comment 6 Mike Foley 2012-02-07 19:30:22 UTC
marking VERIFIED BZs to CLOSED/CURRENTRELEASE

