Bug 801926

Summary: what to do if bundle AND dest dir both specific a subdir but each has different files in subdir?
Product: [Other] RHQ Project Reporter: John Mazzitelli <mazz>
Component: ProvisioningAssignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: high    
Version: 4.3CC: dvanbale, hrupp, jshaughn, lkrejci, loleary, lzoubek, snegrea
Target Milestone: ---Keywords: Reopened
Target Release: RHQ 4.9   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 807411 (view as bug list) Environment:
Last Closed: 2014-02-24 21:07:25 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 801464, 807411    

Description John Mazzitelli 2012-03-09 20:58:58 UTC
Suppose I have a bundle with a subdirectory in it with a file inside the subdir:

bundle.zip:
   subdir/foo.txt

Now suppose I deploy this to a destination directory that already has that subdirectory in it, BUT with a different file:

dest-dir:
   subdir/bar.txt

What should the proper behavior be with managedRootDir=true? This is easy - the bundle has subdir in it with a file foo.txt. Bundle deployment wants to make the destination directory IDENTICAL to the bundle, and since manageRootDir=true, we don't care about what's in the destination directory. We deploy and the destination directory will have "subdir/foo.txt" and that is it - this matches the bundle.

BUT! Now suppose that my bundle has manageRootDir=FALSE. What should the behavior be?

On the one hand - the bundle specifies a "subdir" subdirectory inside of it. Therefore, we could say "the bundle has declared that it wants to own a subdirectory called "subdir" under the destination directory; therefore, the bundle will want to manage that subdirectory entirely - any files found in the subdir will be managed by the bundle, therefore, bar.txt will be removed and only foo.txt will exist." This is how it should be working today in the current code.

On the other hand, however, the destination directory (aka the "root dir") already has this subdir in it and our recipe told the bundle to not manage the root directory. What should we do with subdir/bar.txt? Should we leave it alone because we aren't supposed to manage the root directory (which, in this case, had a subdirectory with a file in it - and this file is not mentioned in the bundle).

The crux of the issue is therefore:

1) manageRootDir=false 
2) AND both the bundle and the destination directory has a subdirectory in it
3) AND the bundle subdirectory and the destination directory subdirectory do not have the same files in them.

What do we do with the files in the destination directory subdirectory that are NOT specified in the bundle subdirectory?

Which ever way we do it, we need some unit tests to test this out. Because this is not something I think we have unit tests for.

Comment 1 Larry O'Leary 2012-03-10 00:42:36 UTC
In either case, we need to make the initial deploy to a destination consistent with the update/redeploy to the same destination.

As of http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commit;h=ccd5b1f03334064e17f6353bff208aa304a116f8 the behavior is inconsistent. When you deploy the bundle for the first time to a destination and manageRootDir=false, we do not touch any of the files or subdirectories in the destination directory even when the bundle has the same subdirectory but on redeploy we do cleanup those files not in the bundle.

To illustrate:

bundle.zip:
   subdir/foo.txt


dest-dir:
   subdir/bar.txt


On initial deploy I end up with:

dest-dir:
   subdir/bar.txt
   subdir/foo.txt


But on update or redeploy I end up with:

dest-dir:
   subdir/foo.txt


From a package management stand-point, it seems that the initial deploy behavior is the correct one and the update/redeploy is the incorrect one.

Comment 2 Mike Foley 2012-03-12 15:41:05 UTC
per bz triage 3/12/2012 target for 3.1, medium priority, (crouch, loleary, foley)

Comment 3 John Mazzitelli 2012-03-12 20:07:54 UTC
(In reply to comment #1)
> In either case, we need to make the initial deploy to a destination consistent
> with the update/redeploy to the same destination.

I created BZ #802537 to track this.

Comment 4 John Mazzitelli 2012-03-14 20:33:10 UTC
I documented some use-cases on the wiki:

http://rhq-project.org/display/JOPR2/Provisioning#Provisioning-ExampleUseCases

and this use case specifically is here:

http://rhq-project.org/display/JOPR2/Provisioning#Provisioning-DeployingConfigurationFilesToAnAlreadyInstalledApplicationServer

After fixing bug #802537 which forced me to think about this more, I'm of the opinion now that this issue should be closed as NOTABUG. Here's why....

First, recall what bundles are. They are a *complete* set of files that make up a FULL piece of software. When first envisioned, this was supposed to be an entire set of files that make up a fully installed server (such as, for example, a JBoss Application Server installation). People liked bundles so much :) they wanted to use this to deploy applications to already installed app servers so we later enhanced this to allow things like a WAR (or EAR) to be deployed inside an already existing JBossAS server (such that we can deploy it in the deploy/ directory and leave other WARS and EARS (aka xARs) found here alone). This was the manageRootDir=false setting. But notice that these xAR bundles are STILL FULL applications. All the files that make up the xAR are in the bundle - there is nothing missing and nothing is to be added. Thus, when you lay down a xAR bundle, all the files from the bundle make up the full WAR/EAR app and any files found in the deployment directory (and more specifically their subdirectories) are removed to make way for the bundle.

Bundles were not meant to be a partial set of miscellenous or ad-hoc files that could be "added" to existing software in an existing deployment directory to "complete" that software installation. By this I mean, whatever is in a bundle is ALL the files that should be in the destination directory when the bundle is laid down. The bundle distribution file should look exactly how you want the end resultant destination directory to look (including all subdirectories found in the bundle). Any extra files that were found in the destination directory were backed up and removed.  That means any files found in any subdirectories that are in the bundle are cleaned to make way for the bundle's subdirectory content.

This is being stretched beyong its limits via this latest use-case - that being, someone wants to AUGMENT an existing JBoss Application Server's conf/ and conf/props directories. This is because, while it is allowed to deploy a bundle to the conf/ directory with manageRootDir=false, the problem is the bundle is shipped with a "props/jmx-user.properties" files but the app server already has files in this "props/" subdirectory. According to the design and rules of bundle deployment, because the bundle has a "props" subdirectory, this is saying that the bundle wants the "props" subdirectory to look EXACTLY how the bundle has it - in this case, with a single file called jmx-users.properties. Any other files found here during deployment are removed, because other wise, the destination directory's subdirectories won't look EXACTLY how the bundle has defined it (and that is what the RHQ bundle subsystem is told to ensure).

So, how can you support doing this? There are two things you can do. Either:

a) use the bundle subsystem as it was originally designed/intended - that is, to deploy the full JBoss App Server installation itself - this way, you can control the entire content of the JBossAS - including the content of conf/ and conf/props. So create a bundle with your full JBossAS installation, including your custom conf and conf/props files.

or

b) in your bundle, place all the files that you want in the "props" directory - not just the jmx-user.properties, but put in *all* the files that you want to exist in the conf/props (like jmx-roles.properties, etc.). This way, your bundle will manage the full conf/props directory and not just partially manage some of the content (remember, the bundle subsystem manages the entire content of all subdirectories, not just a partial subset of content).

I'm going to close this as NOTABUG - if folks feel this needs to be fixed, we can revisit. But if we are going to support something like this, realize this will require some new design/architecture in the bundle subsystem because it is going against one of the cardinal rules of the bundle system (that being, the bundles define the FULL set of content, not a subset). Implementation wise, I believe alot of changes and tests will be required to allow for this use-case.

Comment 5 John Mazzitelli 2012-03-14 20:39:28 UTC
there is a third alternative that I didn't think of. You could use additional ant code in your recipe to move files around if you want. So have the bundle deploy the files in some central location, then you just use ant in a post-install target to move the files to where you want.

Comment 6 Larry O'Leary 2012-03-14 21:11:42 UTC
I think this needs to be fully documented in the product documentation before we call this done.

Basically, your use-cases and examples above describe using bundles as a means of deploying WARs or EARs to an EAP instance. If this is the goal then we need to create EAR/WAR bundles. For example, the actual EAR/WAR resource type itself needs to extend itself to this concept. Outside of that, we are simply talking about provisioning an application server. Such provisioning requires configuration and library management outside of the deploy directory. So, my suggestion is not to think about *deploy* but to think about the *root* file system. An example of this is an application which requires database configuration. It is possible that a file would need to be added to lib to allow the application to access the database yet the bundle won't know or understand what other bundles and applications and configuration has been deployed in order to explicitly list the files and directories to ignore. The list is essentially impossible to know.

I agree that this is not how it was originally envisioned but what I can tell is that real-world use-cases were not considered when the proof-of-concept was created and instead of fleshing out those real use-cases, we ran with the PoC.

As it stands, this functionality is broken because *manageRootDir* seems to do something different then implied. Perhaps we need to change the name of this property to *cleanDestDir* or *cleanDeployDir*.

Comment 7 John Mazzitelli 2012-03-14 21:31:18 UTC
changing assignment to deon for documentation

Comment 8 Charles Crouch 2012-03-27 17:48:43 UTC
I've cloned a separate doc issue for this, and I'm keeping this open to discuss this issue further.

suggestion from loleary: managedRootDir could possibly be renamed maintainFilesOutsideBundle

Comment 9 John Mazzitelli 2012-05-10 20:39:28 UTC
(In reply to comment #8)
> suggestion from loleary: managedRootDir could possibly be renamed
> maintainFilesOutsideBundle

I don't think that's good enough either because it doesn't have any indication that we are only maintaining files in the TOP LEVEL, PEER directories that are at the same level as the top-level destination directory. The bundle always manages files in its subdirectories located under its destination directory.

some ideas for a renamed "manageRootDir" - these would be equivalent to manageRootDir=true:

manageDestDirFully=true
doNotTouchTopLevelDirectoriesFoundInDestDirButNotFoundInBundle=false
ignoreTopLevelDirectoriesInDestDirButNotFoundInBundle=false
cleanTopLevelDirectoriesFoundInDestDirAndNotFoundInBundle=true
bundleToManageAllTopLevelDirectoriesInDestDir=true
manageAllTopLevelDirectoriesInDestDir=true

Notice the explicit mentioning of "top level directories" to make it clear that we are talking about all the directories that are explicit children of the dest dir (not subdirectories).

Comment 10 Deon Ballard 2012-05-23 03:14:24 UTC
Reassigning to mazz, since Charles made a doc bug for me.

Comment 11 Larry O'Leary 2012-05-23 04:11:32 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > suggestion from loleary: managedRootDir could possibly be renamed
> > maintainFilesOutsideBundle
> 
> I don't think that's good enough either because it doesn't have any
> indication that we are only maintaining files in the TOP LEVEL, PEER
> directories that are at the same level as the top-level destination
> directory. The bundle always manages files in its subdirectories located
> under its destination directory.
> 
> some ideas for a renamed "manageRootDir" - these would be equivalent to
> manageRootDir=true:
> 
> manageDestDirFully=true
> doNotTouchTopLevelDirectoriesFoundInDestDirButNotFoundInBundle=false
> ignoreTopLevelDirectoriesInDestDirButNotFoundInBundle=false
> cleanTopLevelDirectoriesFoundInDestDirAndNotFoundInBundle=true
> bundleToManageAllTopLevelDirectoriesInDestDir=true
> manageAllTopLevelDirectoriesInDestDir=true
> 
> Notice the explicit mentioning of "top level directories" to make it clear
> that we are talking about all the directories that are explicit children of
> the dest dir (not subdirectories).

The issue from my perspective is the limited scope and use-case of a bundle in the regards of provisioning a service or other component to a server resource. Bundle deployment should work like other package management systems. Content is laid down along side existing content and replaces files that already exist. An RPM for example contains a /usr/bin directory. The presence of the directory does not equal delete the existing directory before extracting the contents of the package. Instead, it lays the contents along side the existing files. However, if an existing file is contained in the package, the packaged version replaces the existing unless the existing is from another package at which point the new package fails to install allowing the user to resolve the conflicts.

I agree that the property might be doing what we wanted it to, but in the platform use-case. We did not plan for the property to be used in the provisioning of an EAP server child resource. As described in your previous comment, the property was essentially a way of providing a hack to the original use-case to be able to deploy a EAR or WAR to the deploy directory within the platform's file system without disturbing the other files and directories in the deploy directory. But when we are talking about bundles being deployed to an EAP resource, the property no longer makes sense and we should be rethinking how bundles get deployed alongside other bundles and existing content.

Comment 12 Charles Crouch 2012-05-30 20:40:37 UTC
Having discussed this with Larry, we believe it makes sense to move the discussion around an alternative name for the manageRootDir property that had been going over in the comments of bug 801926 over to bug 801464. This is because the fundamental issue behind 801464 is misinterpreting what the manageRootDir property does, where as 801926 is more around whether manageRootDir is doing the right thing or not.

The general plan is to investigate what can be done to mitigate the confusion around manageRootDir in the near team, hence 801464 will be targeted at JON3.1.1. 801926 will be closed as working as expected, since there have been no bugs identified in the current execution of the bundle subsystem with respect to the managedRootDir propert. Instead an RFE will be raised to extend the bundle subsystem to support a more generalized set of provisioning use cases, one sort Larry advocates for here: https://bugzilla.redhat.com/show_bug.cgi?id=801926#c11

Comment 13 Larry O'Leary 2012-09-24 16:00:06 UTC
Re-targeting as this seems to have been ignored and has very serious impact on users.

Comment 14 Lukas Krejci 2013-02-13 14:29:17 UTC
Having looked at both this BZ and BZ 801464, my gut feeling is that the use case Larry described in comment 11 is the most intuitive one - it just behaves exactly the way people are used to.

Therefore I think we should make that behavior the default, while also providing a way of keeping the original behavior.

Unfortunately all the existing bundles come with the default value of managerRootDir=true, which makes changing the default behavior rather problematic - it'd would break people's expectations.

So unfortunately, I think it won't be possible to change the default behavior. A workable way forward might be this:

1) leave manageRootDir=true as default
2) introduce a new attribute "overlay" with the default value of "false". This would mean the same thing as manageRootDir=true. If "overlay" was true, it would mean that the bundle deployment would just be laid over the existing filesystem contents with no additional "managing".
3) The deployment rules would follow this condition table (using tristate logic to add more fun):
manageRootDir | overlay | Action
------------------------------------------------------------------------------
   unspec     | unspec  | The default behavior as if manageRootDir was true
   unspec     | false   | The default behavior as if manageRootDir was true
   unspec     | true    | Overlay the bundle contents on top of the filesystem 
   false      | unspec  | Leave root dir alone, clean subdirs
   false      | false   | Illegal, not clear what to do
   false      | true    | Illegal, not clear what to do
   true       | unspec  | The default behavior, manageRootDir is true
   true       | false   | The default behavior, manageRootDir is true
   true       | true    | Illegal, not clear what to do

The above would:
1) Keep the current default behavior, leaving the bundles that do not explicitly set manageRootDir intact
2) Keep the option provided by manageRootDir=false - this actually is useful in case of EARs and similar deployments that need to have precisely defined contents
3) Provide a new way of deploying bundles using the "overlay" approach
4) Fail the deployment if the user tries to provide unclear instructions with the unsupported combination of managerRootDir and overlay

One other thing that comes to mind when thinking about this, is that (at least the docs say that) we currently only support single deployment-unit per bundle.

This basically means that one can only ever support deploying a single xAR into an app server. In one of the comments, Larry mentions an interesting use case where a bundle would contain a JDBC driver along with the xAR. This would call for multiple deployment units in the bundle, the xAR one with manageRootDir=false and the driver one with overlay=true. That's not all though because if we specified a single deployment root for the bundle (as is the case in the current implementation), it would be difficult for the two deployment units to exactly find out what is their relative path to this deployment root dir (which would probably have to be the path of the AS itself). But this is most probably a case for a separate RFE.

Comment 15 Jay Shaughnessy 2013-07-23 15:40:59 UTC
I like where Lukas is going in Comment 14.  Although, design and implementation changes are two different things.  The actual bundle deployment code is "intense" and changes to subdirectory handling may be quite significant.

Here's another possible revision of the proposal:
- Deprecate manageRootDir
- Introduce overlay, default=false
- Introduce overlayRootDir, no default(overlayDeployDir?)

If either of the new properties are present then manageRootDir, if present, would be ignored.  It really shouldn't be there because use of the new properties would indicate an updated recipe.

If overlay is false (default) then the bundle structure would be strictly applied.  If overlay is true then bundle files will replace the same existing file, or be added if the file does not exist.

overlayRootDir supersedes the overlay setting, for the root dir only, and is used only when the roodir and subdirectories need to be treated differently.

If absolutely no properties are present then default behavior is the same as today, because the default of manageRootDir=true is the same as the default of overlay=false.

If only manageRootDir is in the recipe:

 manageRootDir | Action
------------------------------------------------------------------------------
   false       | same as overlay=false + overlayRootDir=true
   true        | same as overlay=false + overlayRootDir=false

Otherwise:

 overlay | overlayRootDir |  Action
------------------------------------------------------------------------------
  unspec |     unspec     | Strict bundle structure
  unspec |     false      | Strict bundle structure
  unspec |     true       | Like legacy manageRootDir=false
  false  |     unspec     | Strict bundle structure
  false  |     false      | Strict bundle structure
  false  |     true       | Like legacy manageRootDir=false
  true   |     unspec     | Overlay existing files, no deletions (new!)
  true   |     false      | Strict bundle root dir, overlay subdir files (new!)
  true   |     true       | Overlay existing files, no deletions (new!)


How technically feasible this is is still a question to be answered.  But it's an attempt to define behavior in all situations, allow back compat, and move to new, and hopefully more clear, attribute names.

Comment 17 Lukas Krejci 2013-07-23 16:45:08 UTC
To further simplify Jay's proposal, I think we could go with:

* deprecation of manageRootDir
* new mandatory attribute "wipe" with values: "nothing", "subdirectories", "onlyRoot" and "everything"


Mapping:
 
manageRootDir=true                   : wipe="everything"
manageRootDir=false                  : wipe="subdirectories"
overlay=false & overlayRootDir=false : wipe="everything"
overlay=false & overlayRootDir=true  : wipe="subdirectories"
overlay=true  & overlayRootDir=false : wipe="onlyRoot"
overlay=true  & overlayRootDir=true  : wipe="nothing"

I am not too sure about the usefulness (and implementation consequences) of wipe="onlyRoot".

Comment 18 Lukas Krejci 2013-08-15 11:34:58 UTC
So the conclusion of this bug is the following:

1) For RHQ 4.9.0 we don't change the semantics, nor do we add any new functionality to the codebase.
2) We only deprecated the "manageRootDir" attribute and provided an equivalent new one called "compliance" with the following supported values:
* full - upon deployment the destination contains only files from the bundle (i.e. is in full compliance with the bundle)
* filesAndDirectories - all files and directories in the destination directory that are also in the bundle are made compliant with the bundle - i.e. the files from the bundle are copied over, and directories contain only stuff from the bundle. Files and directories in the destination that are NOT part of the bundle are left intact.

In the future, we may add the following 2 values to this attribute:
* rootDirectoryAndFiles - this corresponds to above mentioned wipe="onlyRoot"
* files - this corresponds to wipe="nothing"

commit 4e54703565e946030da436aa96d01bc7cb8d0dc2
Author: Lukas Krejci <lkrejci>
Date:   Fri Aug 2 00:54:49 2013 +0200

    [BZ 801926] - manageRootDir deprecated, supeseded by "compliance".
    The compliance has now 2 possible values:
    * full (corresponds to manageRootDir=true, i.e. the default),
    * filesAndDirectories (corresponds to manageRootDir=false)
    
    The name "full" should convey the fact that the deployment directory is in
    full compliance with the contents of the bundle.
    
    The name "filesAndDirectories" should convey the behavior of
    manageRootDir=false - i.e. the files and directories in the root dir that
    are not present in the bundle are left intact. When there is a directory or
    file in the root directory that is both in the deployment directory and
    the bundle, the file or directory is made compliant to the contents in the
    bundle.
    
    The other two proposed deployment behaviors are "rootDirectoryAndFiles"
    and "files", but those are commented out for the moment, because we don't
    plan to add support for them in RHQ 4.9.

Comment 19 Heiko W. Rupp 2013-09-03 14:44:52 UTC
Bulk closing of issues in old RHQ releases that are in production for a while now.

Please open a new issue when running into an issue.

Comment 20 Lukas Krejci 2013-09-03 14:57:40 UTC
Re-opening, this had an incorrect target release set.