652669 – RFE: have file-backed bundle files

Keywords:
Status:	NEW
Alias:	None
Product:	RHQ Project
Classification:	Other
Component:	Provisioning
Sub Component:
Version:	3.0.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Nobody
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-11-12 14:03 UTC by John Mazzitelli
Modified:	2022-03-31 04:28 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:

Description John Mazzitelli 2010-11-12 14:03:51 UTC

Today, when you upload a bundle file (either via browser file-upload or via URL), the content of the bundle file gets stored in the DB and wrapped as a PackageVersion.

The bundle subsystem does not support a PackageVersion whose content lives in a file on the file system - the content must be in the DB.

There is a reason for this. Suppose I am using HA (that is, multiple RHQ Servers are in use) and I create a bundle and upload it. If that bundle file were stored on the file system, only the RHQ Server that received the upload could access it. If another RHQ Server was told to deploy that bundle, it would not have the bundle file content to deploy and thus an error would occur. The only workaround would be to use NFS (or some other distributed file system) such that the RHQ Server that received the bundle file content would store the file in a directory that all RHQ Servers could see.

File-backed PackageVersions work in the Content Source subsystem because the real content is actually found in a remote repository (like a yum repo) that all RHQ Servers can access - and if an RHQ Server doesn't have the content on its own file system, it goes to the remote repository to pull it down.

Because there is no "remote repository" for bundle files that were file-uploaded via the browser, we'd have to make the RHQ Server its own "remote repository" that other RHQ Servers can "see". However, in today's architecture, we explicitly designed it such that RHQ Servers do not directly talk to each other, and in fact they don't even need the ability to connect to each other (across WANs or geographical regions, for example). The only common point is the database - which is why bundle files are stored in the DB.

This issue is documenting the need to come up with a clever way to be able to store bundle files somewhere other than the DB because some people don't want to store their bundle files in our DB - especially if they already have a repository where they store their bundle files (say, for example, a git repository, or an HTTP server).

The answer might be to allow a bundle file to have its content either in the DB OR at some remote URL (similar to the content source subsystem). We should be able to use a lot of the same code that the content source subsystem uses.
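As a rough illustration of the "DB OR remote URL" idea, here is a minimal Java sketch. Everything in it is hypothetical: `BundleFileContent`, `DbBackedContent` and `UrlBackedContent` are made-up names, not RHQ classes, and a byte array stands in for the DB blob. It only shows that callers could read bundle bits the same way regardless of where they are stored.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;

/** Hypothetical abstraction over where a bundle file's bits live (not an RHQ API). */
interface BundleFileContent {
    /** Returns the full content of the bundle file. */
    byte[] read();
}

/** Content stored in the database; a byte array stands in for the DB blob here. */
final class DbBackedContent implements BundleFileContent {
    private final byte[] blob;

    DbBackedContent(byte[] blob) {
        this.blob = blob.clone();
    }

    @Override
    public byte[] read() {
        return blob.clone();
    }
}

/** Content stored at a remote location that every RHQ Server is assumed to reach. */
final class UrlBackedContent implements BundleFileContent {
    private final URI location;

    UrlBackedContent(URI location) {
        this.location = location;
    }

    @Override
    public byte[] read() {
        // Any scheme the JVM has a URL handler for (http, file, ...) works here.
        try (InputStream in = location.toURL().openStream()) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A deployment would then only depend on `BundleFileContent`, so any RHQ Server could resolve the content as long as it can reach either the DB or the remote location.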

Comment 1 John Mazzitelli 2010-11-12 14:11:34 UTC

BTW: if I recall correctly, many months ago someone wrote (and if it's not
completed, it's nearly completed) the ability to expose our content subsystem's
data via a servlet (I think it was to mimic something like a yum repo - see
org.rhq.gui.content.ContentHTTPServlet). We should investigate it to see if we
could re-use this (however, this would require a paradigm shift - it would mean
the RHQ Servers would have to have connectivity with each other, which we do not
have as a requirement today).

Comment 2 Rajiv Jaisankar 2010-11-16 09:38:49 UTC

Hi,
What is the ETA for this feature? Is the feature that exposes the content filesystem via a servlet already available? Would it be possible to store a bundle distribution or zip file on a Windows file share (say, the X:\ drive) or a local Windows drive with this feature?

Regards,
Rajiv

Comment 3 John Mazzitelli 2010-11-16 13:08:26 UTC

This is not scheduled on the roadmap yet. As mentioned in an earlier comment, one possible implementation using some previously developed (but as yet unused) code would require a major change in design and thus would require a change in the way people deploy RHQ Servers (i.e. they would have to ensure that all servers have connectivity with each other, a requirement that does not exist today). Because of the big implications, we need to think this through, ideally to come up with a way that doesn't require inter-server connectivity.

"is exposing content filesystem via servlet feature already available?" - not completely - someone needs to investigate the code to see what it really does today (it was part of an aborted effort to support Linux distributions stored in the content subsystem, making it look like a yum repo) - so it didn't expose the content in a generic way as far as I know (I didn't write it, I just know about its existence) - we need to figure out what needs to be done to it to get it to do what we want. And again, even if that is to be used, it would introduce the requirement that all servers be able to talk to each other, something we want to avoid if at all possible.

Also, any implementation we do would not be a Windows-specific solution, since we need to support non-Windows platforms as well. However, shared or local Windows drives could be specified in normal java.io.File objects (and java.net.URL - you can specify Windows paths as file: URLs) so that probably wouldn't present any problems itself.
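To make the File/URL point concrete, here is a small sketch that uses only standard JDK calls (`File.toURI()`, `URI.toURL()`, `URL.toURI()` and the `File(URI)` constructor). The class name `FileUrlSketch` and the sample path are made up for illustration. The exact URL text depends on the platform the JVM runs on, so the sketch does not hard-code one.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

/** Illustrative helper (hypothetical name) for moving between paths and file: URLs. */
final class FileUrlSketch {
    private FileUrlSketch() {
    }

    /** Converts a local, mapped-drive or shared path into a file: URL. */
    static URL toFileUrl(File file) {
        try {
            // File.toURI() makes the path absolute and escapes special characters.
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Converts a file: URL back into a File on the current platform. */
    static File toFile(URL url) {
        try {
            return new File(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // On Windows one might pass something like X:\bundles\app.zip here.
        File bundle = new File(args.length > 0 ? args[0] : "bundles/app.zip");
        URL url = toFileUrl(bundle);
        System.out.println(url + " -> " + toFile(url));
    }
}
```

Because the conversion goes through java.io.File, the same code handles Windows drive paths and UNIX paths without any platform-specific branches.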

In the end, I think the solution would have to look something like our Content Source subsystem feature, where today we support remote repositories that contain package versions (files). We could hook that into the Bundle subsystem so that in the bundle create wizard, you just point to an existing package version and, if the content bits haven't been pulled down yet, it would do so lazily. This would require the user to have the bundle files installed in some remote location (like a Windows drive, HTTP server, etc). That, it would seem to me, would be the easiest thing to do - it would require the user to have a remote repository available to store their content (like an existing Apache Web Server or a shared, distributed file system) but that seems like something that is doable.
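The lazy pull-down could look roughly like the following sketch. It is not how the Content Source subsystem is implemented; `LazyBundleFileCache`, its `fetch` method and the cache directory layout are assumptions made up for illustration, and the package version name is assumed to be a plain file name with no path separators. The idea is only that a server downloads the bits from the remote location the first time it needs them and reuses its local copy afterwards.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical per-server cache that pulls bundle file content down on first use. */
final class LazyBundleFileCache {
    private final Path cacheDir;

    LazyBundleFileCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    /** Returns the local copy, downloading it only if this server does not have it yet. */
    synchronized Path fetch(String packageVersionName, URI remoteLocation) {
        Path local = cacheDir.resolve(packageVersionName);
        if (Files.exists(local)) {
            return local; // already pulled down earlier, no remote access needed
        }
        try {
            Files.createDirectories(cacheDir);
            // Download to a temporary file first so a failed transfer never
            // leaves a truncated file under the final name.
            Path partial = Files.createTempFile(cacheDir, packageVersionName, ".part");
            try {
                try (InputStream in = remoteLocation.toURL().openStream()) {
                    Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
                }
                Files.move(partial, local, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(partial); // no-op after a successful move
            }
            return local;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, each RHQ Server keeps its own cache and only needs connectivity to the remote repository, not to the other servers.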

Comment 4 Alan Santos 2010-12-07 22:08:53 UTC

Without knowing much about (if any of) the implementation details, the features described above seem to fit well with a java content repository - http://www.jcp.org/en/jsr/detail?id=283

JBoss ModeShape implements the core and most of the optional capabilities described by that JSR. It supports queries through REST, WebDAV, and Java, with access to federated, versioned content that is backed by an RDBMS, Infinispan, a file system, or source control.

It also has an RHQ plugin for configuration - not sure about monitoring support though. 

I don't think it offers direct file-based access, but I assume that backing stores - e.g. svn, disks - can continue to be accessed directly without ModeShape.
