Description of problem: Running the 'ccm get --all' displays a full list of configuration parameters and their values. For instances of the ResourceParameter, however, it merely shows the BufferedInputStream object itself. eg: ccm.forum.traversal_adapters=java.io.BufferedInputStream@47ea5e5a ccm.london.portal.traversal_adapters=java.io.BufferedInputStream@1afa5e5a ccm.london.subsite.traversal_adapters=java.io.BufferedInputStream@42449e5a Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
ccm get doesn't really control how the parameter is displayed, it simply delegates to the parameter's marshal method. Most likely these parameters have a messed up or simply nonexistent marshal method. What parameter class is being used here?
com.arsdigita.util.parameter.ResourceParameter; The code in question is thus: m_adapters = new ResourceParameter ("ccm.forum.traversal_adapters", Parameter.REQUIRED, null) { public final Object getDefaultValue() { return Thread.currentThread().getContextClassLoader ().getResourceAsStream ("WEB-INF/resources/forum-adapters.xml"); } }; register(m_adapters);
Justin: can you take a look at this? I don't see how it's possible to either marshal or set ResourceParameter since there is no way to convert from a random input stream back to the text string used to obtain that input stream.
I agree, we have a conceptual problem in that you can't marshal an input stream (well, apart from writing it out). I figure we should change this code to use URLParameter and eliminate ResourceParameter.
Ok, thanks Justin. Dan: If you change usages of ResourceParameter to URLParameter then your problem should go away. There isn't really anything we can do to fix ResourceParameter.
It strikles me that we're trying to make the 'unmarshall' method fullfill two distinct use cases. It would be more reliable if the impl of 'ccm get' went back to the master config files that were originally loaded, thus skipping all use of 'unmarshall' rather than going via the intermediate representation. The only reason I see for going via the intermediate parameter classes is to pull in the 'default value' info, but this isn't required for the primary use case of 'ccm get' which is to export the custom settings for xfr to other JVMs in a cluster. cf http://central.boston.redhat.com/autobuild/files/core-dev-install/rhea-ig-waf-en-rickshaw/s1-prod-install-config-mjvm.html
Whether its more reliable depends on what you think the common failure case is. If you're trying deal with people hand editing the config files and entering malformed values, then the current implementation is far more robust since it validates the values and warns the user before displaying them. Of course if you're trying to be robust to pathological parameter implementations then obviously it's more robust to simply cat $CCM_HOME/conf/**/*.properties, however you could only do so at the cost of being able to validate any of values you're displaying.
I'm not saying that we shouldn't do validation of the config. The 'ccm get' command could be a two step process. First is does the normal load & validate process for all config parameters. Only when this has successfully completed, does it output the config optoions from the master files. There is potential for a race condition, if you close & re-open the config files between these two steps, but that wouldn't be hard to code around.
What you're describing is A) more complicated than the current implementation B) requires depending on the locking semantics of the filesystem being used to store the config, and C) has the dubious "feature" of making ccm get allow a buggy parameter implementation to go unnoticed until the next time someone tries to set the value.
WRT point A), the 'current implementation' has been shown to not work. Just deprecating ResourceParameter doesn't fix the fundamental problem with the config parameter API, namely the lack of an API for directly retrieving the original config parameter. Given the current flawed API, some developer can quite accidently implement a new parameter type with the same problems as ResourceParameter. As for B) that's an implementation detail - you could have the class that originally loads the config property files keep an map of the raw values, so you wouldn't have to re-open/lock the file. Likewise C) is an implementation detail. If its implemented in a buggy manner, then it will obviously cause the problem you mention, however, I can't see why it couldn't be implemented in a non-buggy manner.
Is there actually a feature request or bug report buried in your posts somewhere? If so then please take the time to consider what you're asking for and then describe it clearly. As far as I can tell you're just generally griping about the fact that marshal and unmarshal need to be implemented symmetrically and you seem to think the system can be robust to a parameter with an asymmetric marshal/unmarshal pair. If this in fact what you're saying then let me be crystal clear. It is logically impossible for the system to behave correctly in the face of a parameter with an asymmetric marshal/unmarshal implementation, and there is no implementation or design change we can make that will keep parameters with asymmetric marshal/unmarshal implementations from failing miserably when trying to validate or save configuration values. Now given that these buggy parameters are guaranteed to fail sooner or later, we generally want them to fail sooner. All your implementation suggestions simply defer the failure of the parameters until later, and I don't see how this really helps anyone. It would in fact simply make the system harder to debug.
Resolving per discussion above. ResourceParameter is deprecated; this is not a bug.
Deprecating the ResourceParameter class does nothing to solve the underlying problem with the config parameter API. As noted above there is no reliable way to get from an initialized instance of the Parameter interface back to the original config value. This is what needs to be fixed. The problem I see is thus: * Configuration files contains a value such as com.arsdigita.london.portal.traversal_adapters=/WEB-INF/portal-adapters.xml * Implementations of the c.a.util.parameter.Paramete interface are responsible for converting this String config value into a Java object. eg, ResourceParameter converts from config value representing a file name & path into a java.lang.BufferedInputStream. * The 'ccm get' command assumes that calling toString() on the Java object created by the your impl of the Parameter interface will return the original String config value. ie, In this case it assumes the BufferedInputStream#toString() created by ResourceParameter will return the value '/WEB-INF/portal-adapters.xml'. This is a bogus assumption. Just deprecating ResourceParameter does *NOT* solve the problem. Any developer could come along & create an instance of the 'Parameter' interface which has exactly the same problem as we have here with the ResourceParameter implementation. The concrete 'feature request' is to make it possible to reliably get the configuration values, without making assumptions on the implementation of the toString() method of the objects being created.
This is straight from the Parameter interface javadoc. Note that the parameter interface has methods to both read *and* write values: [snip] * Describes a named property that can read, write, and validate its * own value. See the documentation on {@link #read}, {@link #write}, * and {@link #validate} for details. [snip] Appended is the javadoc for the read, write, and validate methods of the Parameter interface. As I pointed out in my first comment, the 'ccm get' command does not in fact depend on the toString() method of the parameter value. It depends on the write method of the Parameter interface. So I believe we've always had the 'feature' you seem to be requesting. Regarding ResourceParameter, if you read the appended javadoc it should be evident that it clearly violates the contract specified by the Parameter interface. /** * Gets the parameter value as a Java object. The value will have * a specific runtime type and so may be appropriately cast. * * Reading typically follows the following procedure: * * <ul> * <li>Read the literal string value associated with the * parameter from <code>reader</code></li> * * <li>Convert the literal string value into an approprite Java * object</li> * </ul> * * If at any point in the process an error is encountered, it is * added to <code>errors</code>. Callers of this method will * typically construct an <code>ErrorList</code> in which to * collect errors. * * @param reader The <code>ParameterReader</code> from which to * recover a string literal value; it cannot be null * @param errors The <code>ErrorList</code> in which to collect * any errors encountered; it cannot be null * @return The Java object value of the parameter */ Object read(ParameterReader reader, ErrorList errors); /** * Validates the parameter value, <code>value</code>. Any * validation errors encountered are added to <code>errors</code>. * * @param value The value to validate; this is typically the value * returned by {@link #read}; it may be null * @param errors The <code>ErrorList</code> in which to collect * any errors encountered; it cannot be null */ void validate(Object value, ErrorList errors); /** * Writes the parameter value as a string literal. The parameter * marshals he object <code>value</code> to a string and sends it * to <code>writer</code>. * * @param writer The <code>ParameterWriter</code> that will take * the marshaled value and store it; it cannot be null * @param value The Java object value of the parameter */ void write(ParameterWriter writer, Object value);
Well, most of the Javadoc didn't exist when I was last investigating this last month. In any case, looking through it now, it looks like Parameter interface is fine. The problem lies with ResourceParameter & the way it subclasses AbstractParameter. In AbstractParameter the 'write' method delegates to doWrite. This in turn delegates to a 'marshal' method which simply does value.toString(). So, it would appear that ResourceParameter needs to override the 'marshal' method as well as 'unmarshal' to ensure these two methods are consistent with each other. One way to achieve this would be to create a InputStream impl that acts as a wrapper around the FileInputStream (or whatever) & preserves the resource path, thus making it available to the marshal method. We also ought to have unit tests for every impl of Parameter to ensure that reading & writing parameters are symmetric.
As stated above, ResourceParameter is deprecated. Use URLParameter. Closing.
Urm, there is no URL scheme for referring to a resource within a JAR or a webapp. The form of the absolute URL you get back from ClassLoader#getResource / ServletContext#getResource is dependent on the particular servlet engine's implementation of these classes. There is no way for the administrator to predict what it might be. Currently the property looks like: com.arsdigita.cms.ContentSectionConfig.item_adapters_file=/WEB-INF/resources/aplaws-cms-item-adapters.xml What would we change the path to in order that URLParameter would find it - the code in the unmarshal method for URLParameter merely does 'new URL(value)'. There is no lookup of files in the webapp. One approach would be for us to define a URL scheme such as resource:// that uses ClassLoader#getResource as its backend. Perhaps another approach is to make ResourceParameter subclass URLParameter & override the unmarshal method to do the resolution via the ClassLoader / ServletContext. Anyway as it stands I don't see how it is possible to use URLParameter to replace ResourceParameter. Please do not close this ticket again until we have a solution that is working satisfactorily in *practice*.
Here is the actual requirement: danpb_xml_monkey> so we need some parameter impl that can accept a path to a resource in the JAR, & return an InputStream / URL / other relvant class <danpb_xml_monkey> perhaps we can fix ResourceParameter, or maybe we can extend URLParameter to make it useable in this case <danpb_xml_monkey> or maybe we need another impl altogether <danpb_xml_monkey> richardl: FYI, forum/src/com/arsdigita/forum/ForumConfig.java is basically the class which summarises the use case
There's a fundamental problem with ResourceParameter: it cannot be read-write symmetric, location independent, *and* do more than StringParameter does. You have to pick any two. Options that don't work are: new URL(String) -> URL x -> x.toString() -> s s is not location independent getResourceAsStream(String) -> InputStream x -> x.toString() -> s s is useless, or it's the whole thing streamed out new File(String) -> File x -> x.toString() -> s s is not location independent What does work: String -> String x -> x.toString() (StringParameter) then ClassLoader#getResource(x) The latter is what we actually do in core where we need to load resources. As to why this is hard: Fundamentally, file:/// URLs don't give us enough information to distinguish between a resource's location-dependent and -independent parts--they don't round trip. bash-2.05b$ ccm which com/arsdigita/core/Initializer.class file:/var/ccm-devel/dev/justin/trunk/core/build/classes/com/arsdigita/core/Initializer.class So, on balance, I think StringParameter is the way to go for location-independent resources. Now, we could make ResourceParameter be simply StringParameter plus validation so you can ensure the resource exists using the parameter validation framework. And I haven't thought much about the resource:/// URL handler. Maybe that would work, but I think the gain is too little over StringParameter or StringParameterPlusPlus. Justin
The problem with StringParameter is that we're loosing an important bit of parameter validation. ie the check that the file actually exists. If we make ResourceParameter simply StringParameter plus validation, we open up the door to consistency problems, because there is no guarentee that the process ResourceParameter used for validation would match what the application code would do when loading the resource. Can we not make ResourceParameter use a custom impl of the InputStream interface. So similar FileInputStream, you could have a ResourceInputStream that acccepted a Resource name in its constructor & loaded that resource using ClassLoader#getResource. The impl of ResoruceInputStream would basically just delegate to the underlying real stream object retrieved from the CLassLoader. It would also keep a reference to the resource name & implement the toString() method such that it gave back the original resource name. So new ResoureInputStream(resname) -> ResourceInputStream x -> x.toString() -> resname
Using ResourceInputStream as you describe strikes me as quite dangerous. Lets say you have code that wants to use this input stream, it would call config.get(param) which would return the input stream. The code would then read the contents of the input stream. The trouble arises when you have some code running later that wants to use the same param again. This code would call config.get(param) and end up getting a closed or partially read input stream. You could try to hack around this by resetting the input stream or rewrapping it before returning a new one, but any such solution would require people using ResourceParameter to deviate from the standard pattern, and in general it's still a really bad idea to combine an immutable class (the class representing the resource location) with a mutable class (the input stream) since the rules for dealing with pointers to mutable and immutable classes are different and people using this class in any context will be more prone to making mistakes such as accidentally sharing a pointer when it is thought to be safe, or forgetting to rewrap or rewind the input stream when passing it off to some other code. I think what you actually want here is a class other than String that represents the resource location, and has an open() method on it that will return a brand new InputStream. We could either make up a new class for this purpose, or simply use the URL class with a new protocol, e.g. resource:/blah. I'm thinking the latter solution is preferable since it allows the same parameter to be easily used to refer to files, and also allows any existing URL parameter to easily refer to resources in the classpath.
Yes, I see how mutability could be troublesome here. So I think we have a solution at hand here that deals with all the cases: * Implement a new URL protocol handler for resource:// to deal with loading URLs via the ClassLoader * Add documentation to the Parameter class that objects returned by 'get' should in general be immutable. * Delete ResourceParameter class (cf bug 110965) * Replace all use of ResourceParameter with URLParameter & resource:// protocol for default values
Dan, Is there any more work to be done on this ticket? If so, do we know who should be doing it?
Yes, the four items detailed in my last comment need to be implemented. The first 3 items are changes in core for Rafi/Justin, the last is updating PS code which will be myself.
As of change 38437 there should be a URLStreamHandler in core for resource: URLs. Let me know when you've removed usages of ResourceParameter and I will delete it.
I've tried using the 'resource://' protocol in ForumConfig to no avail. I still get java.lang.ExceptionInInitializerError: com.arsdigita.util.UncheckedWrapperException: Cannot parse URL (root cause: java.net.MalformedURLException: unknown protocol: resource) at com.arsdigita.forum.ForumConfig.<init>(ForumConfig.java:51) at com.arsdigita.forum.Forum.<clinit>(Forum.java:61) at com.arsdigita.forum.Initializer.init(Initializer.java:53) at com.arsdigita.runtime.CompoundInitializer.init(CompoundInitializer.java:139) at com.arsdigita.runtime.Startup.run(Startup.java:244) at com.arsdigita.web.BaseServlet.<clinit>(BaseServlet.java:67) at java.lang.Class.newInstance0(Native Method) at java.lang.Class.newInstance(Class.java:262) at java.beans.Beans.instantiate(Beans.java:233) at java.beans.Beans.instantiate(Beans.java:77) at com.caucho.server.http.Application.instantiateServlet(Application.java:3178) at com.caucho.server.http.Application.createServlet(Application.java:3101) at com.caucho.server.http.Application.loadServlet(Application.java:3062) at com.caucho.server.http.Application.initServlets(Application.java:1923) at com.caucho.server.http.Application.init(Application.java:1849) at com.caucho.server.http.VirtualHost.initWars(VirtualHost.java:837) at com.caucho.server.http.VirtualHost.init(VirtualHost.java:695) at com.caucho.server.http.ServletServer.initHosts(ServletServer.java:885) at com.caucho.server.http.ServletServer.initInternal(ServletServer.java:727) at com.caucho.server.http.ServletServer.init(ServletServer.java:538) at com.caucho.server.http.ResinServer.init(ResinServer.java:391) at com.caucho.server.http.ResinServer.main(ResinServer.java:1152) at com.caucho.server.http.HttpServer.main(HttpServer.java:103) I'll create an attachment showing my code diff. Can you verify whether it works for you or not.
Created attachment 96401 [details] ForumConfig patch Generated using: P4DIFF="diff -cb" p4 diff forum/... > ~/forum.patch
I can't seem to get it to throw the exeption in this test case: public void test() throws Exception { URL url = new URL("resource:///WEB-INF/resources/forum-adapters.xml"); System.out.println("URL is: " + url); } I've tried on Sun 1.3.1 & 1.4.2, IBM 1.3.1 and 1.4.1 Dan, can you try this simple test as well?
Running Jon's code from the command line works fine. I can also explicitly pass in a resource.Handler in the URL() constructor: new URL(null, "resource:///WEB-INF/resources/subsite-adapters.xml", new sun.net.www.protocol.resource.Handler()); So, it just seems to be limited to the auto-detection of handler impls within servlet's.
Verified that this seems to be the case with Resin.
One short term workaround is thus: ==== //core-platform/dev/src/com/arsdigita/util/parameter/URLParameter.java#5 - /var/ccm-devel/dev/dan/aplaws-rickshaw/core/src/com/arsdigita/util/parameter/URLParameter.java ==== 49c49,53 < return new URL(value); --- > if (value.startsWith("resource://")) { > return new URL(null,value,new sun.net.www.protocol.resource.Handler()); > } else { > return new URL(value); > } but this rather sucks :-(
Technically I believe you want value.startsWith("resource:") in that workaround since the "//" is not part of the protocol, but part of the protocol specific format of the URL. I believe what is happening here is that the code in the URL constructor that uses reflection to locate default protocol handlers is only using the ClassLoader for the URL class and the system ClassLoader in order to attempt to locate a suitable protocol handler. This won't work in a servlet container because the Handler class is probably deployed to the system lib dir for the servlet container which is probably a child ClassLoader of both the class used to load the URL class, and the system ClassLoader. We can verify this theory by explicitly putting the Handler in the class path before starting the servlet container.
This theory is backed up by the following thread: http://lists.samba.org/archive/jcifs/2002-December/001584.html
@39294
I never got an answer to this question: http://post-office.corp.redhat.com/archives/ccm-engineering-list/2003-December/msg00019.html It seems like using the URL#setURLStreamHandlerFactory(URLStreamHandlerFactory) method (http://tinyurl.com/3aezr), we could avoid having the rather unsightly sun/net/www/protocol/handler directory. Just a minor gripe.
Unfortunately the setURLStreamHandlerFactory method isn't really suited to a multi-servlet environment since it can be called at most once per JVM. If some other servlet, or even the servlet container called it, they could easily blow away anything we set, or conversely we could be blowing away something the servlet container had set. <javadoc> public static void setURLStreamHandlerFactory(URLStreamHandlerFactory fac) Sets an application's URLStreamHandlerFactory. This method can be called at most once in a given Java Virtual Machine. </javadoc>
Ah, ok, that makes sense. Rather unfortunate though. Thanks.
Second thoughts in response to comment #36: > If some other servlet ... called it, they could easily blow away > anything we set, or conversely we could be blowing away something > the servlet container had set. We can't blow anything away, because the method can be called only once (as you pointed out). What they actually mean by "only once" is, an Error is thrown if you call it twice. |$ cat Main.java |import java.net.*; | |public class Main { | | public final static void main(String[] args) { | URL.setURLStreamHandlerFactory(new Factory()); | URL.setURLStreamHandlerFactory(new Factory()); | } | | private static class Factory implements URLStreamHandlerFactory { | public URLStreamHandler createURLStreamHandler(String proto) { | throw new Error("not implemented"); | } | } |} | |$ javac Main.java |$ java -cp . Main |Exception in thread "main" java.lang.Error: factory already defined | at java.net.URL.setURLStreamHandlerFactory(URL.java:857) | at Main.main(Main.java:7) So, we can support at most *one* application that calls setURLStreamHandlerFactory. If we use two third-party servlets that do that, we are screwed. As an additional restriction, in order for the third party URLStreamHandlerFactory to play nicely with our "resource:" handler, its implementation must duplicate what Sun's default implementation does. If the third party's library implementation fails to look in the "sun.net.www.protocol.resource" package, we are screwed again. Quoting from java.net.URL#URL(String, String, int, String) [http://tinyurl.com/2a5pu]: If this is the first URL object being created with the specified protocol, a stream protocol handler object, an instance of class URLStreamHandler, is created for that protocol: 1. If the application has previously set up an instance of URLStreamHandlerFactory as the stream handler factory, then the createURLStreamHandler method of that instance is called with the protocol string as an argument to create the stream protocol handler. So, I think the bottom line is, the current solution is no more (and no less) robust against rogue third-party libraries than the alternative method relying on setURLStreamHandlerFactory. It's just a little uglier Javadoc-wise.